How do you integrate custom tracing with Cassandra?

Back in the day I had a brilliant idea: to make the world a more observable place. Since I couldn’t observe the inner workings of, say, PostgreSQL, I decided to hook up to Jaeger whatever I could. And somebody had just done the legwork needed to get there. Thank you, infracloudio/cassandra-jaeger-tracing. Of course the code was a mess, somewhat disorganized and completely soiling Jaeger traces, but OK. I whipped it into shape. By doing so, I learned how tracing works in Cassandra 3.0.

How did tracing work back in Cassandra 3.0?

Tracing was provided by a loaded-in Java class extending the org.apache.cassandra.tracing.Tracing base class. It was your job to ferry observability tokens across requests. Changes in state arrived post factum, so you had to be ever vigilant for old states that had long expired. I made a span closer that closed any old spans that had not reported progress in at least 30 seconds.
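The span closer idea can be sketched roughly as follows. This is a minimal illustration, not the code from the plugin: the class name StaleSpanCloser, the touch/sweep methods, and the use of java.util.UUID as the session key are all assumptions made for the example. The only thing taken from the text is the rule itself: close whatever has been silent for 30 seconds or more.

```java
import java.time.Duration;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/**
 * Hypothetical sketch of a span closer: remembers when each session last reported
 * progress and closes the ones that have been silent for too long.
 */
final class StaleSpanCloser {
    private final Map<UUID, Long> lastSeenNanos = new ConcurrentHashMap<>();
    private final long maxIdleNanos;
    private final Consumer<UUID> onClose; // assumption: finishes the span for that session

    StaleSpanCloser(Duration maxIdle, Consumer<UUID> onClose) {
        this.maxIdleNanos = maxIdle.toNanos();
        this.onClose = onClose;
    }

    /** Call on every trace event to mark the session as still alive. */
    void touch(UUID sessionId, long nowNanos) {
        lastSeenNanos.put(sessionId, nowNanos);
    }

    /** Closes every session idle for at least maxIdle; returns how many were closed. */
    int sweep(long nowNanos) {
        int closed = 0;
        for (Map.Entry<UUID, Long> e : lastSeenNanos.entrySet()) {
            boolean stale = nowNanos - e.getValue() >= maxIdleNanos;
            // remove(key, value) only succeeds if nobody touched the session meanwhile
            if (stale && lastSeenNanos.remove(e.getKey(), e.getValue())) {
                onClose.accept(e.getKey());
                closed++;
            }
        }
        return closed;
    }

    /** Starts a daemon thread that sweeps once per second. */
    Thread startDaemon() {
        Thread t = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    sweep(System.nanoTime());
                    Thread.sleep(1000);
                }
            } catch (InterruptedException ignored) {
                // interrupted: let the thread exit quietly
            }
        }, "span-closer");
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

Something like new StaleSpanCloser(Duration.ofSeconds(30), id -> { /* finish the span */ }).startDaemon() would then run in the background, with touch(...) called whenever a session reports anything. Passing the clock value in explicitly keeps sweep(...) easy to test.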

I must admit that I learned most of tracing’s mechanisms through trial and error.

Let me phrase it succinctly in some comments from the code that I wrote:


/**
 * A single instance of this is created for entire Cassandra, so we need to use thread-local
 * storage.
 * <p>
 * Now, there are two possibilities. Either we are the coordinator, in which case Cassandra will call:
 * 1. newSession(...) by Native-Transport-Request
 * 2. newTraceState(...) is called by newSession(...)
 * 3. begin(...)
 * 4. trace(...)
 * 5. stopSessionImpl(...)
 * <p>
 * or we are a replica, responding to a coordinator, in which case it will look more like:
 * <p>
 * 1. initializeFromMessage(...) by MessagingService
 * 2. newTraceState(...) is called by initializeFromMessage(...)
 * 3. trace(...)
 * 4. ...
 * 5. Nothing. Dead silence. Cassandra doesn't tell us when such a session has finished!
 * <p>
 * So we'll spawn a thread (CloserThread) to close them for us automatically.
 * <p>
 * So in general, in newSession/initializeFromMessage we prepare the builder, then in
 * newTraceState we start the span, and hope for the best.
 */

How tracing is done nowadays

OK. The comments in the code are outright lies. There’s only one instance of your Tracing, made to trace every event that this Cassandra coordinator has deemed necessary to trace (whether the same thread always calls it is the real question). So at the very least, you’ll need a HashMap<TimeUUID, TraceState>. Oh, sorry, you inherit a protected final ConcurrentMap<TimeUUID, TraceState> sessions = new ConcurrentHashMap<>();
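To illustrate why a concurrent map keyed by session ID fits a single shared instance better than thread-local storage, here is a small sketch. It does not use Cassandra’s classes: DemoTraceState and SessionRegistry are made-up names, and java.util.UUID stands in for TimeUUID so that the example compiles on its own.

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical stand-in for a trace state; holds only what the sketch needs. */
final class DemoTraceState {
    final UUID sessionId;
    final String request;

    DemoTraceState(UUID sessionId, String request) {
        this.sessionId = sessionId;
        this.request = request;
    }
}

/** Hypothetical registry: one shared map, safe to call from any thread. */
final class SessionRegistry {
    private final ConcurrentMap<UUID, DemoTraceState> sessions = new ConcurrentHashMap<>();

    /** Registers a new session and returns the ID that identifies it. */
    UUID open(String request) {
        UUID id = UUID.randomUUID();
        sessions.put(id, new DemoTraceState(id, request));
        return id;
    }

    /** Looks a session up by ID, regardless of which thread opened it. */
    DemoTraceState get(UUID id) {
        return sessions.get(id);
    }

    /** Removes the session; returns null if it was already closed. */
    DemoTraceState close(UUID id) {
        return sessions.remove(id);
    }

    int size() {
        return sessions.size();
    }
}
```

Because every lookup goes through the session ID, it no longer matters whether the thread that opened a session is the one that later reports on it or closes it.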

Let me play detective for a while. I’ve figured out that certain tracing signatures are always called from particular places:

TimeUUID newSession(Map<String, ByteBuffer> customPayload) is called upon receiving any request
TimeUUID newSession(TraceType traceType) is called by RepairRunnable and Paxos. Interesting…
TimeUUID newSession(TimeUUID sessionId, Map<String, ByteBuffer> customPayload) is called in those edge cases where Cassandra decides unilaterally that it’s time to launch a trace

I’d say that customPayload is quite useful in telling us whether our requester actually wants us to follow the trace. I decided to help myself by creating a small wrapper class called StandardTextMap (which hopefully implements the OpenTracing TextMap), so it’s both easily injectable and ejectable.
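A wrapper of that kind might look roughly like the sketch below. It is not the actual StandardTextMap: the class is called PayloadTextMap here to make that clear, and TextCarrier is a local stand-in shaped like OpenTracing’s TextMap (iterate over entries, put a key/value pair) so the example compiles without the OpenTracing jar. Treating payload values as UTF-8 text is also an assumption.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Local stand-in shaped like OpenTracing's TextMap: iterate entries, put a pair. */
interface TextCarrier extends Iterable<Map.Entry<String, String>> {
    void put(String key, String value);
}

/** Hypothetical adapter between a Map<String, ByteBuffer> payload and text entries. */
final class PayloadTextMap implements TextCarrier {
    private final Map<String, ByteBuffer> payload;

    PayloadTextMap(Map<String, ByteBuffer> customPayload) {
        // Copy defensively: the incoming map may be null or immutable.
        this.payload = customPayload == null ? new HashMap<>() : new HashMap<>(customPayload);
    }

    /** Inject side: store a text value as UTF-8 bytes (assumed encoding). */
    @Override
    public void put(String key, String value) {
        payload.put(key, ByteBuffer.wrap(value.getBytes(StandardCharsets.UTF_8)));
    }

    /** Extract side: expose every payload entry as text. */
    @Override
    public Iterator<Map.Entry<String, String>> iterator() {
        List<Map.Entry<String, String>> out = new ArrayList<>();
        for (Map.Entry<String, ByteBuffer> e : payload.entrySet()) {
            // duplicate() so that decoding does not move the stored buffer's position
            String value = StandardCharsets.UTF_8.decode(e.getValue().duplicate()).toString();
            out.add(new AbstractMap.SimpleImmutableEntry<>(e.getKey(), value));
        }
        return out.iterator();
    }

    /** The payload in the shape that a custom payload map uses. */
    Map<String, ByteBuffer> asPayload() {
        return payload;
    }
}
```

With the real OpenTracing library on the classpath, the same two methods are what a tracer’s inject and extract calls would use on a text-map carrier.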

Of course your job isn’t just to register a new TraceState. You also have to return the TimeUUID that uniquely identifies it.

Hey, I lied before! You do not need to override all these newSession variants, just the mandatory functions detailed below:

protected abstract void stopSessionImpl(); – just makes your internally traced session “end”, by whatever means are necessary. No biggie.
public abstract TraceState begin(String request, InetAddress client, Map<String, String> parameters); – a new trace is commencing
protected abstract TraceState newTraceState(InetAddressAndPort coordinator, TimeUUID sessionId, Tracing.TraceType traceType); – means your trace has entered a new stage
public abstract void trace(ByteBuffer sessionId, String message, int ttl); – purely an afterthought for traces that are not coordinated by the local node

Note that TraceStates are spawned per trace, crossed with remembering up to 10 sections, and are identified by their TimeUUIDs (which in a pinch can be replaced by a standard java.util.UUID, not that anyone complained). Tracing objects, however, are spawned without any constructor to aid you, per a single tracing session; that’s a major departure from the Cassandra 3.0 option.

One thing that they botched in Cassandra 4.0 is ferrying OpenTelemetry spans across nodes – when this is all that you’ve got to send over your spans, this doesn’t look too promising. I’m in the middle of getting a reply from the more seasoned pros, so I’ll hack something in the meantime, I promise.

BTW the Cassandra folks seem to have done a reasonable job at manually closing the coordinator-started spans. I’ve yet to check after I resolve this problem, but my earlier CloserThread, which closed all the traces that Cassandra never reported as finished, is gone for good.

And last but not least, please note the corresponding message of Jaeger working more or less beautifully with my 3-node Cassandra cluster:

By Piotr Maślanka
