Skip to content

Conversation

@SiyaoIsHiding
Copy link
Contributor

No description provided.

default Statement<?> getDecoratedStatement(
@NonNull Statement<?> statement, @NonNull String requestId) {
// in case of retry or speculative execution
if (statement.getCustomPayload().containsKey(getCustomPayloadKey())) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we not override the value instead of early-exiting? traceparent key will contain node request ID and IMHO it should be unique per attempt.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Retries or speculative execution attempts are ideally supposed to be the same (unmodified) content that went out the first time... kinda makes sense given the role those two features play in the driver. With that in mind I assume you'd want to have the same ID here so that the retry or speculative execution attempt could be more easily identified as what they are and not mistaken for other follow-up requests from the driver.

@SiyaoIsHiding can confirm if that's what she was after here or not but that seems right to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I agree, I would want to retain the same request ID as the fact that the query was speculated/retried is useful context when tracing a request.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should override instead of using the same ID for the node requests under the same session request.

TL;DR: In all of our ID generator implementations, the node request ID already contains the session request ID in it.

So let's say the session request ID is "4bf92f3577b34da6a3ce929d0e0e4736", if we search nodeRequestID.contains("4bf92f3577b34da6a3ce929d0e0e4736"), we will get all the retry/speculative executions for this session request, e.g. "00-4bf92f3577b34da6a3ce929d0e0e4736-a3ce929d0e0e4736-01" or 00-4bf92f3577b34da6a3ce929d0e0e4736-0e4736a3ce929d0e-01"

This aligns with W3C context , that the trace ID is the same across the transasction, but span ID should be always unique.

This is also documented in our implementations.

/**
* {session-request-id}-{random-uuid} All node requests for a session request will have the same
* session request id
*/
@Override
public String getNodeRequestId(@NonNull Request statement, @NonNull String parentId) {

/**
* Following the format of W3C "traceparent" spec,
* https://www.w3.org/TR/trace-context/#traceparent-header-field-values e.g.
* "00-4bf92f3577b34da6a3ce929d0e0e4736-a3ce929d0e0e4736-01" All node requests in the same session
* request share the same "trace-id" field value
*/
@Override
public String getNodeRequestId(@NonNull Request statement, @NonNull String parentId) {

Good catch from @lukasz-antoniak 🙏

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I'm missing something but as I understand it the question isn't whether node request ID contains session request ID or not. The question is whether we should update the node request ID (regardless of it's structure) when we need to send retries or speculative executions. In other contexts we go to considerable lengths to make sure we're sending the exact same messages we sent earlier... so it seems a bit weird not to do so here.

The W3C docs linked above indicate that trace-id SHOULD be unique but they don't require it. Furthermore they're talking about uniqueness across individual requests; this spec is silent on things like retrying a request. And at least some sources I'm finding seem to indicate that the same trace-id should be used if a transaction is retried.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This value in the custom payload, what we call "request-id", is not the trace ID, it's the "traceparent" for context propagation.

Driver concept W3C concept Example
Request ID value in the custom payload traceparent (for context propagation) 00-4bf92f3577b34da6a3ce929d0e0e4736-a3ce929d0e0e4736-01
Session request ID trace ID 4bf92f3577b34da6a3ce929d0e0e4736
The 8 bytes that we add to a session request ID to form the node request ID span ID a3ce929d0e0e4736

And the traceparent context propagation format is:
version "-" trace-id "-" span-id "-" trace-flags

Trace ID needs to be the same across all spans in a transaction.

Span ID is always unique. Otherwise, let's say if the driver sends out 2 requests, including a retry, both with the same node request ID. Then on the server side, you see two identical requests, how can you tell which one is the first, which one is the retry?

Span ID is an identifier that always need to be unique to identify each individual operation. Trace ID is how we correlate retries and speculative executions under one session request. The value in the custom payload needs both, so that we can both correlate and separate different operations in one transaction.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To close the loop on this: the level 2 trace-context spec is pretty clear on this point:

The value of span-id SHOULD be unique within a distributed trace. If the value of span-id is not unique within a distributed trace, parent-child relationships between spans within the distributed trace may be ambiguous.

Map<String, ByteBuffer> existing = statement.getCustomPayload();
String key = getCustomPayloadKey();

NullAllowingImmutableMap.Builder<String, ByteBuffer> builder =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering, at this point what value is NullAllowingImmutableMap providing?

Since it is what has the duplicate key restriction, and it's main value is allowing null keys and values, but we would never set a null key/value in this context (correct me if I'm wrong?).

As far as I understand it, it's just ImmutableMap that has the null key/value restriction. Why wouldn't we just use a regular Map implementation and wrap it in Collections.unmodifiableMap?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some spelunking and it looks like @emerkle826 made that mod way back in the day. I've already pinged him demanding that he explain himself asking very politely if we could provide the relevant context 🧌

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: what map to use.

It seems like the protocol allows null values in the custom payload (source).

So the custom payload provided by the user can already have null values in it, and we need to support that.

But a normal ImmutableMap does not allow a null value. I'm guessing that's why we were using a NullAllowingImmutableMap. Pending for @olim7t and @emerkle826 to confirm.

However, Collections.unmodifiableMap is also immutable and allows null values. I cannot quite see why we don't use that.

The code here will be cleaner if we use Collections.unmodifiableMap. If no objections, I think we can go that way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just chiming in that when I added the NullAllowingImmutableMap, I believe I did it simply because it was the Map type we were using around changes made in Native Protocol related to #1087. I don't recall any comments or real reasons around choosing that Map implementation as opposed to some other immutable map that allowed nulls other than we were already using it in native protocol so why not. :)

Copy link
Contributor

@tolbertam tolbertam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 this looks like it should address the issue to me!

try (CqlSession session = SessionUtils.newSession(ccmRule, loader)) {
String query = "SELECT * FROM system.local";
ResultSet rs = session.execute(query);
assertThat(rs.getExecutionInfo().getRequest().getCustomPayload().get("trace_key")).isNull();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should check here for absence of request-id (default), not trace_key (copied from upper test).

SimpleStatement.newInstance(query).setCustomPayload(customPayload);
assertThatStage(session.executeAsync(statement)).isSuccess();
ResultSet rs = session.execute(query);
assertThat(rs.getExecutionInfo().getRequest().getCustomPayload().get("trace_key")).isNull();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this assertion is not needed in this test.

Copy link
Member

@lukasz-antoniak lukasz-antoniak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left two NIT comments. PR looks good!

@tolbertam
Copy link
Contributor

tolbertam commented Nov 7, 2025

@SiyaoIsHiding looks like we got 2 +1s on the review 🎉

Mind implementing Lukasz's suggestions and then squashing into one commit and adding:

"patch by Jane He; reviewed by Andy Tolbert and Lukasz Atoniak for CASSJAVA-116" to the last line of the commit message?


Map<String, ByteBuffer> unmodifiableMap = Collections.unmodifiableMap(existing);

return statement.setCustomPayload(unmodifiableMap);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know that I want to hold things up any but I'd argue ImmutableMap is more concise (and avoids the need to create at least one of the intermediate maps here):

  default Statement<?> getDecoratedStatement(
      @NonNull Statement<?> statement, @NonNull String requestId) {

      return statement.setCustomPayload(
              ImmutableMap.<String,ByteBuffer>builder()
              .putAll(statement.getCustomPayload())
              .put(getCustomPayloadKey(), ByteBuffer.wrap(requestId.getBytes(StandardCharsets.UTF_8)))
              .buildKeepingLast());
  }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original map implementation that was used that caused this issue NullAllowingImmutableMap was presumably being used to allow null keys/values, which ImmutableMap doesn't allow, so if we use that it's possible any existing custom payload map with null keys and values will cause an exception to be thrown here (discussion here: #2060 (comment))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm gonna make some notes here 'cause in all honesty this was pretty perplexing to me... but I think I finally got to something resembling an answer. I don't know how useful all of this will be but brain dump follows.

The doc referenced by @tolbertam above is from the 3.6 Java driver, which was an entirely different code base. The constant Statement.NULL_PAYLOAD_VALUE is indeed present in that version... so far so good. Thing is... there's no corresponding entry in the 4.x manual and the equivalent 4.x class doesn't have such a constant. So... what's going on here?

Let's take a look at what the v4 native protocol spec actually says here. A bytes map is understood to have a string for a key and bytes for a map. A string is understood to be "A [short] n, followed by n bytes representing an UTF-8 string". There's no obvious way for that to be null, which is consistent with the verbiage in the 3.6 manual page. bytes on the other hand is defined as follows: "A [int] n, followed by n bytes if n >= 0. If n < 0, no byte should follow and the value represented is null" That's presumably the point of the marker referenced in the 3.6 manual page.

To confirm all of this I ran a simple test client against C* 5.0.6 using Java driver 4.19.1. In each case a custom payload was set on a simple statement and executed. Results seem to follow the pattern described above: when using a null key I get a weird native-protocol exception (an issue has been created for that elsewhere) while a null value seems to work just fine.

So where does this leave us? In order to leverage ImmutableMap you need to translate null values into zero-length ByteBuffers... but do so without acquiring intermediate state. You can do it with something like the following:

public class KeyspaceCount {

    static class ImmutableEntry<K,V> implements Map.Entry<K,V> {

        private final K key;
        private final V val;

        public ImmutableEntry(K key, V val) {
            this.key = key;
            this.val = val;
        }

        @Override
        public K getKey() {
            return key;
        }

        @Override
        public V getValue() {
            return val;
        }

        @Override
        public V setValue(Object value) {
            throw new UnsupportedOperationException("You can't do that");
        }
    }

    public static void main(String[] args) {

        try (CqlSession session = CqlSession.builder().build()) {

            SimpleStatement stmt = SimpleStatement.builder("select release_version from system.local").build();
            HashMap<String, ByteBuffer> map = new HashMap<>();
            map.put("key", ByteBuffer.allocate(0));

            ImmutableMap.Builder<String,ByteBuffer> builder = ImmutableMap.builder();
            map.entrySet()
                    .stream()
                    .map((Map.Entry<String,ByteBuffer> entry) -> {
                        return entry.getValue() != null ?
                                entry :
                                new ImmutableEntry<>(entry.getKey(), ByteBuffer.allocate(0));
                    })
                    .forEach(builder::put);

            ResultSet rs = session.execute(stmt.setCustomPayload(builder.buildKeepingLast()));
            Row row = rs.one();

            assert row != null;
            String releaseVersion = row.getString("release_version");
            System.out.printf("Cassandra version is: %s%n", releaseVersion);
        }
    }
}

That actually does work, and it's all stream-based so that the only map you create is the one you're actually add to the Statement object. You could pull the immutable Map.Entry impl into a utility class somewhere to help keep the change down but in the end yeah, I agree it's not quite as clean as I'd like. I'd still argue it's an improvement because (a) you avoid the intermediate maps and (b) you get the immutable part reflected in the type system but these maps are pretty short-lived anyway... so it probably doesn't matter that much.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants