Skip to content

Conversation

@lbeschastny
Copy link

@lbeschastny lbeschastny commented Oct 23, 2025

Propagators could leave spans in a deferred state. For example, the following valid set of b3-multi headers does eactly that:

X-B3-TraceId: 68f8e1942794b9a8072bfc5f4f2c368f
X-B3-SpanId: a4295738dcb1aac4

but ShouldSample implementation for opentelemetry_sdk::trace::sampler::Sampler only checks that the parent span is active.

The correct behaviour would be to fall back to delegate sampler when the parent spans has deferred flag set by a propagator.

Changes

Sampler::ParentBased branch in the should_sample method now checks is the parent span has TRACE_FLAG_DEFERRED set and filters all deferred spans.

Merge requirement checklist

  • CONTRIBUTING guidelines followed
  • Unit tests added/updated (if applicable)
  • Appropriate CHANGELOG.md files updated for non-trivial, user-facing changes
  • Changes in public API reviewed (if applicable)

@lbeschastny lbeschastny requested a review from a team as a code owner October 23, 2025 10:09
@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Oct 23, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: lbeschastny / name: Leonid Beschasny (642d203)

@codecov
Copy link

codecov bot commented Oct 23, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.8%. Comparing base (cff5728) to head (642d203).

Additional details and impacted files
@@          Coverage Diff          @@
##            main   #3209   +/-   ##
=====================================
  Coverage   80.8%   80.8%           
=====================================
  Files        128     128           
  Lines      23288   23316   +28     
=====================================
+ Hits       18821   18849   +28     
  Misses      4467    4467           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@lbeschastny lbeschastny force-pushed the fix/psrent-based-sampler-with-deferred-trace branch from 97b5b50 to cb7930d Compare October 23, 2025 12:58
#[cfg(feature = "jaeger_remote_sampler")]
use opentelemetry_http::HttpClient;

const TRACE_FLAG_DEFERRED: TraceFlags = TraceFlags::new(0x02);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://w3c.github.io/trace-context/#trace-flags
I don't see such a flag definition in the official spec...

Copy link
Author

@lbeschastny lbeschastny Nov 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it's not an official flag. It's a hack used across opentelemetry-rust project to mark spans which lack sampling information to distinguish between unsampled and not-yet-sampled ones.

Here are some examples:

opentelemetry-rust-contrib propagators are using the same approach as well, i.e.:

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would've been better to provide a way to apply sampling inside a propagator when extracting tracing context.
But that's a separate thing that would require a major release.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be best to merge my fix now and then see how this TRACE_FLAG_DEFERRED solution could be refactored for the next major release

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's unfortunate that the B3 code is using TraceFlags to send over this information between Extractor and Injector. If your application is set up to accept B3 headers and inject both B3 and w3c trace context headers, you will end up with illegal trace flags in the w3c headers. Also, the sampling decision made here will not propagate correctly to the outgoing B3 headers since they look at the deferred flag first, and skip the sampling decision.

To support this use case, I think it would be better to define a separate struct to hold this information (and not B3 specific) in the API or SDK that is added as to the Context and then read by the Sampler. The B3 code would need to be updated as well.

Copy link
Author

@lbeschastny lbeschastny Nov 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I testes this patch and it works correctly including context propagation.

Propagation works correctly because existing propagators clear all flags except for TraceFlags::SAMPLED right now, i.e.:
https://github.com/open-telemetry/opentelemetry-rust/blob/main/opentelemetry-sdk/src/propagation/trace_context.rs#L136

I agree that going forward it would be best to store this information separately instead of using trace flags.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And it's not only B3. I actually encountered this issue when using xray propagator from a third party opentelemetry_aws crate. It looks like they copy-pasted the hack from zipkin propagator.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incoming flags are also cleared of everything that is not TraceFlags::SAMPLED:
https://github.com/open-telemetry/opentelemetry-rust/blob/main/opentelemetry-sdk/src/propagation/trace_context.rs#L101-L103

So, TRACE_FLAG_DEFERRED can only exist internally and can never leak out. At least not with the current solution.

Copy link
Author

@lbeschastny lbeschastny Nov 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the sampling decision made here will not propagate correctly to the outgoing B3 headers since they look at the deferred flag first, and skip the sampling decision.

Hm... you may be right. I should check that. So far I only verified that w3c headers are generated correctly.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, x-b3-sampled is not set correctly

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants