Skip to content

Conversation

@dougqh
Copy link
Contributor

@dougqh dougqh commented Nov 11, 2025

What Does This Do

Aims to reduce lock contention in PendingTrace by only attempting partialFlush when a span has just been added to PendingTrace.

Prior to this change, we would also attempt a partialFlush after scope/context close as well, but closing a scope cannot cause us to cross the partialFlush threshold.

The theory is that this will improve our lock contention with virtual threads.
The concern is that virtual threads are often only restoring context, but then not creating a span.
That can lead the virtual thread to attempt a partialFlush which requires taking the PendingTrace lock.
If the PendingTrace lock cannot be acquired, then the virtual thread will be unmounted from its carrier thread.

Motivation

Report of high overhead and lock contention when using virtual threads

Additional Notes

Contributor Checklist

Jira ticket: [PROJ-IDENT]

@dougqh dougqh requested a review from a team as a code owner November 11, 2025 20:18
@dougqh dougqh requested a review from smola November 11, 2025 20:18
@github-actions
Copy link
Contributor

github-actions bot commented Nov 11, 2025

Hi! 👋 Thanks for your pull request! 🎉

To help us review it, please make sure to:

  • Add at least one type, and one component or instrumentation label to the pull request

If you need help, please check our contributing guidelines.

@dougqh dougqh added comp: core Tracer core tag: performance Performance related changes labels Nov 11, 2025
}

private PublishState decrementRefAndMaybeWrite(boolean isRootSpan) {
private PublishState decrementRefAndMaybeWrite(boolean isRootSpan, boolean addedSpan) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now, I'm curious what others think of this potential change.
I'm intending to write a microbenchmark to see if I can verify that this change is profitable.
I also think I can write a test verifies the PendingTrace behavior by using a custom writer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks to me like clever trick. This changes a bit the write dynamic, where the next chance to write is when a new span is added, or when the root span is finished (and the other queueing states). I believe this is good. I've seen some instrumentations like aerospike that explicitly cancel the "continuation", but I don't think this is an issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not sure if this addresses the reported issue.
However, it does cut my macrobenchmark by 2-3%. Given that my macrobenchmark uses @Trace annotations which are rather heavy, I suspect the gains might be larger with typical auto-instrumentation.

@datadog-datadog-prod-us1
Copy link
Contributor

datadog-datadog-prod-us1 bot commented Nov 11, 2025

🎯 Code Coverage
Patch Coverage: 100.00%
Total Coverage: 63.30% (+3.67%)

View detailed report

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 34a6e6b | Docs | Datadog PR Page | Was this helpful? Give us feedback!

Copy link
Contributor

@mcculls mcculls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's a good optimization.

Looking forward to seeing the microbenchmark results, I suspect it will show a positive improvement when there are a lot of context migrations.

@mcculls mcculls added the type: enhancement Enhancements and improvements label Nov 11, 2025
@pr-commenter
Copy link

pr-commenter bot commented Nov 11, 2025

Benchmarks

Startup

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master dougqh/pending-trace-contention-reduction
git_commit_date 1762881452 1762892030
git_commit_sha 5db793a 34a6e6b
release_version 1.56.0-SNAPSHOT~5db793a092 1.56.0-SNAPSHOT~34a6e6ba5a
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1762893905 1762893905
ci_job_id 1228416161 1228416161
ci_pipeline_id 81995573 81995573
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-0-kzzpi5o2 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-0-kzzpi5o2 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
module Agent Agent
parent None None

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 54 metrics, 11 unstable metrics.

Startup time reports for insecure-bank
gantt
    title insecure-bank - global startup overhead: candidate=1.56.0-SNAPSHOT~34a6e6ba5a, baseline=1.56.0-SNAPSHOT~5db793a092

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.051 s) : 0, 1051435
Total [baseline] (8.647 s) : 0, 8647421
Agent [candidate] (1.051 s) : 0, 1051338
Total [candidate] (8.654 s) : 0, 8653531
section iast
Agent [baseline] (1.179 s) : 0, 1178910
Total [baseline] (9.242 s) : 0, 9241889
Agent [candidate] (1.184 s) : 0, 1184126
Total [candidate] (9.264 s) : 0, 9264463
Loading
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.051 s -
Agent iast 1.179 s 127.475 ms (12.1%)
Total tracing 8.647 s -
Total iast 9.242 s 594.469 ms (6.9%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.051 s -
Agent iast 1.184 s 132.788 ms (12.6%)
Total tracing 8.654 s -
Total iast 9.264 s 610.933 ms (7.1%)
gantt
    title insecure-bank - break down per module: candidate=1.56.0-SNAPSHOT~34a6e6ba5a, baseline=1.56.0-SNAPSHOT~5db793a092

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.469 ms) : 0, 1469
crashtracking [candidate] (1.489 ms) : 0, 1489
BytebuddyAgent [baseline] (707.684 ms) : 0, 707684
BytebuddyAgent [candidate] (707.182 ms) : 0, 707182
GlobalTracer [baseline] (246.576 ms) : 0, 246576
GlobalTracer [candidate] (247.091 ms) : 0, 247091
AppSec [baseline] (32.456 ms) : 0, 32456
AppSec [candidate] (32.53 ms) : 0, 32530
Debugger [baseline] (6.421 ms) : 0, 6421
Debugger [candidate] (6.387 ms) : 0, 6387
Remote Config [baseline] (733.97 µs) : 0, 734
Remote Config [candidate] (716.449 µs) : 0, 716
Telemetry [baseline] (13.887 ms) : 0, 13887
Telemetry [candidate] (15.396 ms) : 0, 15396
Flare Poller [baseline] (7.368 ms) : 0, 7368
Flare Poller [candidate] (5.808 ms) : 0, 5808
section iast
crashtracking [baseline] (1.455 ms) : 0, 1455
crashtracking [candidate] (1.487 ms) : 0, 1487
BytebuddyAgent [baseline] (827.925 ms) : 0, 827925
BytebuddyAgent [candidate] (830.608 ms) : 0, 830608
GlobalTracer [baseline] (234.368 ms) : 0, 234368
GlobalTracer [candidate] (235.836 ms) : 0, 235836
IAST [baseline] (33.209 ms) : 0, 33209
IAST [candidate] (32.695 ms) : 0, 32695
AppSec [baseline] (28.068 ms) : 0, 28068
AppSec [candidate] (29.138 ms) : 0, 29138
Debugger [baseline] (6.002 ms) : 0, 6002
Debugger [candidate] (6.092 ms) : 0, 6092
Remote Config [baseline] (604.401 µs) : 0, 604
Remote Config [candidate] (611.474 µs) : 0, 611
Telemetry [baseline] (8.403 ms) : 0, 8403
Telemetry [candidate] (8.635 ms) : 0, 8635
Flare Poller [baseline] (4.105 ms) : 0, 4105
Flare Poller [candidate] (4.134 ms) : 0, 4134
Loading
Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.56.0-SNAPSHOT~34a6e6ba5a, baseline=1.56.0-SNAPSHOT~5db793a092

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.05 s) : 0, 1049545
Total [baseline] (10.754 s) : 0, 10753829
Agent [candidate] (1.05 s) : 0, 1050413
Total [candidate] (10.811 s) : 0, 10811482
section appsec
Agent [baseline] (1.225 s) : 0, 1224572
Total [baseline] (10.893 s) : 0, 10893117
Agent [candidate] (1.232 s) : 0, 1232414
Total [candidate] (10.93 s) : 0, 10929541
section iast
Agent [baseline] (1.186 s) : 0, 1185735
Total [baseline] (11.187 s) : 0, 11186874
Agent [candidate] (1.18 s) : 0, 1180018
Total [candidate] (11.115 s) : 0, 11115021
section profiling
Agent [baseline] (1.202 s) : 0, 1201995
Total [baseline] (11.013 s) : 0, 11012999
Agent [candidate] (1.194 s) : 0, 1194461
Total [candidate] (10.884 s) : 0, 10883734
Loading
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.05 s -
Agent appsec 1.225 s 175.026 ms (16.7%)
Agent iast 1.186 s 136.189 ms (13.0%)
Agent profiling 1.202 s 152.45 ms (14.5%)
Total tracing 10.754 s -
Total appsec 10.893 s 139.289 ms (1.3%)
Total iast 11.187 s 433.045 ms (4.0%)
Total profiling 11.013 s 259.171 ms (2.4%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.05 s -
Agent appsec 1.232 s 182.001 ms (17.3%)
Agent iast 1.18 s 129.605 ms (12.3%)
Agent profiling 1.194 s 144.048 ms (13.7%)
Total tracing 10.811 s -
Total appsec 10.93 s 118.06 ms (1.1%)
Total iast 11.115 s 303.539 ms (2.8%)
Total profiling 10.884 s 72.252 ms (0.7%)
gantt
    title petclinic - break down per module: candidate=1.56.0-SNAPSHOT~34a6e6ba5a, baseline=1.56.0-SNAPSHOT~5db793a092

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.459 ms) : 0, 1459
crashtracking [candidate] (1.464 ms) : 0, 1464
BytebuddyAgent [baseline] (706.218 ms) : 0, 706218
BytebuddyAgent [candidate] (707.451 ms) : 0, 707451
GlobalTracer [baseline] (246.565 ms) : 0, 246565
GlobalTracer [candidate] (246.384 ms) : 0, 246384
AppSec [baseline] (32.351 ms) : 0, 32351
AppSec [candidate] (32.408 ms) : 0, 32408
Debugger [baseline] (6.398 ms) : 0, 6398
Debugger [candidate] (6.462 ms) : 0, 6462
Remote Config [baseline] (729.641 µs) : 0, 730
Remote Config [candidate] (707.913 µs) : 0, 708
Telemetry [baseline] (12.97 ms) : 0, 12970
Telemetry [candidate] (14.98 ms) : 0, 14980
Flare Poller [baseline] (8.063 ms) : 0, 8063
Flare Poller [candidate] (5.725 ms) : 0, 5725
section appsec
crashtracking [baseline] (1.461 ms) : 0, 1461
crashtracking [candidate] (1.48 ms) : 0, 1480
BytebuddyAgent [baseline] (730.872 ms) : 0, 730872
BytebuddyAgent [candidate] (735.7 ms) : 0, 735700
GlobalTracer [baseline] (238.227 ms) : 0, 238227
GlobalTracer [candidate] (239.942 ms) : 0, 239942
AppSec [baseline] (175.019 ms) : 0, 175019
AppSec [candidate] (175.572 ms) : 0, 175572
Debugger [baseline] (5.998 ms) : 0, 5998
Debugger [candidate] (6.088 ms) : 0, 6088
Remote Config [baseline] (666.004 µs) : 0, 666
Remote Config [candidate] (658.835 µs) : 0, 659
Telemetry [baseline] (8.48 ms) : 0, 8480
Telemetry [candidate] (8.64 ms) : 0, 8640
Flare Poller [baseline] (3.989 ms) : 0, 3989
Flare Poller [candidate] (4.081 ms) : 0, 4081
IAST [baseline] (24.911 ms) : 0, 24911
IAST [candidate] (25.111 ms) : 0, 25111
section iast
crashtracking [baseline] (1.467 ms) : 0, 1467
crashtracking [candidate] (1.454 ms) : 0, 1454
BytebuddyAgent [baseline] (832.899 ms) : 0, 832899
BytebuddyAgent [candidate] (828.225 ms) : 0, 828225
GlobalTracer [baseline] (235.485 ms) : 0, 235485
GlobalTracer [candidate] (234.989 ms) : 0, 234989
AppSec [baseline] (29.042 ms) : 0, 29042
AppSec [candidate] (29.582 ms) : 0, 29582
Debugger [baseline] (6.058 ms) : 0, 6058
Debugger [candidate] (5.987 ms) : 0, 5987
Remote Config [baseline] (604.672 µs) : 0, 605
Remote Config [candidate] (601.47 µs) : 0, 601
Telemetry [baseline] (8.506 ms) : 0, 8506
Telemetry [candidate] (8.431 ms) : 0, 8431
Flare Poller [baseline] (4.177 ms) : 0, 4177
Flare Poller [candidate] (4.058 ms) : 0, 4058
IAST [baseline] (32.55 ms) : 0, 32550
IAST [candidate] (31.847 ms) : 0, 31847
section profiling
ProfilingAgent [baseline] (112.969 ms) : 0, 112969
ProfilingAgent [candidate] (110.715 ms) : 0, 110715
crashtracking [baseline] (1.44 ms) : 0, 1440
crashtracking [candidate] (1.454 ms) : 0, 1454
BytebuddyAgent [baseline] (733.27 ms) : 0, 733270
BytebuddyAgent [candidate] (730.516 ms) : 0, 730516
GlobalTracer [baseline] (223.55 ms) : 0, 223550
GlobalTracer [candidate] (222.256 ms) : 0, 222256
AppSec [baseline] (32.8 ms) : 0, 32800
AppSec [candidate] (32.014 ms) : 0, 32014
Debugger [baseline] (8.41 ms) : 0, 8410
Debugger [candidate] (9.064 ms) : 0, 9064
Remote Config [baseline] (721.739 µs) : 0, 722
Remote Config [candidate] (682.254 µs) : 0, 682
Telemetry [baseline] (14.709 ms) : 0, 14709
Telemetry [candidate] (13.88 ms) : 0, 13880
Flare Poller [baseline] (4.181 ms) : 0, 4181
Flare Poller [candidate] (4.087 ms) : 0, 4087
Profiling [baseline] (113.65 ms) : 0, 113650
Profiling [candidate] (111.383 ms) : 0, 111383
Loading

Load

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master dougqh/pending-trace-contention-reduction
git_commit_date 1762881452 1762892030
git_commit_sha 5db793a 34a6e6b
release_version 1.56.0-SNAPSHOT~5db793a092 1.56.0-SNAPSHOT~34a6e6ba5a
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1762894395 1762894395
ci_job_id 1228416162 1228416162
ci_pipeline_id 81995573 81995573
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-1-rnhchtr9 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-1-rnhchtr9 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 5 performance improvements and 2 performance regressions! Performance is the same for 13 metrics, 16 unstable metrics.

scenario Δ mean agg_http_req_duration_p50 Δ mean agg_http_req_duration_p95 Δ mean throughput candidate mean agg_http_req_duration_p50 candidate mean agg_http_req_duration_p95 candidate mean throughput baseline mean agg_http_req_duration_p50 baseline mean agg_http_req_duration_p95 baseline mean throughput
scenario:load:petclinic:no_agent:high_load worse
[+1.866ms; +3.049ms] or [+11.410%; +18.644%]
worse
[+2.612ms; +4.691ms] or [+9.548%; +17.144%]
unstable
[-62.277op/s; -2.536op/s] or [-22.585%; -0.920%]
18.814ms 31.011ms 243.344op/s 16.356ms 27.360ms 275.750op/s
scenario:load:petclinic:profiling:high_load better
[-1.496ms; -0.506ms] or [-7.855%; -2.656%]
same
[-1419.581µs; +632.271µs] or [-4.665%; +2.078%]
unstable
[-6.040op/s; +41.819op/s] or [-2.501%; +17.312%]
18.048ms 30.036ms 259.452op/s 19.050ms 30.430ms 241.562op/s
scenario:load:petclinic:tracing:high_load better
[-2.571ms; -1.010ms] or [-13.741%; -5.399%]
better
[-4.113ms; -1.218ms] or [-13.605%; -4.027%]
unstable
[-11.594op/s; +47.782op/s] or [-4.692%; +19.338%]
16.917ms 27.569ms 265.188op/s 18.707ms 30.235ms 247.094op/s
scenario:load:petclinic:code_origins:high_load better
[-1.454ms; -0.759ms] or [-7.962%; -4.155%]
better
[-1.546ms; -0.669ms] or [-5.245%; -2.269%]
unstable
[-1.655op/s; +47.831op/s] or [-0.658%; +19.002%]
17.155ms 28.372ms 274.806op/s 18.262ms 29.479ms 251.719op/s
Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.56.0-SNAPSHOT~34a6e6ba5a, baseline=1.56.0-SNAPSHOT~5db793a092
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.18 ms) : 1168, 1191
.   : milestone, 1180,
iast (3.253 ms) : 3206, 3300
.   : milestone, 3253,
iast_FULL (5.751 ms) : 5693, 5808
.   : milestone, 5751,
iast_GLOBAL (3.566 ms) : 3517, 3616
.   : milestone, 3566,
profiling (2.066 ms) : 2045, 2086
.   : milestone, 2066,
tracing (1.853 ms) : 1836, 1869
.   : milestone, 1853,
section candidate
no_agent (1.176 ms) : 1164, 1187
.   : milestone, 1176,
iast (3.159 ms) : 3122, 3196
.   : milestone, 3159,
iast_FULL (5.763 ms) : 5706, 5820
.   : milestone, 5763,
iast_GLOBAL (3.511 ms) : 3462, 3559
.   : milestone, 3511,
profiling (2.005 ms) : 1986, 2025
.   : milestone, 2005,
tracing (1.789 ms) : 1774, 1804
.   : milestone, 1789,
Loading
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.18 ms [1.168 ms, 1.191 ms] -
iast 3.253 ms [3.206 ms, 3.3 ms] 2.073 ms (175.7%)
iast_FULL 5.751 ms [5.693 ms, 5.808 ms] 4.571 ms (387.4%)
iast_GLOBAL 3.566 ms [3.517 ms, 3.616 ms] 2.386 ms (202.2%)
profiling 2.066 ms [2.045 ms, 2.086 ms] 885.759 µs (75.1%)
tracing 1.853 ms [1.836 ms, 1.869 ms] 672.645 µs (57.0%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.176 ms [1.164 ms, 1.187 ms] -
iast 3.159 ms [3.122 ms, 3.196 ms] 1.983 ms (168.6%)
iast_FULL 5.763 ms [5.706 ms, 5.82 ms] 4.587 ms (390.1%)
iast_GLOBAL 3.511 ms [3.462 ms, 3.559 ms] 2.335 ms (198.5%)
profiling 2.005 ms [1.986 ms, 2.025 ms] 829.468 µs (70.5%)
tracing 1.789 ms [1.774 ms, 1.804 ms] 613.057 µs (52.1%)
Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.56.0-SNAPSHOT~34a6e6ba5a, baseline=1.56.0-SNAPSHOT~5db793a092
    dateFormat X
    axisFormat %s
section baseline
no_agent (16.92 ms) : 16754, 17085
.   : milestone, 16920,
appsec (18.608 ms) : 18420, 18796
.   : milestone, 18608,
code_origins (18.543 ms) : 18357, 18728
.   : milestone, 18543,
iast (17.773 ms) : 17598, 17949
.   : milestone, 17773,
profiling (19.325 ms) : 19131, 19520
.   : milestone, 19325,
tracing (18.894 ms) : 18700, 19089
.   : milestone, 18894,
section candidate
no_agent (19.187 ms) : 18995, 19380
.   : milestone, 19187,
appsec (18.479 ms) : 18293, 18666
.   : milestone, 18479,
code_origins (17.527 ms) : 17351, 17703
.   : milestone, 17527,
iast (17.95 ms) : 17772, 18128
.   : milestone, 17950,
profiling (18.571 ms) : 18382, 18760
.   : milestone, 18571,
tracing (17.593 ms) : 17418, 17769
.   : milestone, 17593,
Loading
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 16.92 ms [16.754 ms, 17.085 ms] -
appsec 18.608 ms [18.42 ms, 18.796 ms] 1.688 ms (10.0%)
code_origins 18.543 ms [18.357 ms, 18.728 ms] 1.623 ms (9.6%)
iast 17.773 ms [17.598 ms, 17.949 ms] 853.555 µs (5.0%)
profiling 19.325 ms [19.131 ms, 19.52 ms] 2.406 ms (14.2%)
tracing 18.894 ms [18.7 ms, 19.089 ms] 1.974 ms (11.7%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 19.187 ms [18.995 ms, 19.38 ms] -
appsec 18.479 ms [18.293 ms, 18.666 ms] -708.218 µs (-3.7%)
code_origins 17.527 ms [17.351 ms, 17.703 ms] -1.66 ms (-8.7%)
iast 17.95 ms [17.772 ms, 18.128 ms] -1.238 ms (-6.5%)
profiling 18.571 ms [18.382 ms, 18.76 ms] -616.13 µs (-3.2%)
tracing 17.593 ms [17.418 ms, 17.769 ms] -1.594 ms (-8.3%)

Dacapo

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master dougqh/pending-trace-contention-reduction
git_commit_date 1762881452 1762892030
git_commit_sha 5db793a 34a6e6b
release_version 1.56.0-SNAPSHOT~5db793a092 1.56.0-SNAPSHOT~34a6e6ba5a
See matching parameters
Baseline Candidate
application biojava biojava
ci_job_date 1762894047 1762894047
ci_job_id 1228416163 1228416163
ci_pipeline_id 81995573 81995573
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-2-gb43nyma 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-2-gb43nyma 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 1 unstable metrics.

Execution time for biojava
gantt
    title biojava - execution time [CI 0.99] : candidate=1.56.0-SNAPSHOT~34a6e6ba5a, baseline=1.56.0-SNAPSHOT~5db793a092
    dateFormat X
    axisFormat %s
section baseline
no_agent (15.442 s) : 15442000, 15442000
.   : milestone, 15442000,
appsec (14.747 s) : 14747000, 14747000
.   : milestone, 14747000,
iast (18.96 s) : 18960000, 18960000
.   : milestone, 18960000,
iast_GLOBAL (18.127 s) : 18127000, 18127000
.   : milestone, 18127000,
profiling (15.542 s) : 15542000, 15542000
.   : milestone, 15542000,
tracing (14.833 s) : 14833000, 14833000
.   : milestone, 14833000,
section candidate
no_agent (14.924 s) : 14924000, 14924000
.   : milestone, 14924000,
appsec (15.132 s) : 15132000, 15132000
.   : milestone, 15132000,
iast (18.339 s) : 18339000, 18339000
.   : milestone, 18339000,
iast_GLOBAL (17.876 s) : 17876000, 17876000
.   : milestone, 17876000,
profiling (14.856 s) : 14856000, 14856000
.   : milestone, 14856000,
tracing (14.673 s) : 14673000, 14673000
.   : milestone, 14673000,
Loading
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 15.442 s [15.442 s, 15.442 s] -
appsec 14.747 s [14.747 s, 14.747 s] -695.0 ms (-4.5%)
iast 18.96 s [18.96 s, 18.96 s] 3.518 s (22.8%)
iast_GLOBAL 18.127 s [18.127 s, 18.127 s] 2.685 s (17.4%)
profiling 15.542 s [15.542 s, 15.542 s] 100.0 ms (0.6%)
tracing 14.833 s [14.833 s, 14.833 s] -609.0 ms (-3.9%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 14.924 s [14.924 s, 14.924 s] -
appsec 15.132 s [15.132 s, 15.132 s] 208.0 ms (1.4%)
iast 18.339 s [18.339 s, 18.339 s] 3.415 s (22.9%)
iast_GLOBAL 17.876 s [17.876 s, 17.876 s] 2.952 s (19.8%)
profiling 14.856 s [14.856 s, 14.856 s] -68.0 ms (-0.5%)
tracing 14.673 s [14.673 s, 14.673 s] -251.0 ms (-1.7%)
Execution time for tomcat
gantt
    title tomcat - execution time [CI 0.99] : candidate=1.56.0-SNAPSHOT~34a6e6ba5a, baseline=1.56.0-SNAPSHOT~5db793a092
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.472 ms) : 1461, 1484
.   : milestone, 1472,
appsec (3.652 ms) : 3439, 3866
.   : milestone, 3652,
iast (2.212 ms) : 2148, 2275
.   : milestone, 2212,
iast_GLOBAL (2.253 ms) : 2189, 2317
.   : milestone, 2253,
profiling (2.067 ms) : 2015, 2119
.   : milestone, 2067,
tracing (2.025 ms) : 1976, 2075
.   : milestone, 2025,
section candidate
no_agent (1.47 ms) : 1458, 1481
.   : milestone, 1470,
appsec (3.699 ms) : 3479, 3919
.   : milestone, 3699,
iast (2.201 ms) : 2138, 2264
.   : milestone, 2201,
iast_GLOBAL (2.257 ms) : 2193, 2321
.   : milestone, 2257,
profiling (2.048 ms) : 1997, 2100
.   : milestone, 2048,
tracing (2.012 ms) : 1963, 2061
.   : milestone, 2012,
Loading
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.472 ms [1.461 ms, 1.484 ms] -
appsec 3.652 ms [3.439 ms, 3.866 ms] 2.18 ms (148.1%)
iast 2.212 ms [2.148 ms, 2.275 ms] 739.234 µs (50.2%)
iast_GLOBAL 2.253 ms [2.189 ms, 2.317 ms] 781.039 µs (53.0%)
profiling 2.067 ms [2.015 ms, 2.119 ms] 594.735 µs (40.4%)
tracing 2.025 ms [1.976 ms, 2.075 ms] 553.121 µs (37.6%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.47 ms [1.458 ms, 1.481 ms] -
appsec 3.699 ms [3.479 ms, 3.919 ms] 2.229 ms (151.7%)
iast 2.201 ms [2.138 ms, 2.264 ms] 731.031 µs (49.7%)
iast_GLOBAL 2.257 ms [2.193 ms, 2.321 ms] 787.473 µs (53.6%)
profiling 2.048 ms [1.997 ms, 2.1 ms] 578.842 µs (39.4%)
tracing 2.012 ms [1.963 ms, 2.061 ms] 542.367 µs (36.9%)

}

private PublishState decrementRefAndMaybeWrite(boolean isRootSpan) {
private PublishState decrementRefAndMaybeWrite(boolean isRootSpan, boolean addedSpan) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: Maybe rename addedSpan for allowPartialWrite

}

private PublishState decrementRefAndMaybeWrite(boolean isRootSpan) {
private PublishState decrementRefAndMaybeWrite(boolean isRootSpan, boolean addedSpan) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks to me like clever trick. This changes a bit the write dynamic, where the next chance to write is when a new span is added, or when the root span is finished (and the other queueing states). I believe this is good. I've seen some instrumentations like aerospike that explicitly cancel the "continuation", but I don't think this is an issue.

Comment on lines +289 to +297
// DQH - We only trigger a partial flush, when a span has just been added
// This prevents a bunch of threads which are only performing scope/context operations
// from all fighting to perform the partialFlush after the threshold is crossed.

// This is an important optimization for virtual threads where a continuation might
// be created even though no span is created. In that situation, virtual threads
// can end up fighting to perform the partialFlush. And even trying to perform a
// partialFlush requires taking the PendingTrace lock which can lead to unmounting
// the virtual thread from its carrier thread.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not know whole picture, but just from my experience, what if we should not fight for flush at all?
Maybe we can refactor logic that all threads that interested in flushing would just set some flag to true and some background tread would check it and periodically flush data?
Does it make sense at all?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe instead of boolean flag, counter would be better solution to implement. Also it would be useful info to know how many times flush was requested.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'm inclined to agree. This was mostly intended as a quick fix / experiment to see if we could improve the reported issue. In my macrobenchmark, I did see a 2% throughput improvement but haven't yet replicated the stall that was reported.

I do like the idea of flipping a boolean. I also don't like that we're taking a long held lock in the application critical path, so there's definitely still a lot of room for improvement.

@dougqh
Copy link
Contributor Author

dougqh commented Nov 13, 2025

I think it's a good optimization.

Looking forward to seeing the microbenchmark results, I suspect it will show a positive improvement when there are a lot of context migrations.

I did a quick stand-alone macrobenchmark. The macrobenchmark shows a modest but consistent 2% reduction in execution time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp: core Tracer core tag: performance Performance related changes type: enhancement Enhancements and improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants