-
Notifications
You must be signed in to change notification settings - Fork 468
chore(profiling): fix flaky test_upload_resets_profile test
#15178
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore(profiling): fix flaky test_upload_resets_profile test
#15178
Conversation
|
|
80ead8a to
e91f6b3
Compare
297db08 to
006d5a3
Compare
Bootstrap import analysisComparison of import times between this PR and base. SummaryThe average import time from this PR is: 241 ± 4 ms. The average import time from base is: 244 ± 4 ms. The import time difference between this PR and base is: -2.9 ± 0.2 ms. Import time breakdownThe following import paths have shrunk:
|
Performance SLOsComparing candidate vlad/lockprof-flaky-test_upload_resets_profile (2105aad) with baseline main (548d46a) 🟡 Near SLO Breach (6 suites)🟡 djangosimple - 30/30✅ appsecTime: ✅ 19.263ms (SLO: <22.300ms 📉 -13.6%) vs baseline: +0.1% Memory: ✅ 65.964MB (SLO: <67.000MB 🟡 -1.5%) vs baseline: +4.9% ✅ exception-replay-enabledTime: ✅ 1.336ms (SLO: <1.450ms -7.9%) vs baseline: -0.3% Memory: ✅ 64.114MB (SLO: <67.000MB -4.3%) vs baseline: +5.0% ✅ iastTime: ✅ 19.254ms (SLO: <22.250ms 📉 -13.5%) vs baseline: ~same Memory: ✅ 65.932MB (SLO: <67.000MB 🟡 -1.6%) vs baseline: +4.9% ✅ profilerTime: ✅ 15.388ms (SLO: <16.550ms -7.0%) vs baseline: +0.2% Memory: ✅ 53.798MB (SLO: <54.500MB 🟡 -1.3%) vs baseline: +4.7% ✅ resource-renamingTime: ✅ 19.335ms (SLO: <21.750ms 📉 -11.1%) vs baseline: -0.2% Memory: ✅ 65.968MB (SLO: <67.000MB 🟡 -1.5%) vs baseline: +5.0% ✅ span-code-originTime: ✅ 22.769ms (SLO: <28.200ms 📉 -19.3%) vs baseline: -0.6% Memory: ✅ 67.357MB (SLO: <69.500MB -3.1%) vs baseline: +5.1% ✅ tracerTime: ✅ 19.216ms (SLO: <21.750ms 📉 -11.6%) vs baseline: -0.3% Memory: ✅ 65.901MB (SLO: <67.000MB 🟡 -1.6%) vs baseline: +4.8% ✅ tracer-and-profilerTime: ✅ 21.128ms (SLO: <23.500ms 📉 -10.1%) vs baseline: -0.6% Memory: ✅ 67.898MB (SLO: <68.000MB 🟡 -0.2%) vs baseline: +5.3% ✅ tracer-dont-create-db-spansTime: ✅ 19.202ms (SLO: <21.500ms 📉 -10.7%) vs baseline: -0.2% Memory: ✅ 65.966MB (SLO: <67.000MB 🟡 -1.5%) vs baseline: +4.8% ✅ tracer-minimalTime: ✅ 16.636ms (SLO: <17.500ms -4.9%) vs baseline: ~same Memory: ✅ 65.935MB (SLO: <67.000MB 🟡 -1.6%) vs baseline: +4.8% ✅ tracer-nativeTime: ✅ 19.303ms (SLO: <21.750ms 📉 -11.2%) vs baseline: +0.6% Memory: ✅ 67.555MB (SLO: <72.500MB -6.8%) vs baseline: +4.9% ✅ tracer-no-cachesTime: ✅ 17.266ms (SLO: <19.650ms 📉 -12.1%) vs baseline: ~same Memory: ✅ 65.976MB (SLO: <67.000MB 🟡 -1.5%) vs baseline: +5.0% ✅ tracer-no-databasesTime: ✅ 18.738ms (SLO: <20.100ms -6.8%) vs baseline: ~same Memory: ✅ 65.849MB (SLO: <67.000MB 🟡 -1.7%) vs baseline: +4.7% ✅ tracer-no-middlewareTime: ✅ 18.920ms (SLO: <21.500ms 📉 -12.0%) vs baseline: -0.5% Memory: ✅ 65.912MB (SLO: <67.000MB 🟡 -1.6%) vs baseline: +4.7% ✅ tracer-no-templatesTime: ✅ 19.069ms (SLO: <22.000ms 📉 -13.3%) vs baseline: -0.3% Memory: ✅ 65.839MB (SLO: <67.000MB 🟡 -1.7%) vs baseline: +4.6% 🟡 errortrackingdjangosimple - 6/6✅ errortracking-enabled-allTime: ✅ 16.306ms (SLO: <19.850ms 📉 -17.9%) vs baseline: -0.1% Memory: ✅ 65.658MB (SLO: <66.500MB 🟡 -1.3%) vs baseline: +5.0% ✅ errortracking-enabled-userTime: ✅ 16.331ms (SLO: <19.400ms 📉 -15.8%) vs baseline: -0.3% Memory: ✅ 65.765MB (SLO: <66.500MB 🟡 -1.1%) vs baseline: +5.1% ✅ tracer-enabledTime: ✅ 16.349ms (SLO: <19.450ms 📉 -15.9%) vs baseline: ~same Memory: ✅ 65.712MB (SLO: <66.500MB 🟡 -1.2%) vs baseline: +5.0% 🟡 errortrackingflasksqli - 6/6✅ errortracking-enabled-allTime: ✅ 2.067ms (SLO: <2.300ms 📉 -10.1%) vs baseline: -0.1% Memory: ✅ 52.531MB (SLO: <53.500MB 🟡 -1.8%) vs baseline: +5.1% ✅ errortracking-enabled-userTime: ✅ 2.071ms (SLO: <2.250ms -7.9%) vs baseline: +0.1% Memory: ✅ 52.566MB (SLO: <53.500MB 🟡 -1.7%) vs baseline: +5.4% ✅ tracer-enabledTime: ✅ 2.078ms (SLO: <2.300ms -9.7%) vs baseline: +0.2% Memory: ✅ 52.593MB (SLO: <53.500MB 🟡 -1.7%) vs baseline: +5.1% 🟡 flasksimple - 18/18✅ appsec-getTime: ✅ 4.591ms (SLO: <4.750ms -3.3%) vs baseline: ~same Memory: ✅ 62.096MB (SLO: <65.000MB -4.5%) vs baseline: +5.0% ✅ appsec-postTime: ✅ 6.613ms (SLO: <6.750ms -2.0%) vs baseline: ~same Memory: ✅ 62.057MB (SLO: <65.000MB -4.5%) vs baseline: +4.9% ✅ appsec-telemetryTime: ✅ 4.583ms (SLO: <4.750ms -3.5%) vs baseline: +0.2% Memory: ✅ 62.014MB (SLO: <65.000MB -4.6%) vs baseline: +5.0% ✅ debuggerTime: ✅ 1.851ms (SLO: <2.000ms -7.5%) vs baseline: ~same Memory: ✅ 45.229MB (SLO: <47.000MB -3.8%) vs baseline: +5.0% ✅ iast-getTime: ✅ 1.856ms (SLO: <2.000ms -7.2%) vs baseline: +0.3% Memory: ✅ 42.139MB (SLO: <49.000MB 📉 -14.0%) vs baseline: +4.9% ✅ profilerTime: ✅ 1.913ms (SLO: <2.100ms -8.9%) vs baseline: +0.2% Memory: ✅ 46.485MB (SLO: <47.000MB 🟡 -1.1%) vs baseline: +4.8% ✅ resource-renamingTime: ✅ 3.360ms (SLO: <3.650ms -7.9%) vs baseline: ~same Memory: ✅ 52.600MB (SLO: <53.500MB 🟡 -1.7%) vs baseline: +5.4% ✅ tracerTime: ✅ 3.356ms (SLO: <3.650ms -8.1%) vs baseline: -0.2% Memory: ✅ 52.405MB (SLO: <53.500MB -2.0%) vs baseline: +5.0% ✅ tracer-nativeTime: ✅ 3.359ms (SLO: <3.650ms -8.0%) vs baseline: +0.4% Memory: ✅ 53.975MB (SLO: <60.000MB 📉 -10.0%) vs baseline: +4.7% 🟡 recursivecomputation - 8/8✅ deepTime: ✅ 309.151ms (SLO: <320.950ms -3.7%) vs baseline: ~same Memory: ✅ 32.735MB (SLO: <34.500MB -5.1%) vs baseline: +5.0% ✅ deep-profiledTime: ✅ 328.220ms (SLO: <359.150ms -8.6%) vs baseline: -0.3% Memory: ✅ 38.365MB (SLO: <39.000MB 🟡 -1.6%) vs baseline: +6.2% ✅ mediumTime: ✅ 6.996ms (SLO: <7.400ms -5.5%) vs baseline: -0.2% Memory: ✅ 31.516MB (SLO: <34.000MB -7.3%) vs baseline: +4.7% ✅ shallowTime: ✅ 0.938ms (SLO: <1.050ms 📉 -10.6%) vs baseline: -0.2% Memory: ✅ 31.556MB (SLO: <34.000MB -7.2%) vs baseline: +4.9% 🟡 telemetryaddmetric - 30/30✅ 1-count-metric-1-timesTime: ✅ 3.013µs (SLO: <20.000µs 📉 -84.9%) vs baseline: +0.6% Memory: ✅ 31.497MB (SLO: <34.000MB -7.4%) vs baseline: +4.9% ✅ 1-count-metrics-100-timesTime: ✅ 203.751µs (SLO: <220.000µs -7.4%) vs baseline: +0.7% Memory: ✅ 31.497MB (SLO: <34.000MB -7.4%) vs baseline: +4.8% ✅ 1-distribution-metric-1-timesTime: ✅ 3.441µs (SLO: <20.000µs 📉 -82.8%) vs baseline: +2.6% Memory: ✅ 31.536MB (SLO: <34.000MB -7.2%) vs baseline: +5.1% ✅ 1-distribution-metrics-100-timesTime: ✅ 216.617µs (SLO: <220.000µs 🟡 -1.5%) vs baseline: +0.9% Memory: ✅ 31.516MB (SLO: <34.000MB -7.3%) vs baseline: +4.8% ✅ 1-gauge-metric-1-timesTime: ✅ 2.213µs (SLO: <20.000µs 📉 -88.9%) vs baseline: -1.3% Memory: ✅ 31.497MB (SLO: <34.000MB -7.4%) vs baseline: +4.7% ✅ 1-gauge-metrics-100-timesTime: ✅ 136.586µs (SLO: <150.000µs -8.9%) vs baseline: -0.9% Memory: ✅ 31.477MB (SLO: <34.000MB -7.4%) vs baseline: +4.8% ✅ 1-rate-metric-1-timesTime: ✅ 3.221µs (SLO: <20.000µs 📉 -83.9%) vs baseline: +1.6% Memory: ✅ 31.516MB (SLO: <34.000MB -7.3%) vs baseline: +4.7% ✅ 1-rate-metrics-100-timesTime: ✅ 215.165µs (SLO: <250.000µs 📉 -13.9%) vs baseline: ~same Memory: ✅ 31.536MB (SLO: <34.000MB -7.2%) vs baseline: +4.6% ✅ 100-count-metrics-100-timesTime: ✅ 20.428ms (SLO: <22.000ms -7.1%) vs baseline: +0.4% Memory: ✅ 31.575MB (SLO: <34.000MB -7.1%) vs baseline: +5.1% ✅ 100-distribution-metrics-100-timesTime: ✅ 2.267ms (SLO: <2.300ms 🟡 -1.4%) vs baseline: -0.7% Memory: ✅ 31.477MB (SLO: <34.000MB -7.4%) vs baseline: +4.5% ✅ 100-gauge-metrics-100-timesTime: ✅ 1.417ms (SLO: <1.550ms -8.6%) vs baseline: +0.9% Memory: ✅ 31.575MB (SLO: <34.000MB -7.1%) vs baseline: +5.1% ✅ 100-rate-metrics-100-timesTime: ✅ 2.241ms (SLO: <2.550ms 📉 -12.1%) vs baseline: +1.3% Memory: ✅ 31.556MB (SLO: <34.000MB -7.2%) vs baseline: +4.9% ✅ flush-1-metricTime: ✅ 4.774µs (SLO: <20.000µs 📉 -76.1%) vs baseline: +4.1% Memory: ✅ 31.516MB (SLO: <34.000MB -7.3%) vs baseline: +4.8% ✅ flush-100-metricsTime: ✅ 175.311µs (SLO: <250.000µs 📉 -29.9%) vs baseline: +0.6% Memory: ✅ 31.870MB (SLO: <34.000MB -6.3%) vs baseline: +4.7% ✅ flush-1000-metricsTime: ✅ 2.121ms (SLO: <2.500ms 📉 -15.2%) vs baseline: -0.2% Memory: ✅ 32.696MB (SLO: <34.500MB -5.2%) vs baseline: +5.0%
|
006d5a3 to
8ae5e07
Compare
9babe05 to
29df2d1
Compare
29df2d1 to
fc96b94
Compare
579db0e to
0c6d740
Compare
0c6d740 to
779a4f3
Compare
Pull request was converted to draft
779a4f3 to
1785fe5
Compare
pprof_utils.parse_newest_profile
1785fe5 to
85ed1f5
Compare
85ed1f5 to
7494df4
Compare
pprof_utils.parse_newest_profiletest_upload_resets_profile profiling test by using sequence numbers instead of filesystem timestamps
7494df4 to
2105aad
Compare
test_upload_resets_profile profiling test by using sequence numbers instead of filesystem timestampstest_upload_resets_profile profiling test
test_upload_resets_profile profiling testtest_upload_resets_profile test
test_upload_resets_profile testtest_upload_resets_profile test
Indeed, this isn't the case anymore - no sleeps introduced in the fix. |
https://datadoghq.atlassian.net/browse/PROF-12725
Description
Some tests try to read the newest profile and act on it. One such test,
test_upload_resets_profile, is flaky on MacOS (but strangely not on Linux in the CI). It sometimes reads an empty profile (which is expected), and sometimes does not (the empty-file assert is not triggered and test fails).The current way of finding the latest file in
pprof_utils.parse_newest_profileis error-prone because the latest filectime, metadata change time, is not always a true "latest" file; this can happen for a variety of reasons, e.g. one file gets written faster than the other one, because it's smaller ("empty").Example:
file.foo.1created att0(sequence: 1)file.foo.0created att1(sequence: 0)Logically,
file_1is the latest file, because that is the sequence number assigned to it; however, currently we think it isfile_0(t1 > t0).What changed
ctime, we sort by the "logical" time (i.e. the file's sequence number), which is monotonically increasing and deterministic.Testing