-
Notifications
You must be signed in to change notification settings - Fork 373
If no FailureLines exist for a failure, fallback to TestLogFailure #9079
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #9079 +/- ##
==========================================
- Coverage 80.26% 80.10% -0.16%
==========================================
Files 596 596
Lines 32182 32265 +83
Branches 3276 3269 -7
==========================================
+ Hits 25830 25845 +15
- Misses 6184 6252 +68
Partials 168 168 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
jmaher
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems that this is solving for the case where we are pending log parsing- but that is a normal thing, yet probably not so misleading. If you think this will make push health more responsive and accurate (even the perception), then let me know
| total_jobs_for_type, | ||
| is_investigated, | ||
| investigated_test_id, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would this end up with a duplicate entry, assuming we have:
TEST-UNEXPECTED-FAIL | test_1.js | timed out
TEST-CRASH | | test_1.js
in the end we don't need both, I will be curious where this ends up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Taking a look now...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like we'd just get whichever one comes first. So likely the timeout. Would it be better to show the crash instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
heh, whatever has the most information. I think a crash is a better signal than timeout the majority of the time. I guess we could replace (add/delete) if a crash comes in after the timeout?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. I made the change to use the crash, rather than timeout if there is both.
This is a really good point. I added logic to check if the job's log parsing is still |
An experiment to see how this looks.
Here were the options Claude proposed. I went with Option 1 in this PR:
Option 1: Make Push Health use TextLogError as a fallback ✅ RECOMMENDED
Option 2: Investigate and fix FailureLine generation
Option 3: Create FailureLine objects from TextLogError
My Recommendation
Start with Option 1 (fallback to TextLogError) because: