Optimize exit code handling by relying on scheduler status for successful executions #6484
Signed-off-by: jorgee <jorge.ejarque@seqera.io>
@bentsherman @pditommaso the Google Batch task handler is getting the exit code directly from the file (#6481). This PR also removes the fallback to .exitcode when the exit code is 0. It is not a big PR, but I was wondering whether you would prefer to split it into two PRs to facilitate backporting to stable versions. The fix for #6481 is just the first commit, ea1aa48.
pditommaso
left a comment
This looks good, but I'd keep it for after 25.10.
@jorgee we talked and agreed to include the Google and K8s fixes in 25.10 and merge the rest of this PR in the next edge release.
Fix test method names introduced in PR #6484:
- deletePodIfSuccessful -> deleteJobIfSuccessful
- savePodLogOnError -> saveJobLogOnError
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Summary
This PR unifies exit code handling behavior across all cloud provider executors (AWS Batch, Azure Batch, Google Batch, and Kubernetes). Previously, different executors had inconsistent approaches to obtaining task exit codes, which led to issues like missing Fusion exit codes (#6481) and unnecessary I/O overhead (#6445).
Changes
Unified Exit Code Strategy
All cloud providers now follow a consistent two-step approach:
.exitcode file only when API returns null (not when it returns 0)
Fixes by Provider
.exitcode file
taskExecution.exitCode), fallback to file if null
Benefits
.exitcode file reads for successful tasks (exit code 0)
Closes
Test Coverage
Added comprehensive unit tests for all affected handlers verifying:
.exitcode file when API returns null
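The strategy described under Unified Exit Code Strategy can be illustrated with a minimal sketch. It is written in Python for brevity and is not the Nextflow implementation; the function names `resolve_exit_code` and `read_exit_file` and the `work_dir` parameter are hypothetical. The point it shows is that the check must be "is the API value null", not "is the API value falsy", so that an exit code of 0 reported by the scheduler never triggers a file read.

```python
from pathlib import Path
from typing import Optional


def read_exit_file(work_dir: Path) -> Optional[int]:
    """Return the integer stored in <work_dir>/.exitcode, or None if the
    file is missing or does not contain an integer (hypothetical helper)."""
    try:
        return int((work_dir / ".exitcode").read_text().strip())
    except (OSError, ValueError):
        return None


def resolve_exit_code(api_exit_code: Optional[int], work_dir: Path) -> Optional[int]:
    """Pick the task exit code, preferring the value reported by the scheduler."""
    # Step 1: trust the scheduler API whenever it reported a value.
    # 0 is a valid answer here, so compare against None explicitly;
    # writing `api_exit_code or read_exit_file(...)` would treat 0 as missing.
    if api_exit_code is not None:
        return api_exit_code
    # Step 2: fall back to the .exitcode file only when the API returned nothing.
    return read_exit_file(work_dir)
```

With this shape, a successful task costs no extra file read, while a task whose status carries no exit code still gets one from the work directory if the file exists.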