-
Notifications
You must be signed in to change notification settings - Fork 648
Procedures: fix scheduling #3704
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
79f76cb to
b31ec3a
Compare
b31ec3a to
3fa9650
Compare
gefjon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This all looks reasonable to me, but I'm not super familiar with our scheduling code. I'd like to get @Shubham8287 's review before merging, if possible.
|
It occurs to me - for a procedure, you have to delete the schedule row before invoking, not after. If execution aborts (host crash or w/e) midway through a scheduled procedure, it's not correct to retry that procedure on restart the way it is for a scheduled reducer, so you need the row to be gone by the time the procedure performs any observable side effect. I think that this means that we should:
There's some additional complexity here that we'll get to if/when we want to implement on-abort handling, but we haven't been considering that part of the MVP. |
This is a atmost once gurantee then, we have to also cosider for the case when Host fatals or shutdown after row deletion, procedure will never run. Edit: nvm, just read, you mentioned it is not required to retry for this case. |
gefjon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:excellent:
84decda to
b5f4468
Compare
b5f4468 to
d52fcfb
Compare
12e5f79 to
a722cab
Compare
This reverts commit b2e37e8. # Description of Changes <!-- Please describe your change, mention any related tickets, and so on here. --> Reverts #3704 which I'm pretty sure contains some sort of bug which is causing the smoketests to hang. # API and ABI breaking changes None <!-- If this is an API or ABI breaking change, please apply the corresponding GitHub label. --> # Expected complexity level and risk 1 <!-- How complicated do you think these changes are? Grade on a scale from 1 to 5, where 1 is a trivial change, and 5 is a deep-reaching and complex change. This complexity rating applies not only to the complexity apparent in the diff, but also to its interactions with existing and future code. If you answered more than a 2, explain what is complex about the PR, and what other components it interacts with in potentially concerning ways. --> # Testing <!-- Describe any testing you've done, and any testing you'd like your reviewers to do, so that you're confident that all the changes work as expected! --> - [x] CI passing again
Description of Changes
Reworks how
SchedulerActor::handle_queuedworks so that it first determines the parameters of the call to a reducer or the parameters of the call to the procedure. This also enables the removal of the special casecall_scheduled_reducer.Fixes #3645.
API and ABI breaking changes
None
Expected complexity level and risk
2
Testing
A test
schedule_procedureis added.