-
-
Notifications
You must be signed in to change notification settings - Fork 11.5k
[Misc] Force _init_reorder_batch_threshold to be called and make get_reorder_batch_threshold an instance property
#28750
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request effectively refactors the handling of reorder_batch_threshold across various attention backends. The changes successfully enforce a consistent initialization pattern by removing the class variable and introducing an instance property _reorder_batch_threshold set via _init_reorder_batch_threshold. This improves code clarity and maintainability. The new public getter get_reorder_batch_threshold and the added assertions ensure that the threshold is properly initialized and accessed. The centralization of the Decode Context Parallelism (DCP) logic is also a good improvement. The changes are consistent and well-executed across all affected files. I have not found any high or critical issues in this pull request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| min_none_high = lambda a, b: a if b is None else b if a is None else min(a, b) | ||
|
|
||
| reorder_batch_thresholds = [ | ||
| group.get_metadata_builder().reorder_batch_threshold | ||
| group.get_metadata_builder().get_reorder_batch_threshold() | ||
| for group in self._attn_group_iterator() | ||
| ] | ||
| # If there are no attention groups (attention-free model) or no backend |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deepseek indexer no longer reports its reorder threshold
The new logic collects per-backend reorder thresholds via get_reorder_batch_threshold(), but AttentionMetadataBuilder.get_reorder_batch_threshold only returns a value when subclasses set _reorder_batch_threshold via _init_reorder_batch_threshold. DeepseekV32IndexerMetadataBuilder still relies on the old reorder_batch_threshold attribute and never calls _init_reorder_batch_threshold, so the getter returns None. After this change the model runner will never reorder batches for that backend even though DeepseekV32IndexerMetadataBuilder.build() still splits decodes/prefills using reorder_batch_threshold and assumes the batch has already been reordered. Mixed decode/prefill batches will therefore be processed with mismatched ranges. Consider updating the builder to initialize _reorder_batch_threshold (or make the getter fall back to the legacy attribute) so the runner keeps reordering as before.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@LucasWilkinson I think this comment makes sense.
_init_reorder_batch_threshold to be called and make get_reorder_batch_threshold an instance property_init_reorder_batch_threshold to be called and make get_reorder_batch_threshold an instance property
heheda12345
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like the high-level idea. But what will happen if a new backend forgets to call _init_reorder_batch_threshold?
|
This pull request has merge conflicts that must be resolved before it can be |
reorder_batch_thresholdhas become much more dynamic in nature with the introduction of spec-decode support in many backends that split decodes and prefills and DCP. This was partially addressed with the introduction of_init_reorder_batch_thresholdhowever this leftreorder_batch_thresholdin a slightly confusion place where some backends would forget to call_init_reorder_batch_thresholddespite needing to; or exposing one via the a ClassVar only to update it later (reducing readability).This PR is an attempt to force all backends to use
_init_reorder_batch_thresholdso we have more consistency.