Skip to content

Conversation

@LucasWilkinson
Copy link
Collaborator

@LucasWilkinson LucasWilkinson commented Nov 14, 2025

reorder_batch_threshold has become much more dynamic in nature with the introduction of spec-decode support in many backends that split decodes and prefills and DCP. This was partially addressed with the introduction of _init_reorder_batch_threshold however this left reorder_batch_threshold in a slightly confusion place where some backends would forget to call _init_reorder_batch_threshold despite needing to; or exposing one via the a ClassVar only to update it later (reducing readability).

This PR is an attempt to force all backends to use _init_reorder_batch_threshold so we have more consistency.

Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request effectively refactors the handling of reorder_batch_threshold across various attention backends. The changes successfully enforce a consistent initialization pattern by removing the class variable and introducing an instance property _reorder_batch_threshold set via _init_reorder_batch_threshold. This improves code clarity and maintainability. The new public getter get_reorder_batch_threshold and the added assertions ensure that the threshold is properly initialized and accessed. The centralization of the Decode Context Parallelism (DCP) logic is also a good improvement. The changes are consistent and well-executed across all affected files. I have not found any high or critical issues in this pull request.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines 4338 to 4344
min_none_high = lambda a, b: a if b is None else b if a is None else min(a, b)

reorder_batch_thresholds = [
group.get_metadata_builder().reorder_batch_threshold
group.get_metadata_builder().get_reorder_batch_threshold()
for group in self._attn_group_iterator()
]
# If there are no attention groups (attention-free model) or no backend

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Deepseek indexer no longer reports its reorder threshold

The new logic collects per-backend reorder thresholds via get_reorder_batch_threshold(), but AttentionMetadataBuilder.get_reorder_batch_threshold only returns a value when subclasses set _reorder_batch_threshold via _init_reorder_batch_threshold. DeepseekV32IndexerMetadataBuilder still relies on the old reorder_batch_threshold attribute and never calls _init_reorder_batch_threshold, so the getter returns None. After this change the model runner will never reorder batches for that backend even though DeepseekV32IndexerMetadataBuilder.build() still splits decodes/prefills using reorder_batch_threshold and assumes the batch has already been reordered. Mixed decode/prefill batches will therefore be processed with mismatched ranges. Consider updating the builder to initialize _reorder_batch_threshold (or make the getter fall back to the legacy attribute) so the runner keeps reordering as before.

Useful? React with 👍 / 👎.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LucasWilkinson I think this comment makes sense.

@LucasWilkinson LucasWilkinson changed the title [WIP] Force _init_reorder_batch_threshold to be called and make get_reorder_batch_threshold an instance property [Misc] Force _init_reorder_batch_threshold to be called and make get_reorder_batch_threshold an instance property Nov 17, 2025
@LucasWilkinson LucasWilkinson added the ready ONLY add when PR is ready to merge/full CI is needed label Nov 18, 2025
Copy link
Collaborator

@heheda12345 heheda12345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like the high-level idea. But what will happen if a new backend forgets to call _init_reorder_batch_threshold?

@mergify
Copy link

mergify bot commented Nov 21, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @LucasWilkinson.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Nov 21, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

needs-rebase nvidia ready ONLY add when PR is ready to merge/full CI is needed v1

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

2 participants