Resolve issue with async scheduling when decode and prompt tokens are mixed #642

tianmu-li · 2025-11-26T19:30:14Z

With async scheduling, tokens from the previous batch cannot be copied using `:num_decodes` when decode tokens are not strictly before prompt tokens in the batch.
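A minimal sketch of the failure mode described above, not the code from this PR. Plain Python lists stand in for device tensors, and `PLACEHOLDER`, `copy_by_slice`, and `copy_by_index` are hypothetical names introduced only for illustration.

```python
# Illustrative only: plain lists stand in for device tensors.
PLACEHOLDER = -1  # hypothetical marker for a decode slot whose token is not yet known


def copy_by_slice(input_ids, prev_tokens):
    """Copy previous-batch tokens assuming decodes fill the first num_decodes slots."""
    out = list(input_ids)
    num_decodes = len(prev_tokens)
    out[:num_decodes] = prev_tokens
    return out


def copy_by_index(input_ids, prev_tokens, decode_index):
    """Copy previous-batch tokens to the explicit positions of the decode slots."""
    out = list(input_ids)
    for pos, tok in zip(decode_index, prev_tokens):
        out[pos] = tok
    return out


# Mixed batch: prompt tokens (101, 102, 103) sit between two decode slots.
input_ids = [PLACEHOLDER, 101, 102, 103, PLACEHOLDER]
prev_tokens = [7, 9]    # tokens sampled in the previous step
decode_index = [0, 4]   # where the decode slots actually are

sliced = copy_by_slice(input_ids, prev_tokens)
indexed = copy_by_index(input_ids, prev_tokens, decode_index)

print(sliced)   # [7, 9, 102, 103, -1]  -> prompt token 101 overwritten, last slot unfilled
print(indexed)  # [7, 101, 102, 103, 9] -> every decode slot filled, prompt untouched
```

The slice version is only correct when the decode slots form a contiguous prefix of the batch, which is the assumption that breaks once decode and prompt tokens are mixed.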

Copilot

Pull request overview

This PR resolves an issue with async scheduling when decode and prompt tokens are mixed in a batch. The fix ensures that tokens from the previous batch can be correctly copied to their target positions when decode tokens are not strictly positioned before prompt tokens.

Key Changes:

- Modified `_prepare_input_ids` to optionally return an index tensor for reordered batches
- Updated `create_unified_batch` to accept and use decode indices for correct token placement
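The two changes listed above could fit together roughly as follows. This is a hedged sketch under stated assumptions, not the PR's implementation: the real functions operate on tensors inside the model runner, while `prepare_input_ids_sketch` and `place_prev_tokens` are hypothetical stand-ins that use lists and a per-row `is_decode` flag.

```python
from typing import List, Optional, Tuple


def prepare_input_ids_sketch(
    is_decode: List[bool], return_index: bool = False
) -> Tuple[int, Optional[List[int]]]:
    """Hypothetical helper: report where the decode rows sit in the batch.

    Returns (num_decodes, decode_index). decode_index is None unless
    return_index is set and the decode rows are not a contiguous prefix.
    """
    positions = [i for i, flag in enumerate(is_decode) if flag]
    num_decodes = len(positions)
    reordered = positions != list(range(num_decodes))
    if return_index and reordered:
        return num_decodes, positions
    return num_decodes, None


def place_prev_tokens(
    input_ids: List[int],
    prev_tokens: List[int],
    decode_index: Optional[List[int]] = None,
) -> List[int]:
    """Hypothetical consumer: use indices when given, else the sequential slice."""
    out = list(input_ids)
    if decode_index is None:
        out[: len(prev_tokens)] = prev_tokens
    else:
        for pos, tok in zip(decode_index, prev_tokens):
            out[pos] = tok
    return out


# Decodes first: no index is needed and the slice path is used.
n, idx = prepare_input_ids_sketch([True, True, False], return_index=True)
ordered = place_prev_tokens([-1, -1, 101], [7, 9], idx)

# Decodes mixed with a prompt row: the index is returned and used.
n_mixed, idx_mixed = prepare_input_ids_sketch([True, False, True], return_index=True)
mixed = place_prev_tokens([-1, 101, -1], [7, 9], idx_mixed)
```

Returning `None` in the ordered case keeps the original slice path untouched, so only reordered batches take the indexed copy.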

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `vllm_gaudi/v1/worker/hpu_model_runner.py` | Added `return_index` parameter to `_prepare_input_ids` and logic to return index tensor when batch is reordered; updated batch preparation to pass decode index |
| `vllm_gaudi/extension/unified_batch.py` | Added `decode_index` parameter and conditional logic to copy tokens using indices instead of assuming sequential placement |

Signed-off-by: Tianmu Li <tianmu.li@intel.com>

github-actions · 2025-11-26T20:40:25Z

✅ CI Passed

All checks passed successfully against the following vllm commit:
0353d2e162cbda776d9dbfe026e65303204a7f1f

… mixed (vllm-project#642) When decode tokens are not strictly before prompt tokens, tokens from the previous batch cannot be copied using :num_decodes when using async scheduling. --------- Signed-off-by: Tianmu Li <tianmu.li@intel.com>

… mixed (#678) Cherrypick of #642 Signed-off-by: Tianmu Li <tianmu.li@intel.com>

Resolve issue with async scheduling when decode and prompt tokens are…

24a0a71

… mixed Signed-off-by: Tianmu Li <tianmu.li@intel.com>

Copilot AI review requested due to automatic review settings November 26, 2025 19:30

tianmu-li requested review from adobrzyn, afierka-intel, iboiko-habana, kamil-kaczor, ksmusz, kzawora-intel, mgawarkiewicz-intel, michalkuligowski, vivekgoe and xuechendi as code owners November 26, 2025 19:30

Copilot AI reviewed Nov 26, 2025

vllm_gaudi/v1/worker/hpu_model_runner.py (outdated, resolved)

vllm_gaudi/extension/unified_batch.py (outdated, resolved)

copilot fixes

39c394b

Signed-off-by: Tianmu Li <tianmu.li@intel.com>

adobrzyn approved these changes Dec 3, 2025

adobrzyn merged commit 927dafa into vllm-project:main Dec 3, 2025
43 checks passed

tianmu-li mentioned this pull request Dec 3, 2025

Resolve issue with async scheduling when decode and prompt tokens are mixed #678

Merged

mgawarkiewicz-intel pushed a commit that referenced this pull request Dec 4, 2025

Resolve issue with async scheduling when decode and prompt tokens are…

6fc04ba

… mixed (#678) Cherrypick of #642 Signed-off-by: Tianmu Li <tianmu.li@intel.com>
