【fix】: add attn_groups judge #4590
Conversation
Code Review
This pull request attempts to prevent a crash when initialize_kv_cache is called multiple times by adding a guard around the initialize_attn_backend call. While this fixes the immediate crash, the fix is incomplete and introduces a risk of state corruption because other parts of the method are not idempotent. My review includes a suggestion for a more robust fix that makes the entire method idempotent, ensuring correct behavior on subsequent calls.
```diff
 self.kv_cache_config = kv_cache_config
 self.may_add_encoder_only_layers_to_kv_cache_config()
 # NOTE(cmq): initialize_attn_backend must before using self.attn_groups
-self.initialize_attn_backend(kv_cache_config)
+if not self.attn_groups:
+    self.initialize_attn_backend(kv_cache_config)
```
This change prevents a crash if initialize_kv_cache is called multiple times, but it is an incomplete fix and can lead to subtle bugs. The function may_add_encoder_only_layers_to_kv_cache_config() is still called unconditionally on each invocation. That function appends to self.kv_cache_config.kv_cache_groups, so calling initialize_kv_cache more than once produces duplicate entries. Subsequent parts of this method, such as may_reinitialize_input_batch and initialize_kv_cache_tensors, will then operate on an inconsistent state (a modified kv_cache_config paired with stale attn_groups), which is likely to cause subtle runtime bugs.
A more robust solution is to make the entire method idempotent by checking for initialization at the beginning of the function. This ensures that if the method is called multiple times, it has no effect after the first successful initialization, preventing state corruption.
```diff
-self.kv_cache_config = kv_cache_config
-self.may_add_encoder_only_layers_to_kv_cache_config()
-# NOTE(cmq): initialize_attn_backend must before using self.attn_groups
-if not self.attn_groups:
-    self.initialize_attn_backend(kv_cache_config)
+if self.attn_groups:
+    return
+self.kv_cache_config = kv_cache_config
+self.may_add_encoder_only_layers_to_kv_cache_config()
+# NOTE(cmq): initialize_attn_backend must before using self.attn_groups
+self.initialize_attn_backend(kv_cache_config)
```
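The difference between the two guards can be demonstrated in isolation. The sketch below is a minimal, hypothetical model of the worker: the method names mirror the review discussion, but the bodies are stand-ins, not the real vLLM Ascend implementation. It contrasts the PR's inner guard (which only protects initialize_attn_backend) with the suggested early return (which makes the whole method idempotent).

```python
class Worker:
    """Hypothetical minimal model of a worker whose KV-cache init is called twice."""

    def __init__(self, idempotent):
        self.idempotent = idempotent   # True: use the suggested early-return guard
        self.attn_groups = []
        self.kv_cache_config = None
        self.kv_cache_groups = []      # stand-in for kv_cache_config.kv_cache_groups

    def may_add_encoder_only_layers_to_kv_cache_config(self):
        # The real method appends to kv_cache_config.kv_cache_groups;
        # running it twice therefore produces duplicate entries.
        self.kv_cache_groups.append("encoder_only_group")

    def initialize_attn_backend(self, kv_cache_config):
        self.attn_groups = ["attn_group_0"]

    def initialize_kv_cache(self, kv_cache_config):
        if self.idempotent and self.attn_groups:
            return  # suggested fix: bail out before any side effects
        self.kv_cache_config = kv_cache_config
        self.may_add_encoder_only_layers_to_kv_cache_config()
        # PR's original fix: guard only the attn backend initialization
        if not self.attn_groups:
            self.initialize_attn_backend(kv_cache_config)


partial = Worker(idempotent=False)  # inner guard only
full = Worker(idempotent=True)      # early-return guard
for w in (partial, full):
    w.initialize_kv_cache({})
    w.initialize_kv_cache({})  # simulate a second initialization call

print(partial.kv_cache_groups)  # duplicate entries after the second call
print(full.kv_cache_groups)     # side effects happened exactly once
```

With only the inner guard, the second call avoids the crash but still duplicates the encoder-only group; with the early return, the second call is a complete no-op.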
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run the linting and testing checks locally according to the Contributing and Testing guides.
Signed-off-by: wucong25 <wucong25@huawei.com>
Force-pushed from 682865e to 381f0f4
This PR is not common in RL scenarios.
What this PR does / why we need it?
Does this PR introduce any user-facing change?
How was this patch tested?