Skip to content

Conversation

@zhuohan123
Copy link
Member

@zhuohan123 zhuohan123 commented Nov 15, 2025

Purpose

The device argument for reset_prefix_cache is no longer being used in V1.

Test Plan

Existing tests should pass.


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: Zhuohan Li <zhuohan123@gmail.com>
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request aims to remove the unused device argument from the reset_prefix_cache method across various components. The changes in the modified files are consistent with this goal. However, there is a critical issue: the base class EngineClient in vllm/engine/protocol.py has not been updated. This will cause a signature mismatch for the reset_prefix_cache method in the AsyncLLM class, which inherits from EngineClient. This file needs to be included in the pull request with the necessary changes to ensure the code compiles and runs correctly.

Signed-off-by: Zhuohan Li <zhuohan123@gmail.com>
@zhuohan123
Copy link
Member Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request correctly removes the unused device argument from the reset_prefix_cache function across multiple files. The changes are consistent, simplifying the function signatures and removing unnecessary logic. This refactoring improves code clarity and maintainability. I noticed a minor discrepancy: the pull request title refers to reset_kv_cache, while the changes and description correctly target reset_prefix_cache. You may want to update the title for consistency. Overall, the changes are well-executed and I have no further comments.

@DarkLight1337 DarkLight1337 added the ready ONLY add when PR is ready to merge/full CI is needed label Nov 15, 2025
@zhuohan123 zhuohan123 enabled auto-merge (squash) November 15, 2025 07:22
@vllm-bot vllm-bot merged commit dd6ac1c into main Nov 15, 2025
52 of 53 checks passed
@vllm-bot vllm-bot deleted the zhuohan/remove-device-from-reset-kv-cache branch November 15, 2025 07:59
@DarkLight1337
Copy link
Member

Doc failure is unrelated - merging

geodavic pushed a commit to geodavic/vllm that referenced this pull request Nov 16, 2025
…ject#28766)

Signed-off-by: Zhuohan Li <zhuohan123@gmail.com>
Signed-off-by: George D. Torres <gdavtor@gmail.com>
bwasti pushed a commit to bwasti/vllm that referenced this pull request Nov 17, 2025
…ject#28766)

Signed-off-by: Zhuohan Li <zhuohan123@gmail.com>
Signed-off-by: Bram Wasti <bwasti@meta.com>
bringlein pushed a commit to bringlein/vllm that referenced this pull request Nov 26, 2025
devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025
kitaekatt pushed a commit to kitaekatt/vllm that referenced this pull request Dec 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

frontend ready ONLY add when PR is ready to merge/full CI is needed v1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants