[Misc] Make SchedulerConfig.max_model_len init-only
#28733
Conversation
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Code Review
This pull request correctly refactors SchedulerConfig to make max_model_len an init-only variable, which is a good improvement for code clarity and consistency. The related changes across the codebase to use model_config.max_model_len are also correct. However, I've identified a critical issue in vllm/config/scheduler.py where a piece of logic was not updated to reflect this change, which will lead to a runtime error. Please see the detailed comment.
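For orientation, here is a minimal, hypothetical sketch of the "init-only" pattern the review describes. The class bodies, field names, and defaults below are illustrative assumptions, not the actual vLLM `ModelConfig`/`SchedulerConfig`; the sketch only shows how `dataclasses.InitVar` lets a value be consumed during construction without being stored on the instance, so later readers must go through `model_config.max_model_len`:

```python
# Hypothetical sketch only -- names and defaults are assumptions, not the
# actual vLLM classes. It demonstrates the dataclasses.InitVar pattern behind
# an "init-only" field: the value is accepted by __init__ and usable in
# __post_init__, but never stored on the instance.
from dataclasses import InitVar, dataclass, field


@dataclass
class ModelConfig:
    # After the refactor, this is the single source of truth for the value.
    max_model_len: int = 8192


@dataclass
class SchedulerConfig:
    max_num_seqs: int = 128
    # Init-only: consumed during construction, then discarded.
    max_model_len: InitVar[int] = 8192
    max_num_batched_tokens: int = field(init=False)

    def __post_init__(self, max_model_len: int) -> None:
        # Use the init-only value to derive other settings, then drop it.
        self.max_num_batched_tokens = max(max_model_len, 2048)


model_config = ModelConfig(max_model_len=4096)
scheduler_config = SchedulerConfig(max_model_len=model_config.max_model_len)

print(scheduler_config.max_num_batched_tokens)  # 4096
print(model_config.max_model_len)               # 4096
# scheduler_config.max_model_len would raise AttributeError: the field is
# init-only, so call sites read model_config.max_model_len instead.
```

Under this pattern, any leftover read of the scheduler-side attribute (such as the stale logic flagged in vllm/config/scheduler.py) would surface as exactly that kind of AttributeError at runtime.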
This pull request has merge conflicts that must be resolved before it can be merged.
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
…8733) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: George D. Torres <gdavtor@gmail.com>
…8733) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: Bram Wasti <bwasti@meta.com>
Bump vLLM version to v0.11.2

What's broken and changed by vLLM:
1. structured_output is broken by vllm-project/vllm#26866
2. get_mrope_input_positions is broken by vllm-project/vllm#28399
3. graph mode is broken by vllm-project/vllm#25110; we'll upgrade torch to 2.8 to fix the problem later
4. embedding is broken by vllm-project/vllm#27583
5. `get_attn_backend_cls` and the attention backend are broken by vllm-project/vllm#28534
6. spec decode is broken by vllm-project/vllm#28771
7. sp feature is broken by vllm-project/vllm#27126
8. mtp is broken by vllm-project/vllm#27922
9. lora is broken by vllm-project/vllm#21068
10. execute_model is broken by vllm-project/vllm#26866
11. `VLLM_DISABLE_SHARED_EXPERTS_STREAM` env is broken by vllm-project/vllm#28159
12. kv cache is broken by vllm-project/vllm#27753
13. dp is broken by vllm-project/vllm#25110

What's broken and changed by ourselves:
1. qwen vl is broken by vllm-project/vllm#28455; we'll remove model files in the future to avoid this kind of error
2. Engine core is broken by vllm-project/vllm#23691; we'll remove the patch file in the future
3. Ascend scheduler is broken by vllm-project/vllm#28733; we'll remove the Ascend scheduler later
4. qwen3-next is broken by vllm-project/vllm#28083; we'll remove model files in the future to avoid this kind of error
5. qwen vl is broken by vllm-project/vllm#27764; we'll remove model files in the future

Known issues:
1. ray doesn't work
2. the accuracy of qwen3-next is not correct
3. qwen3-vl is broken
4. prefix cache + Ascend scheduler + deepseek v2 lite is broken

- vLLM version: v0.11.2

Co-authored-by: MengqingCao <cmq0113@163.com>
Co-authored-by: hfadzxy <starmoon_zhang@163.com>
Co-authored-by: leo-pony <nengjunma@outlook.com>
Co-authored-by: 22dimensions <waitingwind@foxmail.com>
Co-authored-by: shen-shanshan <467638484@qq.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: hfadzxy <starmoon_zhang@163.com>
Signed-off-by: leo-pony <nengjunma@outlook.com>
…8733) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Purpose
Follow-up to #28665 (comment)
@bnellnm said it's ok to remove the `max_model_len` setting from the MoE tests.
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.