
Conversation

@laithsakka (Contributor) commented Oct 3, 2025

Purpose

Dynamic shape guards are dropped unsoundly in vLLM; such guards can be added during both Dynamo and Inductor compilation. The proper way to compile a sound graph is to use unbacked dynamic shapes, or possibly "backed size oblivious" (see the note in the diff content). This PR adds a config that allows choosing between backed, unbacked, and backed_size_oblivious, as explained in the comment in the PR content.

When using unbacked, users might want to provide invariants about the model's input shapes. A lambda argument
is added to the support_torch_compile decorator where users can provide a function that asserts invariants
on the model input shapes. These are needed to avoid data-dependent errors (DDE) and to be able to trace the model with unbacked shapes.
Example:


```python
def llama_model_invariants(input_ids,
                           positions,
                           intermediate_tensors=None,
                           inputs_embeds=None):
    """Shape invariants for Llama model compilation."""
    if input_ids is not None:
        # positions and input_ids always share the same leading
        # (num_tokens) dimension.
        torch._check(positions.size()[0] == input_ids.size()[0])


@support_torch_compile(shape_invariants=llama_model_invariants)
..
```
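
Inside the invariants function, torch._check records the asserted relation in the symbolic shape environment, so later tracing steps that depend on it (here, that positions and input_ids share their leading dimension) can be resolved without raising a data-dependent error.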

See the Qwen example in the code as well.

Test Plan

Added a unit test. It only works on torch 2.10+.

Perf testing

Model: Qwen/Qwen2-1.5B-Instruct

I will look into some of these perf issues in the future, but the long-term vision with pre_compile is that this will be a fallback mode. The results are still better than eager anyway.

Command:

```bash
CUDA_VISIBLE_DEVICES=1 vllm bench throughput --model Qwen/Qwen2-1.5B-Instruct --input-len 512 --output-len 128 --num-prompts 100 --gpu-memory-utilization 0.8
```

| Mode | Requests/s | Total tokens/s | Output tokens/s |
|---|---|---|---|
| backed | 88.94 | 56738.83 | 11383.87 |
| backed size oblivious | 88.78 | n/a | n/a |
| unbacked | 82.98 | 51860.18 | 10405.04 |
| eager | 63.29 | 40376.23 | 8100.94 |

@mergify mergify bot added llama Related to Llama models qwen Related to Qwen models v1 tpu Related to Google TPUs labels Oct 3, 2025

mergify bot commented Oct 3, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @laithsakka.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Oct 3, 2025
@laithsakka changed the title from "Hu" to "Add option to use unbacked dynamic shapes for more sounds compilation." Oct 3, 2025
@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request introduces a significant and valuable refactoring of the torch.compile integration, particularly around handling dynamic shapes. The new TorchCompileGuardsStripWrapper and the introduction of shape invariants provide a much cleaner and more robust approach to compilation. The changes are well-structured and the new tests are a great addition.

I've found a few issues that need to be addressed: a critical bug in the new DynamicShapesConfig that would cause a runtime error, a minor bug in hash computation, and an opportunity to strengthen the new dynamic shapes test.

@laithsakka changed the title from "Add option to use unbacked dynamic shapes for more sounds compilation." to "Add option to use unbacked, and backed size obl dynamic shapes for more sounds compilation." Oct 3, 2025
@hmellor (Member) left a comment

The conflicts are caused by our migration to ruff. Please see https://vllm-dev.slack.com/archives/C07R5Q1Q2BB/p1759663228844749 which contains detailed instructions to make updating your branch as painless as possible.

@ProExpertProg ProExpertProg removed the ready ONLY add when PR is ready to merge/full CI is needed label Nov 20, 2025
@ProExpertProg (Collaborator) commented

@laithsakka removed ready, ping when you want CI back on

Review thread on this documentation line:

> These modes are stricter and reduce or eliminate guarding, which can help isolate issues:
A Collaborator commented:

There is some confusion between dynamic shape guards and Dynamo guards in general. There is also some confusion because vLLM drops all Dynamo guards -- so why do these guards still matter?

The author replied:

OK, let me try phrasing it differently. Is there any specific phrasing you'd like there?

@laithsakka (Contributor) commented Nov 20, 2025

@ProExpertProg
Why was ready removed? I addressed all comments this morning.
I also kind of need the ready label to know whether things break.

@ProExpertProg (Collaborator) commented

Sorry, I thought you were still pushing. But the CI already ran anyway.

@laithsakka (Contributor) commented

Just addressed Richard's comment on the debug section. @zou3519 looking good?

@laithsakka laithsakka requested a review from zou3519 November 20, 2025 23:47
@laithsakka laithsakka force-pushed the hu branch 4 times, most recently from 898d8b0 to 0e849f0 Compare November 21, 2025 00:07
Review thread on this documentation passage (a sketch of the custom-operator technique follows below):

> 2. wrap the branching logic into a custom operator. TorchDynamo does not
>    trace into custom operators.
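
As an illustration of point 2, here is a minimal sketch (not code from this PR; the operator name mylib::masked_scale and its branch logic are hypothetical) of hiding a shape-dependent branch inside a custom operator so that tracing never sees it:

```python
import torch

@torch.library.custom_op("mylib::masked_scale", mutates_args=())
def masked_scale(x: torch.Tensor) -> torch.Tensor:
    # This shape-dependent branch would raise a data-dependent error if
    # traced with unbacked symbols; inside a custom op, Dynamo treats the
    # call as opaque and never traces the body.
    if x.shape[0] % 2 == 0:
        return x * 2.0
    return x * 0.5

@masked_scale.register_fake
def _(x: torch.Tensor) -> torch.Tensor:
    # Meta implementation: both branches produce the same shape and dtype,
    # so no symbolic-size branching is needed here.
    return torch.empty_like(x)
```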

> ## Debugging constraint violations and dynamic shapes guards issues
@zou3519 (Collaborator) commented Nov 21, 2025

This is a bit of a repeat of the previous section. Let's resolve it in a future PR; figuring out what to write is a bit annoying.

@laithsakka (Author) replied Nov 21, 2025

Ah, I see. I agree we should probably merge the two sections:

> ## Debugging Dynamic Shape full graph capture

@zou3519 zou3519 added the ready ONLY add when PR is ready to merge/full CI is needed label Nov 21, 2025
@laithsakka (Contributor) commented

@ProExpertProg the failing job is a timeout in building the docs; is there a way to retry?

@ProExpertProg (Collaborator) commented

@hmellor

Signed-off-by: Laith Sakka <lsakka@meta.com>
@laithsakka (Contributor) commented

Dummy update to force running all tests.

Comment on lines +223 to +227:

> - BACKED: Default PyTorch behavior with potential guards ignored.
> - UNBACKED: No guards guaranteed (most sound) but may throw
>   data dependent errors.
> - BACKED_SIZE_OBLIVIOUS: Experimental safer alternative to
>   backed/unbacked.
@hmellor (Member) commented Nov 24, 2025

Tiny nit to get the help text to render nicely. Should be done in a follow-up to save CI.

Suggested change (append `\n` so each bullet renders on its own line):

```diff
-- BACKED: Default PyTorch behavior with potential guards ignored.
+- BACKED: Default PyTorch behavior with potential guards ignored.\n
 - UNBACKED: No guards guaranteed (most sound) but may throw
-data dependent errors.
+data dependent errors.\n
 - BACKED_SIZE_OBLIVIOUS: Experimental safer alternative to
 backed/unbacked.
```


@hmellor (Member) commented Nov 24, 2025

Docs failure/timeout was likely related to the Python docs (which our docs reference) being down. It appears to have resolved itself.

@zou3519 zou3519 merged commit 7a228b5 into vllm-project:main Nov 24, 2025
55 checks passed
@github-project-automation github-project-automation bot moved this from In review to Done in torch.compile integration Nov 24, 2025
lpapavassiliou pushed a commit to lpapavassiliou/vllm that referenced this pull request Nov 24, 2025
…re sounds compilation. (vllm-project#26199)

Signed-off-by: Laith Sakka <lsakka@meta.com>
RunkaiTao pushed a commit to RunkaiTao/vllm that referenced this pull request Nov 24, 2025
…re sounds compilation. (vllm-project#26199)

Signed-off-by: Laith Sakka <lsakka@meta.com>
Signed-off-by: Runkai Tao <rt572@physics.rutgers.edu>
MatthewBonanni pushed a commit to MatthewBonanni/vllm that referenced this pull request Nov 24, 2025
…re sounds compilation. (vllm-project#26199)

Signed-off-by: Laith Sakka <lsakka@meta.com>
bringlein pushed a commit to bringlein/vllm that referenced this pull request Nov 26, 2025
…re sounds compilation. (vllm-project#26199)

Signed-off-by: Laith Sakka <lsakka@meta.com>

Labels: ci/build, documentation, llama, qwen, ready, torch.compile, v1

Projects: torch.compile integration (Status: Done)

6 participants