Add option to use unbacked and backed size oblivious dynamic shapes for more sound compilation. #26199
Conversation
This pull request has merge conflicts that must be resolved before it can be merged.
Code Review
This pull request introduces a significant and valuable refactoring of the torch.compile integration, particularly around handling dynamic shapes. The new TorchCompileGuardsStripWrapper and the introduction of shape invariants provide a much cleaner and more robust approach to compilation. The changes are well-structured and the new tests are a great addition.
I've found a few issues that need to be addressed: a critical bug in the new DynamicShapesConfig that would cause a runtime error, a minor bug in hash computation, and an opportunity to strengthen the new dynamic shapes test.
hmellor
left a comment
The conflicts are caused by our migration to ruff. Please see https://vllm-dev.slack.com/archives/C07R5Q1Q2BB/p1759663228844749 which contains detailed instructions to make updating your branch as painless as possible.
@laithsakka removed ready, ping when you want CI back on
docs/design/debug_vllm_compile.md
Outdated
These modes are stricter and reduce or eliminate guarding, which can help isolate issues:
There is some confusion between dynamic shape guards and Dynamo guards in general. There is also some confusion because vLLM drops all Dynamo guards -- why do these guards still matter?
ok let me try to phrase it differently, any specific phrasing you'd like there?
@ProExpertProg
Sorry I thought you were still pushing, but also the CI ran already anyway |
Just addressed Richard's comment on the debug section. @zou3519 looking good?
Force-pushed from 898d8b0 to 0e849f0
2. wrap the branching logic into a custom operator. TorchDynamo does not trace into custom operators.

## Debugging constraint violations and dynamic shapes guards issues
this is a bit of a repeat of the previous section. Let's resolve it in a future PR, figuring out what to write is a bit annoying
ah i see
## Debugging Dynamic Shape full graph capture
we probably should merge the two i agree
@ProExpertProg the failing job is a timeout in building docs, is there a way to retry?
Signed-off-by: Laith Sakka <lsakka@meta.com>
dummy update to force running all tests
- BACKED: Default PyTorch behavior with potential guards ignored.
- UNBACKED: No guards guaranteed (most sound) but may throw data dependent errors.
- BACKED_SIZE_OBLIVIOUS: Experimental safer alternative to backed/unbacked.
Tiny nit to get the help text to render nicely. Should be done in a follow-up to save CI.
Suggested change: append `\n` to the first two bullets so each renders on its own line:

- BACKED: Default PyTorch behavior with potential guards ignored.\n
- UNBACKED: No guards guaranteed (most sound) but may throw data dependent errors.\n
- BACKED_SIZE_OBLIVIOUS: Experimental safer alternative to backed/unbacked.
Docs failure/timeout was likely related to the Python docs (which our docs reference) being down. It appears to have resolved itself.
…re sounds compilation. (vllm-project#26199) Signed-off-by: Laith Sakka <lsakka@meta.com>
…re sounds compilation. (vllm-project#26199) Signed-off-by: Laith Sakka <lsakka@meta.com> Signed-off-by: Runkai Tao <rt572@physics.rutgers.edu>
Purpose
Dynamic shape guards are dropped unsoundly in vLLM; they can be added during both Dynamo and Inductor compilation. The proper way to compile a sound graph is to use unbacked dynamic shapes, or maybe "backed size oblivious" (see the note in the diff content). This PR adds a config that allows choosing between backed, unbacked, and backed_size_oblivious, as explained in the comment in the content of the PR.
When using unbacked, users might want to provide invariants about the shapes of the model: a lambda argument is added to the support_torch_compile decorator, where users can provide a lambda that asserts invariants on the model input shapes. Those are needed to avoid data-dependent errors (DDE) and to be able to trace the model with unbacked shapes.
Example: see the Qwen example in the code.
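To make the invariant-lambda idea concrete, here is a minimal stand-in sketch. The decorator below is NOT vLLM's real `support_torch_compile` (whose signature may differ); it only mimics the mechanism of running a user-provided shape check before the forward pass:

```python
def support_torch_compile(shape_invariants=None):
    """Stand-in decorator illustrating the shape-invariants hook."""
    def wrap(cls):
        orig_forward = cls.forward

        def forward(self, x):
            if shape_invariants is not None:
                # Run the user-supplied invariant check before tracing/forward,
                # so unbacked symbolic shapes would not hit data-dependent errors.
                shape_invariants(x)
            return orig_forward(self, x)

        cls.forward = forward
        return cls

    return wrap


def check_invariants(x):
    # Hypothetical invariant: input length is a multiple of 64.
    assert len(x) % 64 == 0, "shape invariant violated"


@support_torch_compile(shape_invariants=check_invariants)
class ToyModel:
    def forward(self, x):
        return [v * 2 for v in x]


out = ToyModel().forward(list(range(64)))
print(len(out))  # 64
```

An input violating the invariant (e.g. a list of length 1) would fail the assertion before reaching the model body.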
Test Plan
Added a unit test. Only works for torch 2.10+.
Perf testing
command
Qwen/Qwen2-1.5B-Instruct
Will look into some of those perf issues in the future, but the long-term vision with pre_compile is that this will be a fallback mode; results are anyway still better than eager.
Configurations compared:
- backed
- backed size oblivious
- unbacked
- eager