Commit b4c8be5
[compile] Add fallback path to AOT compile when serialization fails.
Summary:
Fixing issue #27348
With dynamo caching, compilation can succeed while the serialization
step fails. In that case, the serialization failure shouldn't prevent
the user from getting correct compilation results. Therefore we handle
the serialization error and only emit a warning when saving the model
fails. When saving fails, the vLLM model runner should be able to fall
back to the old path. In the next process, it will fail to load the
dynamo cache but will still fall back to retracing with dynamo and
loading the inductor cache, which is the same behavior as with AOT
compile turned off.
This is mostly a short-term fix. In the long term we should resolve
the serialization bugs by eliminating pickling of graph modules;
once #25205 is merged, we should be able to resolve the issue at a
lower level.
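
The fallback described above can be illustrated with a minimal sketch. It is not the code from this commit: `compile_with_save_fallback`, `compile_fn`, and `save_fn` are hypothetical names, and `pickle` merely stands in for whatever serialization the real path uses. The sketch only shows the pattern of downgrading a save failure to a warning while still returning the compiled result.

```python
import logging
import pickle
from typing import Any, Callable, Tuple

logger = logging.getLogger(__name__)


def compile_with_save_fallback(
    compile_fn: Callable[[], Any],
    save_fn: Callable[[Any], None],
) -> Tuple[Any, bool]:
    """Compile, then try to save; a save failure only produces a warning.

    Both callables are hypothetical stand-ins for the real compile and
    serialization steps. Returns the compiled object and whether it was saved.
    """
    # Compilation errors are NOT swallowed: only the save step is guarded.
    compiled = compile_fn()
    try:
        save_fn(compiled)
        saved = True
    except Exception as exc:
        # Serialization failed; keep the compilation result and warn.
        logger.warning("Saving compiled artifact failed, continuing: %s", exc)
        saved = False
    return compiled, saved


def pickle_save(obj: Any) -> None:
    # Stand-in serializer; pickling a lambda fails, which triggers the fallback.
    pickle.dumps(obj)


# Usage: the lambda cannot be pickled, yet the caller still gets a usable result.
fn, was_saved = compile_with_save_fallback(lambda: (lambda x: x + 1), pickle_save)
```

Guarding only the save step keeps genuine compilation errors visible, while a later process that finds no saved artifact simply takes the slower retracing path.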
Test Plan:
pytest tests/lora/test_quant_model.py
Reviewers:
Subscribers:
Tasks:
Tags:
Signed-off-by: zhxchen17 <zhxchen17@fb.com>
1 parent 4d0f266 commit b4c8be5
1 file changed, +11 -2 lines changed
Diff hunk (content not captured): original lines 405-406 removed, new lines 405-415 added.