Commit 463910e
[Bugfix] use module-level import for patched function in Qwen3Next (#4354)
### What this PR does / why we need it?
**Problem**: The Qwen3Next model implementation currently imports
`chunk_gated_delta_rule` directly using `from ... import ...`.
In frameworks like `verl`, the model file is often imported before
`vllm-ascend` initializes and applies its patches. This causes the model
to permanently hold a reference to the original (unpatched) vLLM kernel,
resulting in execution errors on Ascend devices even if the patch runs
later.
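
The early-binding behavior described above can be reproduced in isolation. The following is a minimal sketch, not the actual vLLM or vllm-ascend code: `fake_ops`, `_original_kernel`, and `_patched_kernel` are hypothetical stand-ins, and only the name `chunk_gated_delta_rule` comes from the PR.

```python
import sys
import types

# Hypothetical stand-in for the upstream ops module (not a real vLLM path).
ops = types.ModuleType("fake_ops")


def _original_kernel():
    return "original"


ops.chunk_gated_delta_rule = _original_kernel
sys.modules["fake_ops"] = ops

# The model file executes this at import time, before any patch is applied.
# It copies the *current* function object into the importing namespace.
from fake_ops import chunk_gated_delta_rule


def _patched_kernel():
    return "patched"


# A plugin later patches the attribute on the module object.
ops.chunk_gated_delta_rule = _patched_kernel

# The name bound by `from ... import ...` still points at the old function.
early_bound_result = chunk_gated_delta_rule()
# Looking the attribute up on the module sees the patch.
module_result = ops.chunk_gated_delta_rule()

print(early_bound_result, module_result)
```

Rebinding the module attribute does not touch names that were already copied out of the module, which is why the order of import and patching matters here.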
**Solution**: Changed the import style to `from vllm...ops import chunk`
and changed the call sites to `chunk.chunk_gated_delta_rule()`.
This ensures that the function lookup happens at runtime (dynamic
dispatch), allowing the model to correctly pick up the patched function
regardless of import order.
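
As a rough illustration of why the module-level import is insensitive to import order, here is a small self-contained sketch. The package `fake_pkg`, its submodule `chunk`, and the `forward` helper are hypothetical placeholders rather than the real vLLM layout; only `chunk_gated_delta_rule` is taken from the PR.

```python
import sys
import types

# Hypothetical package and submodule standing in for the real ops package.
pkg = types.ModuleType("fake_pkg")
pkg.__path__ = []  # mark the stand-in as a package
chunk = types.ModuleType("fake_pkg.chunk")
chunk.chunk_gated_delta_rule = lambda: "original"
pkg.chunk = chunk
sys.modules["fake_pkg"] = pkg
sys.modules["fake_pkg.chunk"] = chunk

# Module-level import: the model holds a reference to the module object,
# not to the function object.
from fake_pkg import chunk as chunk_mod


def forward():
    # The attribute lookup happens on every call, so a later patch is seen.
    return chunk_mod.chunk_gated_delta_rule()


before = forward()

# The patch is applied after the model file has already been imported.
chunk.chunk_gated_delta_rule = lambda: "patched"

after = forward()
print(before, after)
```

The cost is one extra attribute lookup per call, which is negligible next to the kernel itself.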
- vLLM version: v0.11.0
- vLLM main: vllm-project/vllm@2918c1b
Signed-off-by: zjchenn <zjchenn@gmail.com>

1 parent 941d54a commit 463910e
1 file changed: 6 additions, 8 deletions.
Changed lines (original file numbering): 19-20, 38-39, 255, 272, 283, 367.