Commit e20653b
committed
feat: implement chunked prefill and prefix cache for Qwen3 MoE.
1 parent 45f8a0a commit e20653b
File tree
3 files changed
+41
-11
lines changed- xllm
- core/layers/npu
- models/llm
3 files changed
+41
-11
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
32 | 32 | | |
33 | 33 | | |
34 | 34 | | |
35 | | - | |
| 35 | + | |
36 | 36 | | |
37 | 37 | | |
38 | 38 | | |
39 | 39 | | |
40 | 40 | | |
41 | 41 | | |
42 | | - | |
| 42 | + | |
43 | 43 | | |
44 | 44 | | |
45 | 45 | | |
46 | 46 | | |
47 | 47 | | |
48 | | - | |
| 48 | + | |
49 | 49 | | |
50 | 50 | | |
51 | 51 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
317 | 317 | | |
318 | 318 | | |
319 | 319 | | |
| 320 | + | |
| 321 | + | |
320 | 322 | | |
321 | 323 | | |
322 | 324 | | |
| |||
865 | 867 | | |
866 | 868 | | |
867 | 869 | | |
868 | | - | |
| 870 | + | |
| 871 | + | |
| 872 | + | |
869 | 873 | | |
870 | 874 | | |
871 | 875 | | |
| |||
968 | 972 | | |
969 | 973 | | |
970 | 974 | | |
| 975 | + | |
| 976 | + | |
| 977 | + | |
| 978 | + | |
| 979 | + | |
| 980 | + | |
| 981 | + | |
| 982 | + | |
971 | 983 | | |
972 | 984 | | |
973 | 985 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
154 | 154 | | |
155 | 155 | | |
156 | 156 | | |
157 | | - | |
158 | 157 | | |
159 | 158 | | |
160 | | - | |
| 159 | + | |
161 | 160 | | |
162 | 161 | | |
163 | 162 | | |
| |||
248 | 247 | | |
249 | 248 | | |
250 | 249 | | |
251 | | - | |
252 | | - | |
253 | | - | |
254 | | - | |
255 | | - | |
| 250 | + | |
| 251 | + | |
| 252 | + | |
| 253 | + | |
| 254 | + | |
| 255 | + | |
| 256 | + | |
| 257 | + | |
| 258 | + | |
| 259 | + | |
| 260 | + | |
| 261 | + | |
| 262 | + | |
| 263 | + | |
| 264 | + | |
| 265 | + | |
| 266 | + | |
| 267 | + | |
| 268 | + | |
| 269 | + | |
| 270 | + | |
| 271 | + | |
| 272 | + | |
| 273 | + | |
256 | 274 | | |
257 | 275 | | |
258 | 276 | | |
| |||
0 commit comments