Commit fb410ee
committed
feat: implement chunked prefill and prefix cache for Qwen3 MoE.
1 parent 25e16fa commit fb410ee
File tree
3 files changed
+41
-11
lines changed- xllm
- core/layers/npu
- models/llm
3 files changed
+41
-11
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
32 | 32 | | |
33 | 33 | | |
34 | 34 | | |
35 | | - | |
| 35 | + | |
36 | 36 | | |
37 | 37 | | |
38 | 38 | | |
39 | 39 | | |
40 | 40 | | |
41 | 41 | | |
42 | | - | |
| 42 | + | |
43 | 43 | | |
44 | 44 | | |
45 | 45 | | |
46 | 46 | | |
47 | 47 | | |
48 | | - | |
| 48 | + | |
49 | 49 | | |
50 | 50 | | |
51 | 51 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
321 | 321 | | |
322 | 322 | | |
323 | 323 | | |
| 324 | + | |
| 325 | + | |
324 | 326 | | |
325 | 327 | | |
326 | 328 | | |
| |||
894 | 896 | | |
895 | 897 | | |
896 | 898 | | |
897 | | - | |
| 899 | + | |
| 900 | + | |
| 901 | + | |
898 | 902 | | |
899 | 903 | | |
900 | 904 | | |
| |||
997 | 1001 | | |
998 | 1002 | | |
999 | 1003 | | |
| 1004 | + | |
| 1005 | + | |
| 1006 | + | |
| 1007 | + | |
| 1008 | + | |
| 1009 | + | |
| 1010 | + | |
| 1011 | + | |
1000 | 1012 | | |
1001 | 1013 | | |
1002 | 1014 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
154 | 154 | | |
155 | 155 | | |
156 | 156 | | |
157 | | - | |
158 | 157 | | |
159 | 158 | | |
160 | | - | |
| 159 | + | |
161 | 160 | | |
162 | 161 | | |
163 | 162 | | |
| |||
251 | 250 | | |
252 | 251 | | |
253 | 252 | | |
254 | | - | |
255 | | - | |
256 | | - | |
257 | | - | |
258 | | - | |
| 253 | + | |
| 254 | + | |
| 255 | + | |
| 256 | + | |
| 257 | + | |
| 258 | + | |
| 259 | + | |
| 260 | + | |
| 261 | + | |
| 262 | + | |
| 263 | + | |
| 264 | + | |
| 265 | + | |
| 266 | + | |
| 267 | + | |
| 268 | + | |
| 269 | + | |
| 270 | + | |
| 271 | + | |
| 272 | + | |
| 273 | + | |
| 274 | + | |
| 275 | + | |
| 276 | + | |
259 | 277 | | |
260 | 278 | | |
261 | 279 | | |
| |||
0 commit comments