Commit 8b03c85
Speedup online convert (#1772)
Porting the left part of #1505
- GLM-4.5-Air-FP8( ~100B), 600s -> 60.50 s
- For DS R1(~600B), it requires about 430s w/ that fix
@czhu15 @yangulei Please help to review, thanks!
cc @thuang6
---------
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Co-authored-by: ranzhejiang <zhejiang.ran@intel.com>
Co-authored-by: Yi <yi4.liu4@intel.com>1 parent 0cd2bc6 commit 8b03c85
File tree
7 files changed
+86
-85
lines changed- vllm/model_executor
- layers/quantization
- compressed_tensors
- model_loader
- models
7 files changed
+86
-85
lines changedLines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
33 | 33 | | |
34 | 34 | | |
35 | 35 | | |
36 | | - | |
37 | 36 | | |
| 37 | + | |
38 | 38 | | |
39 | 39 | | |
40 | 40 | | |
| |||
Lines changed: 3 additions & 3 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
29 | 29 | | |
30 | 30 | | |
31 | 31 | | |
32 | | - | |
| 32 | + | |
| 33 | + | |
33 | 34 | | |
34 | 35 | | |
35 | 36 | | |
| |||
146 | 147 | | |
147 | 148 | | |
148 | 149 | | |
149 | | - | |
150 | | - | |
| 150 | + | |
151 | 151 | | |
152 | 152 | | |
153 | 153 | | |
| |||
Lines changed: 0 additions & 21 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
4 | 4 | | |
5 | 5 | | |
6 | 6 | | |
7 | | - | |
8 | 7 | | |
9 | 8 | | |
10 | 9 | | |
| |||
213 | 212 | | |
214 | 213 | | |
215 | 214 | | |
216 | | - | |
217 | | - | |
218 | | - | |
219 | | - | |
220 | | - | |
221 | | - | |
222 | | - | |
223 | | - | |
224 | | - | |
225 | | - | |
226 | | - | |
227 | | - | |
228 | | - | |
229 | | - | |
230 | | - | |
231 | | - | |
232 | | - | |
233 | | - | |
234 | | - | |
235 | | - | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
39 | 39 | | |
40 | 40 | | |
41 | 41 | | |
42 | | - | |
| 42 | + | |
43 | 43 | | |
44 | 44 | | |
45 | 45 | | |
| |||
228 | 228 | | |
229 | 229 | | |
230 | 230 | | |
231 | | - | |
| 231 | + | |
232 | 232 | | |
233 | 233 | | |
234 | 234 | | |
| |||
312 | 312 | | |
313 | 313 | | |
314 | 314 | | |
315 | | - | |
316 | | - | |
317 | | - | |
318 | | - | |
319 | | - | |
320 | | - | |
321 | | - | |
322 | | - | |
323 | | - | |
324 | | - | |
325 | | - | |
326 | | - | |
327 | | - | |
328 | | - | |
329 | | - | |
330 | | - | |
331 | | - | |
332 | | - | |
333 | | - | |
334 | 315 | | |
335 | 316 | | |
336 | 317 | | |
| |||
541 | 522 | | |
542 | 523 | | |
543 | 524 | | |
544 | | - | |
| 525 | + | |
545 | 526 | | |
546 | 527 | | |
547 | 528 | | |
| |||
662 | 643 | | |
663 | 644 | | |
664 | 645 | | |
665 | | - | |
666 | | - | |
667 | | - | |
668 | | - | |
669 | | - | |
670 | | - | |
671 | | - | |
672 | | - | |
673 | | - | |
674 | | - | |
675 | | - | |
676 | | - | |
677 | | - | |
678 | | - | |
679 | | - | |
680 | | - | |
681 | | - | |
682 | | - | |
683 | 646 | | |
684 | 647 | | |
685 | 648 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
12 | 12 | | |
13 | 13 | | |
14 | 14 | | |
15 | | - | |
| 15 | + | |
16 | 16 | | |
17 | 17 | | |
18 | 18 | | |
| |||
29 | 29 | | |
30 | 30 | | |
31 | 31 | | |
| 32 | + | |
32 | 33 | | |
33 | 34 | | |
34 | 35 | | |
| |||
788 | 789 | | |
789 | 790 | | |
790 | 791 | | |
| 792 | + | |
| 793 | + | |
| 794 | + | |
| 795 | + | |
| 796 | + | |
| 797 | + | |
| 798 | + | |
| 799 | + | |
| 800 | + | |
| 801 | + | |
| 802 | + | |
| 803 | + | |
| 804 | + | |
| 805 | + | |
| 806 | + | |
| 807 | + | |
| 808 | + | |
| 809 | + | |
| 810 | + | |
| 811 | + | |
| 812 | + | |
| 813 | + | |
| 814 | + | |
| 815 | + | |
| 816 | + | |
| 817 | + | |
| 818 | + | |
| 819 | + | |
| 820 | + | |
| 821 | + | |
| 822 | + | |
| 823 | + | |
| 824 | + | |
| 825 | + | |
| 826 | + | |
| 827 | + | |
| 828 | + | |
| 829 | + | |
| 830 | + | |
| 831 | + | |
| 832 | + | |
| 833 | + | |
| 834 | + | |
| 835 | + | |
| 836 | + | |
| 837 | + | |
| 838 | + | |
| 839 | + | |
| 840 | + | |
| 841 | + | |
| 842 | + | |
| 843 | + | |
| 844 | + | |
| 845 | + | |
| 846 | + | |
| 847 | + | |
| 848 | + | |
| 849 | + | |
| 850 | + | |
| 851 | + | |
| 852 | + | |
| 853 | + | |
| 854 | + | |
| 855 | + | |
| 856 | + | |
| 857 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
47 | 47 | | |
48 | 48 | | |
49 | 49 | | |
50 | | - | |
| 50 | + | |
51 | 51 | | |
52 | 52 | | |
53 | 53 | | |
| |||
781 | 781 | | |
782 | 782 | | |
783 | 783 | | |
| 784 | + | |
784 | 785 | | |
785 | 786 | | |
786 | 787 | | |
| |||
796 | 797 | | |
797 | 798 | | |
798 | 799 | | |
799 | | - | |
800 | | - | |
801 | | - | |
802 | | - | |
803 | | - | |
804 | | - | |
805 | 800 | | |
806 | 801 | | |
807 | 802 | | |
| |||
873 | 868 | | |
874 | 869 | | |
875 | 870 | | |
876 | | - | |
877 | | - | |
878 | | - | |
879 | | - | |
880 | 871 | | |
881 | 872 | | |
882 | 873 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
32 | 32 | | |
33 | 33 | | |
34 | 34 | | |
35 | | - | |
| 35 | + | |
36 | 36 | | |
37 | 37 | | |
38 | 38 | | |
| |||
49 | 49 | | |
50 | 50 | | |
51 | 51 | | |
52 | | - | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
53 | 56 | | |
54 | 57 | | |
55 | 58 | | |
| |||
130 | 133 | | |
131 | 134 | | |
132 | 135 | | |
133 | | - | |
134 | | - | |
135 | 136 | | |
136 | 137 | | |
137 | | - | |
| 138 | + | |
138 | 139 | | |
139 | 140 | | |
140 | 141 | | |
| |||
161 | 162 | | |
162 | 163 | | |
163 | 164 | | |
164 | | - | |
| 165 | + | |
165 | 166 | | |
166 | 167 | | |
167 | | - | |
168 | 168 | | |
169 | 169 | | |
170 | 170 | | |
| |||
386 | 386 | | |
387 | 387 | | |
388 | 388 | | |
389 | | - | |
| 389 | + | |
390 | 390 | | |
391 | 391 | | |
392 | 392 | | |
| |||
673 | 673 | | |
674 | 674 | | |
675 | 675 | | |
| 676 | + | |
676 | 677 | | |
677 | 678 | | |
678 | 679 | | |
| |||
0 commit comments