Commit 51a0b9d
IPEX support FP8 kvcache/softcap/slidingwindow (#3144)
* IPEX support FP8 kvcache
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* add kvcache dtype
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* add softcap and slidingwindow
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* kv scale in pageattn
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* remove triton installation, will be installed with torch
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* install xelink lib
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* softcap default -1.0
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* softcap default -1.0
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
---------
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
1 parent f208ba6, commit 51a0b9d
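
Note on "softcap default -1.0": the diff body is not reproduced on this page, so the following is only a minimal sketch of what logit soft-capping usually means, not the code from this commit. It assumes the common `softcap * tanh(score / softcap)` formulation and assumes that a non-positive value such as -1.0 acts as a "disabled" sentinel. The names `apply_softcap` and `SOFTCAP_DISABLED` are hypothetical and do not come from IPEX or text-generation-server.

```python
import math

# Assumption: a non-positive cap (e.g. -1.0) means "soft-capping disabled".
SOFTCAP_DISABLED = -1.0


def apply_softcap(scores, softcap=SOFTCAP_DISABLED):
    """Squash attention scores into (-softcap, softcap) using tanh.

    Hypothetical helper for illustration; returns a new list and
    leaves the input untouched.
    """
    if softcap <= 0.0:
        # Sentinel value: pass the scores through unchanged.
        return list(scores)
    return [softcap * math.tanh(s / softcap) for s in scores]


if __name__ == "__main__":
    raw = [100.0, -100.0, 0.5]
    print(apply_softcap(raw))                # default: scores unchanged
    print(apply_softcap(raw, softcap=30.0))  # large scores pulled inside +/-30
```

With a float sentinel the kernel call site can always pass a number instead of an optional value, which is one plausible reason for choosing -1.0 as the default.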
File tree
3 files changed (+87, -20)
- server/text_generation_server/layers/attention
Lines changed: 37 additions & 6 deletions