Commit 3e4f164
[Attention] Attention head quantization strategy (#481)
* refactor
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
* reduce diff
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
* reduce diff
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
* increase num of required observed dims
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
* remove attention head
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
* add tests
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
* remove attn head
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
* simplify
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
* refactor
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
* reduce diff
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
* increase num of required observed dims
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
* add tests
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
* add tests for attn head
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
* add tests
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
* reduce diff
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
* fix shapes
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
* fix shapes
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
* revert
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
---------
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
1 parent 36c6fe1 commit 3e4f164
File tree
6 files changed
+79 -18 lines changed
- src/compressed_tensors/quantization
- lifecycle
- tests
- test_quantization/lifecycle
[Diff bodies omitted: only the line-number columns survived extraction. Per-file totals are shown for two of the six files: 9 additions & 2 deletions; 50 additions & 13 deletions.]