Commit 2f08aa3
[None][feat] AutoDeploy: Perf improvement for mamba layers (NVIDIA#8991)
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
Co-authored-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>1 parent 68e98ed commit 2f08aa3
File tree
2 files changed
+7
-13
lines changed- tensorrt_llm/_torch/auto_deploy/custom_ops/mamba
2 files changed
+7
-13
lines changedLines changed: 5 additions & 7 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
204 | 204 | | |
205 | 205 | | |
206 | 206 | | |
207 | | - | |
| 207 | + | |
| 208 | + | |
| 209 | + | |
208 | 210 | | |
209 | 211 | | |
210 | 212 | | |
| |||
296 | 298 | | |
297 | 299 | | |
298 | 300 | | |
299 | | - | |
300 | | - | |
301 | | - | |
302 | | - | |
303 | | - | |
304 | | - | |
| 301 | + | |
| 302 | + | |
Lines changed: 2 additions & 6 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
355 | 355 | | |
356 | 356 | | |
357 | 357 | | |
358 | | - | |
359 | | - | |
360 | | - | |
361 | | - | |
362 | | - | |
363 | | - | |
| 358 | + | |
| 359 | + | |
0 commit comments