Commit f201d10

add TODO.

Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>

committed · 1 parent d87042d · commit f201d10

File tree: 1 file changed (+1 addition, -0 deletions)

  • tensorrt_llm/_torch/attention_backend/sparse/dsa.py


tensorrt_llm/_torch/attention_backend/sparse/dsa.py

Lines changed: 1 addition & 0 deletions
@@ -432,6 -432,7 @@ def __post_init__(self):
             dtype=torch.int32,
             capture_graph=capture_graph,
         )
+        # TODO: remove these expanded buffers when fp8_paged_mqa_logits supports MTP > 1.
         self.kv_lens_expanded_cuda = self.get_empty(
             self.cuda_graph_buffers,
             (self.max_num_sequences * (1 + self.max_draft_tokens), ),
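Context for the annotated buffer: the TODO indicates that kv_lens_expanded_cuda exists only because fp8_paged_mqa_logits cannot yet handle MTP > 1, i.e. more than one query token per sequence. Below is a minimal illustrative sketch, using made-up sizes and plain PyTorch rather than the backend's get_empty helper, of why such a buffer would be sized max_num_sequences * (1 + max_draft_tokens); the real fill logic in dsa.py may differ.

    import torch

    # Hypothetical sizes, not taken from the repository.
    max_num_sequences = 4   # batch capacity
    max_draft_tokens = 2    # MTP draft depth

    # One kv length per request, as a non-expanded buffer would hold it.
    kv_lens = torch.tensor([17, 33, 8, 25], dtype=torch.int32)

    # Expanded layout: one entry per (sequence, token slot), flattened, so a
    # kernel that only accepts a single query token per row can index every
    # slot (the base token plus each draft token) uniformly.
    kv_lens_expanded = kv_lens.repeat_interleave(1 + max_draft_tokens)

    # Matches the shape allocated in the diff:
    # (max_num_sequences * (1 + max_draft_tokens), )
    assert kv_lens_expanded.shape == (max_num_sequences * (1 + max_draft_tokens),)
    print(kv_lens_expanded)  # -> 17, 17, 17, 33, 33, 33, 8, 8, 8, 25, 25, 25

Once the kernel accepts all (1 + max_draft_tokens) query tokens of a sequence in one call, the per-slot duplication, and hence the expanded buffer, becomes unnecessary, which is what the TODO anticipates.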
