-
Notifications
You must be signed in to change notification settings - Fork 589
[API change] deprecate tile_token_dim in trtllm_moe #2086
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: jiahanc <173873397+jiahanc@users.noreply.github.com>
WalkthroughThis PR removes the deprecated Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
/bot run |
Summary of ChangesHello @jiahanc, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request completes the deprecation process for the Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
benchmarks/samples/sample_testlist_output.csvis excluded by!**/*.csv
📒 Files selected for processing (9)
benchmarks/README.md(1 hunks)benchmarks/bench_trtllm_gen_fused_moe_autotuner.py(0 hunks)benchmarks/routines/flashinfer_benchmark_utils.py(0 hunks)benchmarks/routines/moe.py(0 hunks)benchmarks/samples/sample_testlist_output.txt(6 hunks)csrc/trtllm_fused_moe_kernel_launcher.cu(0 hunks)flashinfer/fused_moe/core.py(0 hunks)tests/moe/test_trtllm_gen_fused_moe.py(0 hunks)tests/moe/test_trtllm_gen_routed_fused_moe.py(0 hunks)
💤 Files with no reviewable changes (7)
- benchmarks/routines/flashinfer_benchmark_utils.py
- csrc/trtllm_fused_moe_kernel_launcher.cu
- tests/moe/test_trtllm_gen_fused_moe.py
- tests/moe/test_trtllm_gen_routed_fused_moe.py
- benchmarks/bench_trtllm_gen_fused_moe_autotuner.py
- flashinfer/fused_moe/core.py
- benchmarks/routines/moe.py
🧰 Additional context used
🪛 markdownlint-cli2 (0.18.1)
benchmarks/README.md
169-169: Table column count
Expected: 2; Actual: 3; Too many cells, extra data will be missing
(MD056, table-column-count)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Deploy Docs
🔇 Additional comments (1)
benchmarks/samples/sample_testlist_output.txt (1)
359-359: Verify consistency of tile_tokens_dim removal across MOE samples.The sample output shows an inconsistency: TensorRT-LLM MOE test cases (lines 295–339) do not include
tile_tokens_dimin their Namespace arguments, but CUTLASS MOE test cases (lines 359, 368, 377, 386) still showtile_tokens_dim=8.Given the PR's goal to deprecate
tile_tokens_dim, clarify whether this removal is backend-specific (trtllm only) or if CUTLASS sample outputs also need updating.Also applies to: 368-368, 377-377, 386-386
| | `--local_expert_offset` | Offset of local experts in global expert space. Default: 0 | | ||
| | `--local_num_experts` | Number of experts handled by this device. Default: equals num_experts | | ||
| | `--tile_tokens_dim` | Tile dimension for tokens. Default: 8 | | ||
| | `--local_num_experts` | Number of experts handled by this device. Default: equals num_experts | | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix markdown table formatting on line 169.
Line 169 has extra pipe delimiters causing a column count mismatch (expected 2, actual 3). This breaks markdown table rendering.
Apply this diff to fix the formatting:
-| `--local_num_experts` | Number of experts handled by this device. Default: equals num_experts | |
+| `--local_num_experts` | Number of experts handled by this device. Default: equals num_experts |📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| | `--local_num_experts` | Number of experts handled by this device. Default: equals num_experts | | | |
| | `--local_num_experts` | Number of experts handled by this device. Default: equals num_experts | |
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)
169-169: Table column count
Expected: 2; Actual: 3; Too many cells, extra data will be missing
(MD056, table-column-count)
🤖 Prompt for AI Agents
In benchmarks/README.md around line 169, the markdown table row has an extra
pipe delimiter causing a 3-column row while the table expects 2 columns; remove
the surplus pipe(s) so the row contains exactly two cells (option name and
description) and ensure the row follows the same leading/trailing pipe
convention as the table header and other rows for consistent rendering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code rabbit, seems false alarm as that last "|" was extra in the first place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aleozlx You're absolutely right! This is definitely not a false alarm. The extra | and trailing spaces were accidentally introduced in this PR when the tile_tokens_dim row was removed.
The original line had the correct 2-column format:
| `--local_num_experts` | Number of experts handled by this device. Default: equals num_experts |But the new version has an extra pipe creating a malformed 3-column row:
| `--local_num_experts` | Number of experts handled by this device. Default: equals num_experts | |The suggested fix removes that extra content to restore proper markdown table formatting. Please apply the suggestion to fix the table rendering. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jiahanc i misread the suggestion as the code diff earlier, this seems a real syntax error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request effectively deprecates the tile_token_dim parameter from the trtllm_moe API. The changes are applied consistently across the codebase, including the Python API in flashinfer/fused_moe/core.py, benchmark scripts, tests, and documentation. The removal of the parameter and its associated logic simplifies the code and aligns with the stated goal. I've found one minor formatting issue in the documentation that should be addressed.
| | `--local_expert_offset` | Offset of local experts in global expert space. Default: 0 | | ||
| | `--local_num_experts` | Number of experts handled by this device. Default: equals num_experts | | ||
| | `--tile_tokens_dim` | Tile dimension for tokens. Default: 8 | | ||
| | `--local_num_experts` | Number of experts handled by this device. Default: equals num_experts | | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There seems to be a formatting issue in this line. It has a lot of trailing whitespace and an extra pipe character at the end. This was likely introduced accidentally when removing the --tile_tokens_dim line.
| | `--local_num_experts` | Number of experts handled by this device. Default: equals num_experts | | | |
| | `--local_num_experts` | Number of experts handled by this device. Default: equals num_experts | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reviewed, looks good
yzh119
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For next release, it should be v0.6.0, if we tag v0.5.3, the release branch should skip this commit.
<!-- .github/pull_request_template.md --> ## 📌 Description Deprecate `tile_token_dim` in trtllm_moe. It is already not used and mark with deprecation warning, plan to deprecate totally in next major release <!-- What does this PR do? Briefly describe the changes and why they’re needed. --> ## 🔍 Related Issues <!-- Link any related issues here --> ## 🚀 Pull Request Checklist Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete. ### ✅ Pre-commit Checks - [x] I have installed `pre-commit` by running `pip install pre-commit` (or used your preferred method). - [x] I have installed the hooks with `pre-commit install`. - [x] I have run the hooks manually with `pre-commit run --all-files` and fixed any reported issues. > If you are unsure about how to set up `pre-commit`, see [the pre-commit documentation](https://pre-commit.com/). ## 🧪 Tests - [ ] Tests have been added or updated as needed. - [ ] All tests are passing (`unittest`, etc.). ## Reviewer Notes <!-- Optional: anything you'd like reviewers to focus on, concerns, etc. --> <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Refactor** * Removed the deprecated `tile_tokens_dim` parameter from MOE benchmarks and kernel functions, streamlining API calls and eliminating associated deprecation warnings. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Signed-off-by: jiahanc <173873397+jiahanc@users.noreply.github.com>
📌 Description
Deprecate
tile_token_dimin trtllm_moe. It is already not used and mark with deprecation warning, plan to deprecate totally in next major release🔍 Related Issues
🚀 Pull Request Checklist
Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.
✅ Pre-commit Checks
pre-commitby runningpip install pre-commit(or used your preferred method).pre-commit install.pre-commit run --all-filesand fixed any reported issues.🧪 Tests
unittest, etc.).Reviewer Notes
Summary by CodeRabbit
tile_tokens_dimparameter from MOE benchmarks and kernel functions, streamlining API calls and eliminating associated deprecation warnings.