Commit 4ddf71d
refactor: update dpsk fused_moe test [1] (#2088)
<!-- .github/pull_request_template.md -->
## Description
Refactor the fused MoE test, splitting it by model + precision (a test-skeleton sketch follows the list below).
Part [1]:
- test DeepSeek (Kimi, Lite) FP8 block-scaled fused MoE
- default TP=8
- PDL enabled
- MajorK weight layout
- higher tolerance and match-percentage thresholds
Next, Part [2]:
- add the BlockMajorK weight layout
Next, Part [x]:
- per-tensor FP8 MoE, FP4 MoE
Later:
- refactor llama4, top-k?, renormalize? routing tests
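As a rough illustration of the split-by-model+precision structure, here is a minimal pytest-style sketch. This is not the actual test code: the model shape table, the tolerance values, and the `assert_mostly_close` helper are hypothetical stand-ins for what the real test derives from the model configs and the kernel API.

```python
import pytest
import torch

# Hypothetical per-model MoE shapes: (hidden size, MoE intermediate size,
# number of routed experts, top-k). The real test derives these from the
# DeepSeek / Kimi / Lite model configs.
MODELS = {
    "deepseek_v3":   (7168, 2048, 256, 8),
    "kimi_k2":       (7168, 2048, 384, 8),
    "deepseek_lite": (2048, 1408, 64, 6),
}

def assert_mostly_close(out, ref, atol=1e-1, rtol=2e-1, min_match=0.97):
    """Pass when at least `min_match` of elements fall within tolerance,
    instead of requiring torch.allclose on every element (the "higher
    tolerance and match-percentage" policy). Thresholds here are made up."""
    close = torch.isclose(out.float(), ref.float(), atol=atol, rtol=rtol)
    match = close.float().mean().item()
    assert match >= min_match, f"only {match:.4%} of elements matched"

@pytest.mark.parametrize("model", MODELS.keys())
@pytest.mark.parametrize("num_tokens", [1, 128, 1024])
def test_fp8_block_scale_fused_moe(model, num_tokens):
    hidden, inter, num_experts, top_k = MODELS[model]
    x = torch.randn(num_tokens, hidden, device="cuda", dtype=torch.bfloat16)
    # ... quantize expert weights to FP8 with block scales (MajorK layout),
    # run the fused MoE kernel with PDL enabled under the default TP=8
    # sharding, compute a bf16/fp32 reference, then:
    # assert_mostly_close(out, ref)
```

Keeping one parameterized test per model+precision pair (rather than one monolithic test) is what lets later parts bolt on BlockMajorK, per-tensor FP8, and FP4 without touching the block-scale cases.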
## Related Issues
<!-- Link any related issues here -->
## Pull Request Checklist
Thank you for contributing to FlashInfer! Before we review your pull
request, please make sure the following items are complete.
### Pre-commit Checks
- [x] I have installed `pre-commit` by running `pip install pre-commit`
(or used your preferred method).
- [x] I have installed the hooks with `pre-commit install`.
- [x] I have run the hooks manually with `pre-commit run --all-files`
and fixed any reported issues.
> If you are unsure about how to set up `pre-commit`, see [the
pre-commit documentation](https://pre-commit.com/).
## Tests
- [x] Tests have been added or updated as needed.
- [x] All tests are passing (`unittest`, etc.).
## Reviewer Notes
<!-- Optional: anything you'd like reviewers to focus on, concerns, etc.
-->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Tests**
* Added a comprehensive FP8 block-scale fused Mixture-of-Experts test
validating end-to-end correctness across many routing, expert and
precision configurations. Includes randomized inputs,
per-token/per-expert workflows, extensive parameterizations, diagnostic
statistics, autotune-path checks, and a minimal sanity run.
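To make "FP8 block-scale" concrete, here is a minimal reference sketch of block-scaled quantization. It is an illustration only, not FlashInfer's kernel path: the 128×128 block size and the `quant_fp8_block_scale` name are assumptions, and kernel-side weight layouts such as MajorK are not modeled.

```python
import torch

def quant_fp8_block_scale(w: torch.Tensor, block: int = 128):
    """Reference block-scale quantization: one scale per (block x block)
    tile, chosen so each tile's max magnitude maps to the FP8 E4M3 finite
    max (448). Sketch for illustration; block size is an assumption."""
    rows, cols = w.shape
    assert rows % block == 0 and cols % block == 0
    # View as (row_tiles, block, col_tiles, block) and take per-tile amax.
    tiles = w.reshape(rows // block, block, cols // block, block)
    amax = tiles.abs().amax(dim=(1, 3), keepdim=True).clamp(min=1e-12)
    scale = amax / 448.0
    q = (tiles / scale).clamp(-448.0, 448.0).to(torch.float8_e4m3fn)
    return q.reshape(rows, cols), scale.squeeze(1).squeeze(-1)
```

In a test like this one, the FP8 tensor plus its per-tile scales would feed the fused kernel, while a dequantized copy (each tile multiplied back by its scale) drives the higher-precision reference that the tolerance/match-percentage check compares against.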
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

1 parent: cce4952
1 file changed: +570 -0 lines