[SGLANG] [Benchmarks] Initial integration of sglang kernels to benchmarks #3796
@etiotto Yes, these two depend on the new agama release.
The third-party benchmark passed here: https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/15210819443/job/42784278091
@Egor-Krivov @vlad-penkin Please take another look.
@vlad-penkin If there is no more mature plan for integrating third-party benchmarks in the short term, what do you think of merging this PR for now? It is in good shape and uses the same logic as the existing Liger-Kernel benchmarks.
I agree with merging this PR first and doing the refactoring and restructuring in a separate PR, so that we can start testing the sglang benchmark. @vlad-penkin WDYT?
@vlad-penkin ping.
Port prefill attn and decode attn from sglang
Add validation
temp add extend attention
disable debug ir dump
Update three stage attention benchmark
Add sglang kernel benchmark to action
use 1e-3 atol
remove sglang benchmark from triton-benchmarks
Fix setup bdist_wheel
Add sglang to thirdparty test
Address review comments
Remove sglang from tests
Fix CI
Address review comments
Integrate sglang prefill/decode/extend kernel to benchmarks
Adjust params term
Adjust tflops computation
fix bugs
rtol atol
Move fp8 gemm to sglang benchmark
Address review comments
Fix CI
XPU not found
@vlad-penkin Are there any remaining concerns about merging this PR? The PR is critical for us to maintain SGLang. cc @mingfeima
@Egor-Krivov Please check whether anything is left in this PR that should be merged.
@whitneywhtsang
According to the comment, this should be from SGLang.
Initial enablement of the sglang benchmarks.
Adds the sglang prefill, decode, and extend attention kernels and the fp8 quantized GEMM to the third-party benchmarks.
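The commit list for this PR mentions validation with a 1e-3 absolute tolerance ("use 1e-3 atol", "rtol atol"). As a rough illustration only, and not the code from this PR, a tolerance check of that kind could look like the sketch below. The helper name `all_close`, the default `rtol=0.0`, and the sample values are all hypothetical; the comparison rule `|a - e| <= atol + rtol * |e|` is the convention used by `numpy.allclose` and `torch.allclose`.

```python
import math


def all_close(actual, expected, rtol=0.0, atol=1e-3):
    """Hypothetical helper: elementwise check |a - e| <= atol + rtol * |e|."""
    if len(actual) != len(expected):
        return False
    return all(
        # Reject NaN/inf outputs explicitly, then apply the tolerance rule.
        math.isfinite(a) and abs(a - e) <= atol + rtol * abs(e)
        for a, e in zip(actual, expected)
    )


# Made-up values standing in for a reference result and a kernel result.
reference = [0.1250, -0.5000, 2.0000]
kernel_out = [0.1254, -0.5003, 2.0009]

print(all_close(kernel_out, reference))
```

A purely absolute tolerance is simple to reason about but becomes strict for large-magnitude outputs; adding a nonzero `rtol` relaxes the check in proportion to the reference value.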