Commit 0ecf2a6

Merge branch 'main' into liyang/init_sglang_benchmark
2 parents ead512a + 6b2fa6c commit 0ecf2a6

62 files changed, +2019 -563 lines changed

.github/workflows/build-test.yml

Lines changed: 1 addition & 1 deletion
@@ -92,7 +92,7 @@ jobs:
         run: |
           pip install pytest pytest-xdist defusedxml
           cd scripts
-          pytest -n 4 test_*.py
+          pytest -v -n 4 test_*.py
 
       - name: Save pip cache
         if: ${{ steps.pip-cache.outputs.status == 'miss' }}

.github/workflows/ci.yml

Lines changed: 7 additions & 0 deletions
@@ -1,6 +1,13 @@
 name: Integration Tests
 on:
   workflow_dispatch:
+  pull_request:
+    branches-ignore: ['llvm-**']
+  merge_group:
+    branches: [main, 'dev-**']
+    types: [checks_requested]
+  push:
+    branches: [main]
 concurrency:
   group: ${{ github.ref }}
   cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}

.github/workflows/llvm-build.yml

Lines changed: 3 additions & 0 deletions
@@ -6,6 +6,9 @@ on:
       - llvm-head
     paths:
       - cmake/llvm-hash.txt
+  pull_request:
+    paths:
+      - .github/workflows/llvm-build.yml
   workflow_dispatch:
 
 env:

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+name: Triton benchmarks, BMG
+
+on:
+  pull_request:
+    branches:
+      - main
+    paths:
+      - .github/workflows/triton-benchmarks*.yml
+      - benchmarks/**
+
+jobs:
+  benchmarks:
+    uses: ./.github/workflows/triton-benchmarks.yml
+    with:
+      runner_label: b580
+      skip_benchmarks: "['flash_attention_bwd_benchmark.py','flex_attention_benchmark_custom_masks.py']"

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+name: Triton benchmarks, PVC
+
+on:
+  pull_request:
+    branches:
+      - main
+    paths:
+      - .github/workflows/triton-benchmarks*.yml
+      - benchmarks/**
+
+jobs:
+  benchmarks:
+    uses: ./.github/workflows/triton-benchmarks.yml
+    with:
+      runner_label: max1550
+      skip_benchmarks: "[]"

.github/workflows/triton-benchmarks.yml

Lines changed: 10 additions & 6 deletions
@@ -44,12 +44,16 @@ on:
         type: boolean
         default: false
 
-  pull_request:
-    branches:
-      - main
-    paths:
-      - .github/workflows/triton-benchmarks.yml
-      - benchmarks/**
+  # This workflow is also called from workflows triton-benchmarks-*.yml.
+  workflow_call:
+    inputs:
+      runner_label:
+        description: Runner label
+        type: string
+      skip_benchmarks:
+        description: JSON list of benchmarks to skip
+        type: string
+        default: "[]"
 
 # Cancels in-progress PR runs when the PR is updated. Manual runs are never cancelled.
 concurrency:
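
The new `skip_benchmarks` input is described as a JSON list passed as a string. As a rough illustration only, the sketch below shows one way such a string could be decoded and applied to a list of benchmark scripts. The helper `select_benchmarks` and the file names `fused_softmax.py` and `gemm_benchmark.py` are hypothetical and are not taken from this commit; how the workflow itself consumes the input is not shown in this hunk. Note that Python's `json.loads` is strict and only accepts double-quoted strings, so the example uses double quotes.

```python
import json


def select_benchmarks(all_benchmarks, skip_benchmarks="[]"):
    """Return the benchmarks to run, dropping any named in the JSON skip list.

    `skip_benchmarks` mirrors the workflow input: a JSON list encoded as a string,
    defaulting to "[]" (skip nothing). This helper is a hypothetical illustration.
    """
    skip = json.loads(skip_benchmarks)
    if not isinstance(skip, list) or not all(isinstance(s, str) for s in skip):
        raise ValueError("skip_benchmarks must be a JSON list of strings")
    skipped = set(skip)
    # Preserve the original ordering of the benchmark list.
    return [b for b in all_benchmarks if b not in skipped]


# Hypothetical benchmark list; only the middle name appears in this commit.
benchmarks = [
    "fused_softmax.py",
    "flash_attention_bwd_benchmark.py",
    "gemm_benchmark.py",
]
to_run = select_benchmarks(benchmarks, '["flash_attention_bwd_benchmark.py"]')
print(to_run)
```

Keeping the default at `"[]"` means a caller that passes nothing runs every benchmark, which matches the default declared for the input above.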

.github/workflows/wheels.yml

Lines changed: 3 additions & 0 deletions
@@ -1,6 +1,9 @@
 name: Wheels
 on:
   workflow_dispatch:
+  pull_request:
+    paths:
+      - .github/workflows/wheels.yml
   schedule:
     - cron: "0 8 * * *"
 

.gitignore

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ llvm-project-*/
 
 # Triton Python module builds
 dist/
-triton*.egg-info/
+python/triton*.egg-info/
 *.whl
 python/triton_kernels/triton*.egg-info/
 
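
The `.gitignore` change narrows the pattern: a gitignore pattern whose only slash is the trailing one matches a directory of that name at any depth, while a pattern with a slash in the middle is matched relative to the directory containing the `.gitignore`. The sketch below is a deliberately simplified model of those two rules (it ignores `**`, negation, and the fact that everything under an ignored directory is also ignored). The function `matches_dir` and the example paths are hypothetical.

```python
from fnmatch import fnmatchcase


def matches_dir(pattern, dir_path):
    """Simplified gitignore check: does directory `dir_path` match `pattern`?

    Only models two rules: patterns without an inner slash match the directory
    name at any depth; patterns with an inner slash are anchored to the root
    and `*` does not cross a `/`.
    """
    pat = pattern.rstrip("/")
    parts = dir_path.strip("/").split("/")
    if "/" in pat:
        pat_parts = pat.strip("/").split("/")
        # Anchored: compare component by component so `*` never spans a slash.
        return len(pat_parts) == len(parts) and all(
            fnmatchcase(name, p) for name, p in zip(parts, pat_parts)
        )
    # Unanchored: only the directory's own name has to match.
    return fnmatchcase(parts[-1], pat)


old, new = "triton*.egg-info/", "python/triton*.egg-info/"
for path in ["triton.egg-info", "python/triton.egg-info",
             "python/triton_kernels/triton.egg-info"]:
    print(path, matches_dir(old, path), matches_dir(new, path))
```

Under this model the old pattern matches all three example directories, while the new one matches only `python/triton.egg-info`, which is consistent with `python/triton_kernels/triton*.egg-info/` being listed as its own entry.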

include/triton/Dialect/TritonGPU/IR/LinearLayoutConversions.h

Lines changed: 4 additions & 0 deletions
@@ -271,6 +271,10 @@ LinearLayout chooseDsReadB64TrLayout(Attribute enc, ArrayRef<int64_t> shape,
 LinearLayout getScaleTMEMStoreLinearLayout(RankedTensorType scaleType,
                                            int numWarps);
 
+std::optional<LinearLayout>
+getTmemLoadStoreLayout16x256(int M, int N, RankedTensorType oldType,
+                             int numWarps);
+
 // Return a layout valid for TMemLoad op for a tmem layout of block MxN that
 // distribute the data long M for the warp groups. This doesn't affect the TMem
 // layout it just returns a distributed layout compatible for tmem_load.

include/triton/Tools/Sys/GetEnv.hpp

Lines changed: 1 addition & 0 deletions
@@ -44,6 +44,7 @@ inline const std::set<std::string> CACHE_INVALIDATING_ENV_VARS = {
     "STORE_TMEM_TO_GLOBAL_BYPASS_SMEM",
     "ALLOW_LHS_TMEM_LAYOUT_CONVERSION",
     "TRITON_F32_DEFAULT",
+    "TRITON_PREFER_TMEM_16x256_LAYOUT",
     "TRITON_INTEL_ADVANCED_PATH",
     "TRITON_INTEL_AGGRESSIVE_DPAS_REUSE",
     "TRITON_INTEL_DO_NOT_SINK_INSTR_ACROSS_RGN",

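The name `CACHE_INVALIDATING_ENV_VARS` suggests that the listed variables affect compilation output and therefore need to influence the compilation cache. The sketch below illustrates that general idea only: it folds the values of such variables into a hash-based cache key. The function `cache_key` and the hashing scheme are assumptions made for illustration and are not the project's actual implementation; the set shown is a subset of the names in the hunk above.

```python
import hashlib
import os

# Subset of the names visible in the hunk above; the real set is longer.
CACHE_INVALIDATING_ENV_VARS = {
    "TRITON_F32_DEFAULT",
    "TRITON_PREFER_TMEM_16x256_LAYOUT",
    "TRITON_INTEL_ADVANCED_PATH",
}


def cache_key(source, env=None):
    """Hypothetical cache key: hash the source plus every cache-invalidating variable."""
    env = os.environ if env is None else env
    digest = hashlib.sha256(source.encode("utf-8"))
    # Sort the names so the key does not depend on set iteration order.
    for name in sorted(CACHE_INVALIDATING_ENV_VARS):
        value = env.get(name, "")
        digest.update(f"\0{name}={value}".encode("utf-8"))
    return digest.hexdigest()


base = cache_key("kernel source", env={})
flipped = cache_key("kernel source", env={"TRITON_PREFER_TMEM_16x256_LAYOUT": "1"})
unrelated = cache_key("kernel source", env={"HOME": "/tmp"})
print(base != flipped, base == unrelated)
```

In this model, registering `TRITON_PREFER_TMEM_16x256_LAYOUT` means toggling it yields a different key, so artifacts built under the other setting are not reused, while variables outside the set leave the key unchanged.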