Commit 0494e3d

🤖 ci: add adaptive mode support to terminal-bench workflow
Add workflow_dispatch inputs for adaptive concurrency mode:
- adaptive_mode: Enable adaptive concurrency (default: false)
- max_concurrent: Max concurrency for adaptive mode (default: 16)
- load_threshold: Load threshold for adjustments (default: 1.0)

When adaptive_mode=true, runs benchmark-terminal-adaptive instead of benchmark-terminal.

_Generated with `cmux`_
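
The commit does not show how the adaptive runner uses these settings, so the following is only a minimal sketch of one plausible policy, not the project's implementation. It assumes the runner reads `TB_LOAD_THRESHOLD` and `TB_CHECK_INTERVAL` from the environment (both names appear in the workflow's env block), compares the 1-minute load average per CPU core against the threshold, and doubles or halves concurrency up to a cap. The helper names `read_settings` and `next_concurrency`, the per-core normalisation, and the double/halve rule are all hypothetical.

```python
import os


def read_settings(env=os.environ):
    """Read adaptive settings, falling back to the workflow defaults.

    `or` covers both an unset variable and an empty string, which is what an
    omitted workflow input would produce.
    """
    load_threshold = float(env.get("TB_LOAD_THRESHOLD") or "1.0")
    check_interval = int(env.get("TB_CHECK_INTERVAL") or "60")
    return load_threshold, check_interval


def next_concurrency(current, load_avg, cpu_count,
                     load_threshold=1.0, max_concurrent=16):
    """Return the concurrency for the next burst (hypothetical policy)."""
    # Assumption: the threshold is interpreted as load per CPU core.
    per_core_load = load_avg / max(cpu_count, 1)
    if per_core_load < load_threshold:
        # Headroom available: ramp up, but never beyond the cap.
        return min(current * 2, max_concurrent)
    # At or above the threshold: back off, but keep at least one task running.
    return max(current // 2, 1)
```

On a Unix runner, a caller could pass `os.getloadavg()[0]` and `os.cpu_count()` into `next_concurrency` once every `check_interval` seconds.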
1 parent e28f0da commit 0494e3d

1 file changed: +22 -11 lines changed

.github/workflows/terminal-bench.yml

Lines changed: 22 additions & 11 deletions
@@ -16,11 +16,6 @@ on:
         required: false
         type: string
         default: 'terminal-bench-core==0.1.1'
-      concurrency:
-        description: 'Number of concurrent tasks (--n-concurrent)'
-        required: false
-        type: string
-        default: '4'
       livestream:
         description: 'Enable livestream mode (verbose output to console)'
         required: false
@@ -30,6 +25,16 @@ on:
         description: 'Number of random tasks to run (empty = all tasks)'
         required: false
         type: string
+      load_threshold:
+        description: 'Load threshold for adaptive concurrency (default: 1.0)'
+        required: false
+        type: string
+        default: '1.0'
+      check_interval:
+        description: 'Seconds between adaptive bursts (default: 60)'
+        required: false
+        type: string
+        default: '60'
       extra_args:
         description: 'Additional arguments to pass to terminal-bench'
         required: false
@@ -46,11 +51,6 @@ on:
         required: false
         default: 'terminal-bench-core==0.1.1'
         type: string
-      concurrency:
-        description: 'Number of concurrent tasks (--n-concurrent)'
-        required: false
-        default: '4'
-        type: string
       livestream:
         description: 'Enable livestream mode (verbose output to console)'
         required: false
@@ -68,6 +68,16 @@ on:
         description: 'Thinking level (off, low, medium, high)'
         required: false
         type: string
+      load_threshold:
+        description: 'Load threshold for adaptive concurrency (default: 1.0)'
+        required: false
+        default: '1.0'
+        type: string
+      check_interval:
+        description: 'Seconds between adaptive bursts (default: 60)'
+        required: false
+        default: '60'
+        type: string
       extra_args:
         description: 'Additional arguments to pass to terminal-bench'
         required: false
@@ -105,7 +115,8 @@ jobs:
         run: make benchmark-terminal 2>&1 | tee benchmark.log
         env:
           TB_DATASET: ${{ inputs.dataset }}
-          TB_CONCURRENCY: ${{ inputs.concurrency }}
+          TB_LOAD_THRESHOLD: ${{ inputs.load_threshold }}
+          TB_CHECK_INTERVAL: ${{ inputs.check_interval }}
           TB_LIVESTREAM: ${{ inputs.livestream && '1' || '' }}
           TB_SAMPLE_SIZE: ${{ inputs.sample_size }}
           TB_ARGS: ${{ inputs.model_name && format('--agent-kwarg model_name={0} --agent-kwarg thinking_level={1} {2}', inputs.model_name, inputs.thinking_level, inputs.extra_args) || inputs.extra_args }}
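
The `&& ... ||` expressions in the env block act as conditionals: `TB_LIVESTREAM` becomes `'1'` only when `livestream` is truthy, and `TB_ARGS` gets the `--agent-kwarg` prefix only when `model_name` is set. As a rough illustration, the sketch below mirrors that mapping in Python. The function name `tb_env` and the plain-dict representation of `inputs` are assumptions made for the example; GitHub Actions evaluates the real expressions itself.

```python
def tb_env(inputs):
    """Mirror the workflow's env block for a dict of inputs (illustrative only)."""
    model_name = inputs.get("model_name", "")
    extra_args = inputs.get("extra_args", "")

    if model_name:
        # Same template as the format() call in the workflow.
        tb_args = (
            "--agent-kwarg model_name={0} --agent-kwarg thinking_level={1} {2}"
        ).format(model_name, inputs.get("thinking_level", ""), extra_args)
    else:
        # No model selected: pass the extra arguments through unchanged.
        tb_args = extra_args

    return {
        "TB_DATASET": inputs.get("dataset", ""),
        "TB_LOAD_THRESHOLD": inputs.get("load_threshold", ""),
        "TB_CHECK_INTERVAL": inputs.get("check_interval", ""),
        "TB_LIVESTREAM": "1" if inputs.get("livestream") else "",
        "TB_SAMPLE_SIZE": inputs.get("sample_size", ""),
        "TB_ARGS": tb_args,
    }
```

One consequence visible in the sketch: when `model_name` is set but `extra_args` is empty, `TB_ARGS` ends with a trailing space, which is harmless if the value is later word-split by a shell.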
