Commit ecdbdde

include task id in error output
1 parent 161ecdd commit ecdbdde

3 files changed: +10 -6 lines changed


.agents/base2/base2-gpt-5-planner.ts

Lines changed: 2 additions & 3 deletions
@@ -42,9 +42,8 @@ The user asks you to implement a new feature. You respond in multiple steps:
 1a. Read all the relevant files using the read_files tool.
 2. Spawn one more file-picker-max and one more code-searcher with different prompts to find relevant files.
 2a. Read all the relevant files using the read_files tool.
-3. Spawn a base2-gpt-5 agent inline (with spawn_agent_inline tool) to generate a plan for the changes.
-4. Gather any additional context you need with sub-agents and the read_files tool.
-5. Create a plan for the changes, but do not implement it yet!
+3. Gather any additional context you need with sub-agents and the read_files tool.
+4. Write out a plan for the changes, but do not implement it yet!
 
 For your plan:
 - You do not have access to tools to modify files (e.g. the write_file or str_replace tools). You are describing changes that should be made or actions that should be taken.

evals/buffbench/agent-runner.ts

Lines changed: 8 additions & 2 deletions
@@ -55,11 +55,17 @@ export async function runAgentOnCommit({
     agentDefinitions: localAgentDefinitions,
     cwd: repoDir,
     handleEvent: (event) => {
-      if (event.type === 'tool_call' && event.toolName === 'set_messages') {
+      if (
+        (event.type === 'tool_call' || event.type === 'tool_result') &&
+        event.toolName === 'set_messages'
+      ) {
         return
       }
       if (event.type === 'error') {
-        console.error(`[${agentId}] Error event:`, event.message)
+        console.error(
+          `[${commit.id}:${agentId}] Error event:`,
+          event.message,
+        )
       }
       trace.push(event)
     },
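
The handler change above can be exercised in isolation with a minimal sketch. The `AgentEvent` type, the `makeHandleEvent` factory, and the injectable `logError` parameter are hypothetical names introduced only for illustration. The diff does not show the real event types, so the shapes below are assumptions rather than the SDK's actual API.

```typescript
// Hypothetical event shape; the real event types are not shown in this diff.
type AgentEvent =
  | { type: 'tool_call'; toolName: string }
  | { type: 'tool_result'; toolName: string }
  | { type: 'error'; message: string }
  | { type: 'text'; text: string }

// Hypothetical factory mirroring the diff's logic: drop set_messages calls and
// results, log errors with a "commitId:agentId" prefix, and record the rest.
function makeHandleEvent(
  commitId: string,
  agentId: string,
  trace: AgentEvent[],
  logError: (...args: unknown[]) => void = console.error,
) {
  return (event: AgentEvent): void => {
    if (
      (event.type === 'tool_call' || event.type === 'tool_result') &&
      event.toolName === 'set_messages'
    ) {
      // Skipped events never reach the trace.
      return
    }
    if (event.type === 'error') {
      logError(`[${commitId}:${agentId}] Error event:`, event.message)
    }
    // Error events are logged and still recorded in the trace.
    trace.push(event)
  }
}
```

Passing the logger as a parameter is a design choice of this sketch, not of the commit. It lets the prefix format be checked without capturing `console.error`.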

evals/buffbench/run-buffbench.ts

Lines changed: 0 additions & 1 deletion
@@ -288,7 +288,6 @@ export async function runBuffBench(options: {
 
   const logFiles = fs.readdirSync(logsDir)
 
-  console.log('\n=== Running Meta-Analysis ===')
   const metaAnalysis = await analyzeAllTasks({
     client,
     logsDir,
