burtenshaw HF Staff committed on
Commit 847690b · verified · 1 Parent(s): 602d01e

Add Terminal-Bench evaluation result (33.4%)

Files changed (1)
  1. .eval_results/terminal_bench.yaml +11 -0
.eval_results/terminal_bench.yaml ADDED
@@ -0,0 +1,11 @@
+ - dataset:
+     id: harborframework/terminal-bench-2.0
+     task_id: terminal_bench
+   value: 33.4
+   date: '2026-01-28'
+   source:
+     url: https://www.tbench.ai/leaderboard/terminal-bench/2.0
+     name: Terminal-Bench Leaderboard
+     user: burtenshaw
+   notes: "agent: Terminus 2"
+
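
A minimal sketch of how a reviewer might sanity-check an entry like the one added above before merging. The field names come from the file in this commit. The `check_entry` helper and its rules (percentage range, ISO date, https URL) are hypothetical and are not an official schema. The entry is transcribed by hand as a Python dict so the sketch needs no YAML parser, and the nesting of `id`, `task_id`, `url`, `name`, and `user` is assumed.

```python
from datetime import date

# The entry added in this commit, written as the structure a YAML loader
# would be expected to produce (nesting is an assumption, see lead-in).
entry = {
    "dataset": {
        "id": "harborframework/terminal-bench-2.0",
        "task_id": "terminal_bench",
    },
    "value": 33.4,
    "date": "2026-01-28",  # quoted in the YAML, so it stays a string
    "source": {
        "url": "https://www.tbench.ai/leaderboard/terminal-bench/2.0",
        "name": "Terminal-Bench Leaderboard",
        "user": "burtenshaw",
    },
    "notes": "agent: Terminus 2",
}


def check_entry(e):
    """Return a list of problems found (hypothetical checks, not an official schema)."""
    problems = []
    # Both dataset identifiers must be present and non-empty.
    for key in ("id", "task_id"):
        if not e.get("dataset", {}).get(key):
            problems.append(f"missing dataset.{key}")
    # The score is treated as a percentage here.
    value = e.get("value")
    if (
        not isinstance(value, (int, float))
        or isinstance(value, bool)
        or not 0 <= value <= 100
    ):
        problems.append("value should be a percentage between 0 and 100")
    # The date must parse as YYYY-MM-DD.
    try:
        date.fromisoformat(str(e.get("date")))
    except ValueError:
        problems.append("date should be ISO formatted (YYYY-MM-DD)")
    # The source should link to an https page.
    if not str(e.get("source", {}).get("url", "")).startswith("https://"):
        problems.append("source.url should be an https URL")
    return problems


print(check_entry(entry))
```

An empty list means the entry passed every check in this sketch. A real consumer of `.eval_results` files may apply different or stricter rules.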