hlnchen committed on
Commit c1b7cc8 · verified · 1 Parent(s): 61be7e7

Align task_ids to dataset: overall -> chi_bench (per-domain ids unchanged)

Files changed (1)
  1. .eval_results/chi-bench.yaml +1 -1
.eval_results/chi-bench.yaml CHANGED
@@ -5,7 +5,7 @@
  # (Ties Hermes at 18.7 overall; OAI Agents wins on reliability pass^3 12.0 vs 10.7.)
  - dataset:
  id: actava/chi-bench
- task_id: overall
+ task_id: chi_bench
  value: 18.7
  date: "2026-05-08"
  source: