Wanghan Xu committed on
Commit 897c202 · verified · 1 parent: f71bb5e

Simplify ResearchClawBench eval notes

.eval_results/researchclawbench.yaml CHANGED
@@ -3,7 +3,7 @@
  task_id: overall
  value: 13.96
  date: "2026-04-15"
- notes: "ResearchHarness: https://huggingface.co/spaces/InternScience/ResearchHarness; ResearchClawBench: https://huggingface.co/datasets/InternScience/ResearchClawBench; tools enabled; code execution; file-system workspace; completed 39/40 tasks"
+ notes: "ResearchHarness evaluation with tools enabled, code execution, and a file-system workspace; completed 39/40 ResearchClawBench tasks."
  source:
  url: https://huggingface.co/moonshotai/Kimi-K2.5
  name: Model Card
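
Both the old and the new `notes` strings carry the task completion count as free text ("completed 39/40"). The following is a minimal sketch, not part of this commit, of how a consumer of the file might recover that count after the rewording. The function name `completion_counts` and the regular expression are assumptions made for illustration; the repository may not parse this field at all.

```python
import re

# The new notes string, copied verbatim from the "+" line of the diff.
NOTES = (
    "ResearchHarness evaluation with tools enabled, code execution, "
    "and a file-system workspace; completed 39/40 ResearchClawBench tasks."
)


def completion_counts(notes: str) -> tuple[int, int]:
    """Return (completed, total) from a 'completed N/M' phrase.

    Hypothetical helper: it relies only on the 'completed N/M' wording
    that appears in both versions of the notes field.
    """
    match = re.search(r"completed\s+(\d+)/(\d+)", notes)
    if match is None:
        raise ValueError("no 'completed N/M' phrase found in notes")
    return int(match.group(1)), int(match.group(2))


completed, total = completion_counts(NOTES)
print(f"{completed}/{total} tasks = {completed / total:.1%}")
# prints "39/40 tasks = 97.5%"
```

Because the pattern matches only the "completed N/M" phrase, it gives the same result for the old semicolon-separated notes and the simplified sentence.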