burtenshaw (HF Staff) committed
Commit 00c6b9d · verified · 1 Parent(s): 12c7a75

Fix task_id to diamond (matching benchmark eval.yaml)

Files changed (1)
  1. .eval_results/gpqa.yaml +2 -1
.eval_results/gpqa.yaml CHANGED
@@ -1,8 +1,9 @@
 - dataset:
     id: Idavidrein/gpqa
-    task_id: gpqa_diamond
+    task_id: diamond
   value: 71.5
   date: '2026-01-27'
   source:
     url: https://huggingface.co/deepseek-ai/DeepSeek-R1
     name: Model Card
+    user: burtenshaw
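
The kind of mismatch this commit fixes could be caught with a small check. The following is a minimal sketch, not part of the commit: it assumes the benchmark's eval.yaml declares a set of task ids, of which only `diamond` is confirmed by the commit message. The names `BENCHMARK_TASK_IDS` and `invalid_task_ids` are hypothetical, and the entries are written as already-parsed dictionaries rather than loaded from the YAML file.

```python
# Hypothetical: task ids declared by the benchmark's eval.yaml.
# Only "diamond" is confirmed by the commit message; the real file may list others.
BENCHMARK_TASK_IDS = {"diamond"}


def invalid_task_ids(entries, allowed):
    """Return the task_id values in `entries` that are not in `allowed`."""
    return [
        entry["dataset"]["task_id"]
        for entry in entries
        if entry["dataset"]["task_id"] not in allowed
    ]


# The result entry before and after this commit, as parsed dictionaries.
before = [{"dataset": {"id": "Idavidrein/gpqa", "task_id": "gpqa_diamond"}, "value": 71.5}]
after = [{"dataset": {"id": "Idavidrein/gpqa", "task_id": "diamond"}, "value": 71.5}]

print(invalid_task_ids(before, BENCHMARK_TASK_IDS))  # ['gpqa_diamond']
print(invalid_task_ids(after, BENCHMARK_TASK_IDS))   # []
```

With the old value the check reports `gpqa_diamond` as unknown; with the new value it reports nothing.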