Commit 122f43f (verified) by SeaWolf-AI · 1 parent: aa76bb8

Add GPQA Diamond eval result (89.39, Darwin-DELPHI)

Files changed (1)
  1. .eval_results/gpqa_diamond.yaml +9 -0
.eval_results/gpqa_diamond.yaml ADDED
@@ -0,0 +1,9 @@
+ - dataset:
+     id: Idavidrein/gpqa
+     task_id: diamond
+   value: 89.39
+   date: "2026-05-17"
+   source:
+     url: https://huggingface.co/FINAL-Bench/Darwin-28B-REASON
+     name: Darwin-28B-REASON Benchmark (Darwin-DELPHI)
+     user: vidraft
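
For reviewers who want to sanity-check an entry like this before merging, the following is a minimal sketch, not part of the commit. It assumes the entry has already been loaded into a Python dict; the helper name `check_entry`, the 0 to 100 score range, and the https requirement are illustrative assumptions, not rules taken from any official schema.

```python
from datetime import date

# Hypothetical in-memory copy of the entry added in this commit.
entry = {
    "dataset": {"id": "Idavidrein/gpqa", "task_id": "diamond"},
    "value": 89.39,
    "date": "2026-05-17",
    "source": {
        "url": "https://huggingface.co/FINAL-Bench/Darwin-28B-REASON",
        "name": "Darwin-28B-REASON Benchmark (Darwin-DELPHI)",
        "user": "vidraft",
    },
}


def check_entry(e):
    """Return a list of problems found in one eval-result entry (empty if none)."""
    problems = []
    # Both dataset keys must be present and non-empty.
    for key in ("id", "task_id"):
        if not e.get("dataset", {}).get(key):
            problems.append(f"missing dataset.{key}")
    # Assumption: the score is a percentage, so it should lie in [0, 100].
    value = e.get("value")
    if (
        not isinstance(value, (int, float))
        or isinstance(value, bool)
        or not 0 <= value <= 100
    ):
        problems.append("value should be a number between 0 and 100")
    # The date is stored as a quoted string; require ISO format (YYYY-MM-DD).
    try:
        date.fromisoformat(str(e.get("date")))
    except ValueError:
        problems.append("date should be ISO formatted (YYYY-MM-DD)")
    # Assumption: the source link should be an https URL.
    if not str(e.get("source", {}).get("url", "")).startswith("https://"):
        problems.append("source.url should be an https URL")
    return problems


problems = check_entry(entry)
```

An empty `problems` list means the entry passed these illustrative checks; it says nothing about whether the reported score itself is reproducible.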