smirki committed on
Commit 8950614 · verified · 1 Parent(s): fde9b80

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -44,7 +44,7 @@ model-index:
  metrics:
  - name: Pass Rate
  type: accuracy
- value: 28.0
+ value: 28.1
  ---
 
  <div align="center">
@@ -91,11 +91,11 @@ The model shows strong agentic behavior: it recovers from errors (read-before-wr
  | **AIME 2025** (pass@5) | 90 | | | | 91.7 | 91.6 | | |
  | **GPQA Diamond** (pass@1) | **83.8** | 81.7 | 77.2 | 80.1 | 71.5 | | | 73 |
  | **GPQA Diamond** (pass@3) | **86.4** | | | | | | | |
- | **Terminal-Bench 2.0** | **28** | 20 | | | | | 33.4 | 27 |
+ | **Terminal-Bench 2.0** | **28.1** | 20 | | | | | 33.4 | 27 |
 
  </div>
 
- > OmniCoder-9B achieves **83.8** on GPQA Diamond pass@1 (vs Qwen3.5-9B's 81.7), **86.4** at pass@3, and **28** on Terminal-Bench 2.0 (vs base model's 20, a 40% improvement).
+ > OmniCoder-9B achieves **83.8** on GPQA Diamond pass@1 (vs Qwen3.5-9B's 81.7), **86.4** at pass@3, and **28.1** on Terminal-Bench 2.0 (vs base model's 20, a 40% improvement).
 
  ---
 
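
The "40% improvement" in the blockquote is a relative gain over the base model's Terminal-Bench 2.0 score. A minimal sketch of that arithmetic follows, assuming both scores are plain percentages on the same scale. The helper name `relative_improvement` is hypothetical and not part of the README.

```python
def relative_improvement(new: float, base: float) -> float:
    """Return the percentage gain of `new` over `base`."""
    return (new - base) / base * 100.0


# Scores taken from the diff: the base model scores 20, and the
# OmniCoder-9B score changes from 28.0 to 28.1 in this commit.
before = relative_improvement(28.0, 20.0)
after = relative_improvement(28.1, 20.0)
print(f"before: {before:.1f}%  after: {after:.1f}%")
```

With the updated score the gain comes to about 40.5%, so the rounded "40%" wording still holds after this commit.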