Upload README.md with huggingface_hub
README.md CHANGED
@@ -44,7 +44,7 @@ model-index:
 metrics:
 - name: Pass Rate
 type: accuracy
-value: 28.
+value: 28.1
 ---

 <div align="center">
@@ -91,11 +91,11 @@ The model shows strong agentic behavior: it recovers from errors (read-before-wr
 | **AIME 2025** (pass@5) | 90 | | | | 91.7 | 91.6 | | |
 | **GPQA Diamond** (pass@1) | **83.8** | 81.7 | 77.2 | 80.1 | 71.5 | | | 73 |
 | **GPQA Diamond** (pass@3) | **86.4** | | | | | | | |
-| **Terminal-Bench 2.0** | **28** | 20 | | | | | 33.4 | 27 |
+| **Terminal-Bench 2.0** | **28.1** | 20 | | | | | 33.4 | 27 |

 </div>

-> OmniCoder-9B achieves **83.8** on GPQA Diamond pass@1 (vs Qwen3.5-9B's 81.7), **86.4** at pass@3, and **28** on Terminal-Bench 2.0 (vs base model's 20, a 40% improvement).
+> OmniCoder-9B achieves **83.8** on GPQA Diamond pass@1 (vs Qwen3.5-9B's 81.7), **86.4** at pass@3, and **28.1** on Terminal-Bench 2.0 (vs base model's 20, a 40% improvement).

 ---

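The "40% improvement" in the summary line reads as a relative gain over the base model's Terminal-Bench 2.0 score. A minimal sketch of that arithmetic follows, assuming both scores are on the same 0-100 scale; the helper name `relative_improvement` is hypothetical and not part of the model card.

```python
def relative_improvement(new: float, base: float) -> float:
    """Return the relative gain of `new` over `base`, in percent."""
    if base == 0:
        raise ValueError("base score must be non-zero")
    return (new - base) / base * 100.0


# Scores taken from the Terminal-Bench 2.0 row in the diff above.
base_score = 20.0      # base model
previous_score = 28.0  # value in the table before this commit
updated_score = 28.1   # value in the table after this commit

print(f"before: {relative_improvement(previous_score, base_score):.1f}%")
print(f"after:  {relative_improvement(updated_score, base_score):.1f}%")
```

With the updated score the relative gain comes out slightly above 40%, so the unchanged "40%" wording is still a reasonable rounding.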