Update README.md
README.md
CHANGED
@@ -99,7 +99,7 @@ The model shows strong agentic behavior: it recovers from errors (read-before-wr

 - **GPQA Diamond pass@1: 83.8%** (166/198). +2.1 points over the Qwen3.5-9B base model (81.7). At pass@3: **86.4** (171/198).
 - **AIME 2025 pass@5: 90%** (27/30).
-- **Terminal-Bench 2.0: 23.6%** (21/89). +8.99 points over the Qwen3.5-9B base model (14.6%, 13/89).
+- **Terminal-Bench 2.0: 23.6%** (21/89). +8.99 points (+61% improvement) over the Qwen3.5-9B base model (14.6%, 13/89).

 ---
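
The Terminal-Bench line in this change quotes both an absolute gain in percentage points and a relative improvement. As a rough check, the sketch below recomputes both from the raw task counts given in the README (21/89 and 13/89). The helper names `pass_rate`, `point_gain`, and `relative_gain` are hypothetical and exist only for this illustration; they are not part of the model repository or any benchmark harness.

```python
from fractions import Fraction


def pass_rate(solved: int, total: int) -> Fraction:
    """Exact pass rate as a fraction of tasks solved."""
    return Fraction(solved, total)


def point_gain(new: Fraction, base: Fraction) -> float:
    """Absolute gain, in percentage points."""
    return float((new - base) * 100)


def relative_gain(new: Fraction, base: Fraction) -> float:
    """Gain relative to the base rate, in percent."""
    return float((new - base) / base * 100)


# Counts taken from the README line: 21/89 (this model) vs 13/89 (base model).
tuned = pass_rate(21, 89)
base = pass_rate(13, 89)

print(f"{point_gain(tuned, base):.2f} points")
print(f"{relative_gain(tuned, base):.1f}% relative")
```

Working from the raw counts rather than the rounded percentages is what yields +8.99 points (23.6 minus 14.6 would give 9.0). On the same counts the relative gain is 8/13, about 61.5%, which the README reports as +61%.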