Update README.md
README.md
CHANGED
@@ -13,7 +13,6 @@ tags:
 A high-precision mathematical and algorithmic reasoning model
 
 [](https://huggingface.co/SVECTOR-CORPORATION/Spec-T1-RL-7B)
-
 
 ## 📋 Model Card
 
@@ -82,6 +81,8 @@ The Spec-T1-RL-7B model demonstrates exceptional performance across reasoning be
 | MMLU-Pro (EM) | 72.6 | 78.0 | 80.3 | 52.0 | 76.4 |
 | IF-Eval (Prompt Strict) | 84.3 | 86.5 | 84.8 | 40.4 | 83.3 |
 
+
+[Math Benchmarks](https://firebasestorage.googleapis.com/v0/b/svector-cloud.appspot.com/o/files%2FMath-Benchmarks.png?alt=media&token=9aad1bd6-ad89-4b8c-9ce7-5cbc2d48177e)
 ### Mathematics
 
 | Benchmark | GPT-4o-0513 | Claude-3.5-Sonnet | OpenAI o1-mini | QwQ-32B | Spec-T1 |