Update README.md
README.md CHANGED
@@ -58,20 +58,18 @@ library_name: transformers
 ---
 
 # **Evaluation**
-Below are the evaluations of the Atlas-Pro models and Deepseek's R1 Qwen Distills (The model that started the whole Atlas family)
-
-
-
-
-| **
-| **
-| **
-| **
-| **
-| **
-| **
-| **Carbon Emissions (kg)** | 0.69 kg | 0.59 kg | 0.68 kg | 0.62 kg | **0.54 kg** |
-| **Parameters** | ~7B | ~1.5B | ~7B | ~1.5B | ~7B |
+Below are the evaluations of the Atlas-Pro models and Deepseek's R1 Qwen Distills (The model that started the whole Atlas family):
+
+| **Metric** | **Spestly Atlas Pro (7B)** | **Spestly Atlas Pro (1.5B)** | DeepSeek-R1-Distill-Qwen (7B) | DeepSeek-R1-Distill-Qwen (1.5B) |
+|-------------------------|---------------------------|------------------------------|-----------------------------------|-------------------------------------|
+| **Average** | **22.65%** | 12.93% | 11.73% | 7.53% |
+| **IFEval** | 31.54% | 24.30% | **40.38%** | 34.63% |
+| **BBH** | **25.27%** | 9.08% | 7.88% | 4.73% |
+| **MATH** | **38.90%** | 25.83% | 0.00% | 0.00% |
+| **GPQA** | **11.63%** | 6.26% | 3.91% | 2.97% |
+| **MUSR** | **6.65%** | 1.86% | 3.55% | 2.08% |
+| **MMLU-Pro** | **21.89%** | 10.28% | 14.68% | 0.78% |
+| **Carbon Emissions (kg)** | 0.69 kg | **0.59 kg** | 0.68 kg | 0.62 kg |
 
 
 
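A quick sanity check on the added table: the **Average** row looks like the unweighted mean of the six benchmark rows (IFEval, BBH, MATH, GPQA, MUSR, MMLU-Pro). The README does not say how the average was computed, so that is an assumption. The sketch below recomputes the mean from the rounded values in the table and compares it with the reported figure. The names `SCORES`, `REPORTED_AVERAGE`, and `compare_averages` are made up for this illustration.

```python
# Benchmark order used for every score list below.
BENCHMARKS = ["IFEval", "BBH", "MATH", "GPQA", "MUSR", "MMLU-Pro"]

# Scores in percent, copied from the table added in this commit.
SCORES = {
    "Spestly Atlas Pro (7B)": [31.54, 25.27, 38.90, 11.63, 6.65, 21.89],
    "Spestly Atlas Pro (1.5B)": [24.30, 9.08, 25.83, 6.26, 1.86, 10.28],
    "DeepSeek-R1-Distill-Qwen (7B)": [40.38, 7.88, 0.00, 3.91, 3.55, 14.68],
    "DeepSeek-R1-Distill-Qwen (1.5B)": [34.63, 4.73, 0.00, 2.97, 2.08, 0.78],
}

# "Average" row as reported in the table (percent).
REPORTED_AVERAGE = {
    "Spestly Atlas Pro (7B)": 22.65,
    "Spestly Atlas Pro (1.5B)": 12.93,
    "DeepSeek-R1-Distill-Qwen (7B)": 11.73,
    "DeepSeek-R1-Distill-Qwen (1.5B)": 7.53,
}


def compare_averages():
    """Return {model: (computed_mean, reported_average)}.

    Assumption: the reported average is an unweighted arithmetic mean
    over the six benchmarks listed in BENCHMARKS.
    """
    result = {}
    for model, scores in SCORES.items():
        computed = sum(scores) / len(scores)
        result[model] = (computed, REPORTED_AVERAGE[model])
    return result


if __name__ == "__main__":
    for model, (computed, reported) in compare_averages().items():
        print(f"{model}: computed {computed:.3f}%, reported {reported:.2f}%")
```

Because the inputs are already rounded to two decimals, a computed mean can differ from the reported one in the last digit. A tolerance of about 0.01 percentage points is a reasonable threshold for calling the rows consistent.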