Update README.md
README.md CHANGED
@@ -65,7 +65,7 @@ Below are the evaluations of the Atlas-Pro models and Deepseek's R1 Qwen Distill
 |-------------------------|---------------------------|------------------------------|-----------------------------------|-------------------------------------|-------------------------------------|
 | **Average** | **22.65%** | 12.93% | 11.73% | 7.53% | 19.17% |
 | **IFEval** | 31.54% | 24.30% | 40.38% | 34.63% | **54.65%** |
-| **BBH** | 25.27
+| **BBH** | **25.27%** | 9.08% | 7.88% | 4.73% | 25.07% |
 | **MATH** | **38.90%** | 25.83% | 0.00% | 0.00% | 3.55% |
 | **GPQA** | **11.63%** | 6.26% | 3.91% | 2.97% | 3.91% |
 | **MUSR** | **6.65%** | 1.86% | 3.55% | 2.08% | 4.30% |