Update README.md
README.md
CHANGED
@@ -59,23 +59,23 @@ To ensure a fair comparison, we conducted experiments with three distinct datase
 |--------------------|----------|----------|----------|------|
 | Qwen3-4B-Base | 24.6 | 25.0 | 90.4 | 44.6 |
 | Qwen3-8B-Base | <u>37.9</u> | <u>29.6</u> | <u>91.1</u> | <u>48.9</u> |
-| **
+| **Nanbeige4-3B-Base** | **52.9** | **40.8** | **93.4** | **53.4** |
 
 * Finetuned with Ring-lite-sft-data
 | Model | AIME2024 | AIME2025 | Math-500 | GPQA |
 |--------------------|----------|----------|----------|------|
 | Qwen3-4B-Base | 40.4 | 31.3 | 93.6 | 51.4 |
 | Qwen3-8B-Base | <u>50.0</u> | <u>35.8</u> | <u>94.4</u> | <u>55.1</u> |
-| **
+| **Nanbeige4-3B-Base** | **56.8** | **45.3** | **95.5** | **57.7** |
 
 * Finetuned with OpenThoughts3
 | Model | AIME2024 | AIME2025 | Math-500 | GPQA |
 |--------------------|----------|----------|----------|------|
 | Qwen3-4B-Base | 52.9 | 42.1 | 93.2 | 49.6 |
 | Qwen3-8B-Base | <u>60.4</u> | <u>47.1</u> | **95.0** | <u>55.3</u> |
-| **
+| **Nanbeige4-3B-Base** | **62.4** | **49.2** | <u>94.6</u> | **56.9** |
 
-The results demonstrate that **
+The results demonstrate that **Nanbeige4-3B-Base** outperforms Qwen3-4B-Base on every benchmark and also surpasses the larger Qwen3-8B-Base on all but one comparison (Math-500 with OpenThoughts3, 94.6 vs. 95.0), highlighting the greater potential of our base model after fine-tuning. This advantage stems from the optimized training recipe used during our Stable stage and the extensive high-quality synthetic data incorporated during the Decay stage.
 
 
 ## <span id="Inference">4. Quickstart</span>
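Review note: the comparison claimed in the added paragraph can be checked directly against the three tables in this hunk. The following is a minimal sketch, not part of the README itself. The scores are copied from the tables above; the names `RESULTS`, `losses`, and `mean_margin` are hypothetical helpers introduced only for illustration, and the first dataset is labeled generically because its name falls outside the hunk.

```python
# Column order used in every table of the hunk.
BENCHMARKS = ["AIME2024", "AIME2025", "Math-500", "GPQA"]

# Scores copied from the three tables shown in the diff.
RESULTS = {
    "first dataset (name not shown in this hunk)": {
        "Qwen3-4B-Base": [24.6, 25.0, 90.4, 44.6],
        "Qwen3-8B-Base": [37.9, 29.6, 91.1, 48.9],
        "Nanbeige4-3B-Base": [52.9, 40.8, 93.4, 53.4],
    },
    "Ring-lite-sft-data": {
        "Qwen3-4B-Base": [40.4, 31.3, 93.6, 51.4],
        "Qwen3-8B-Base": [50.0, 35.8, 94.4, 55.1],
        "Nanbeige4-3B-Base": [56.8, 45.3, 95.5, 57.7],
    },
    "OpenThoughts3": {
        "Qwen3-4B-Base": [52.9, 42.1, 93.2, 49.6],
        "Qwen3-8B-Base": [60.4, 47.1, 95.0, 55.3],
        "Nanbeige4-3B-Base": [62.4, 49.2, 94.6, 56.9],
    },
}


def losses(results, ours, baseline):
    """List every (dataset, benchmark, ours, baseline) where `ours` does not beat `baseline`."""
    found = []
    for dataset, rows in results.items():
        for bench, a, b in zip(BENCHMARKS, rows[ours], rows[baseline]):
            if a <= b:
                found.append((dataset, bench, a, b))
    return found


def mean_margin(results, ours, baseline):
    """Average score difference (ours minus baseline) over all datasets and benchmarks."""
    margins = [
        a - b
        for rows in results.values()
        for a, b in zip(rows[ours], rows[baseline])
    ]
    return sum(margins) / len(margins)


if __name__ == "__main__":
    for baseline in ("Qwen3-4B-Base", "Qwen3-8B-Base"):
        print(baseline, losses(RESULTS, "Nanbeige4-3B-Base", baseline))
        print(baseline, round(mean_margin(RESULTS, "Nanbeige4-3B-Base", baseline), 2))
```

Under these numbers, `losses` returns an empty list against Qwen3-4B-Base and a single entry against Qwen3-8B-Base (Math-500 with OpenThoughts3), which is the one cell where the 8B model is bolded in the table.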