Update README.md
README.md CHANGED
@@ -120,7 +120,7 @@ model-index:
 
 
 
-SmolTulu-
+SmolTulu-1.7b-Instruct is the first model in a series of models meant to leverage [AllenAI's Tulu 3 post-training pipeline](https://allenai.org/blog/tulu-3-technical) to tune the [base version of Huggingface's SmolLM2-1.7b](https://huggingface.co/HuggingFaceTB/SmolLM2-1.7B)! The post training pipeline AllenAI came up with seemed like something perfect to apply here.
 
 This model scores the highest current score in both IFEval and GSM8k while maintaining the extremely low contamination levels in Tulu 3 and SmolLM2! I've listed the datasets used to do both the SFT (supervised finetuning) and DPO (direct preference optimization) stages.
 
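As background for the DPO stage the paragraph mentions: DPO trains the policy directly on preference pairs, without a separate reward model, by pushing the policy's implicit reward margin for the chosen response above that of the rejected one relative to a frozen reference model. A minimal numeric sketch of that objective (illustrative only, not the actual Tulu 3 training code; `beta` is the usual KL-tradeoff hyperparameter):

```python
import math

def dpo_loss(pi_logp_chosen, pi_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one preference pair.

    Inputs are summed log-probabilities of the chosen/rejected
    responses under the policy (pi) and the frozen reference model.
    """
    # Implicit reward of each response, measured relative to the reference
    chosen_margin = pi_logp_chosen - ref_logp_chosen
    rejected_margin = pi_logp_rejected - ref_logp_rejected
    logits = beta * (chosen_margin - rejected_margin)
    # -log sigmoid(logits): small when the policy prefers the chosen response
    return -math.log(1.0 / (1.0 + math.exp(-logits)))
```

When the policy and reference agree (equal margins), the loss sits at log 2; it falls as the policy widens the gap in favor of the chosen response.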