fix todo
README.md (CHANGED)
@@ -42,7 +42,7 @@ The model is a decoder-only transformer using GQA and NoPE (with 3:1 ratio), it
 - **Long context:** Trained on 64k context and supports up to **128k tokens** using YARN extrapolation
 - **Multilingual**: 6 natively supported languages (English, French, Spanish, German, Italian, and Portuguese)
 
-For more details refer to our blog post:
+For more details refer to our blog post: https://hf.co/blog/smollm3
 
 ## How to use
 

@@ -181,7 +181,7 @@ text = tokenizer.apply_chat_template(
 )
 ```
 
-For local inference, you can use `llama.cpp`, `ONNX`, `MLX` and `MLC`. You can find quantized checkpoints in this collection
+For local inference, you can use `llama.cpp`, `ONNX`, `MLX` and `MLC`. You can find quantized checkpoints in this collection (https://huggingface.co/collections/HuggingFaceTB/smollm3-686d33c1fdffe8e635317e23)
 
 ### vLLM and SGLang
 

@@ -338,10 +338,11 @@ The model has also been trained on Arabic (standard), Chinese and Russian data,
 - **Post-training Framework:** [TRL](https://github.com/huggingface/trl)
 
 ### Open resources
-Here is an infographic with all the training details
-- The datasets used for pretraining can be found in this [collection](https://huggingface.co/collections/HuggingFaceTB/smollm3-pretraining-datasets-685a7353fdc01aecde51b1d9) and those used in mid-training and
+Here is an infographic with all the training details
+- The datasets used for pretraining can be found in this [collection](https://huggingface.co/collections/HuggingFaceTB/smollm3-pretraining-datasets-685a7353fdc01aecde51b1d9) and those used in mid-training and post-training will be uploaded later
 - The training and evaluation configs and code can be found in the [huggingface/smollm](https://github.com/huggingface/smollm) repository.
 
+
 
 ## Limitations
 
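The first hunk's long-context line advertises YARN extrapolation from 64k to 128k tokens. As a rough illustration of the underlying idea (uniform position-interpolation scaling of RoPE frequencies; the `dim`, `base`, and `factor` values are assumptions, and YaRN proper treats frequency bands non-uniformly rather than like this sketch):

```python
def rope_inv_freq(dim: int, base: float = 10000.0) -> list[float]:
    """Standard RoPE inverse frequencies for one attention head of size `dim`."""
    return [base ** (-2.0 * i / dim) for i in range(dim // 2)]

def scaled_inv_freq(dim: int, factor: float = 2.0, base: float = 10000.0) -> list[float]:
    """Stretch every rotation frequency by `factor`, so positions up to
    `factor` times the trained context length map back into the range the
    model saw during training. YaRN itself scales low- and high-frequency
    bands differently and adds an attention-temperature correction; this
    uniform version is only the simplest sketch of the concept."""
    return [f / factor for f in rope_inv_freq(dim, base)]
```

With `factor=2.0` this would correspond to doubling a 64k training context to 128k, which is the ratio the README states.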
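The first hunk's header also notes the model uses GQA. A minimal sketch of how grouped-query attention shares K/V heads across query heads (the head counts below are illustrative placeholders, not SmolLM3's actual configuration):

```python
def kv_head_for_query_head(q_head: int, n_q_heads: int = 16, n_kv_heads: int = 4) -> int:
    """In grouped-query attention, the query heads are partitioned into
    contiguous groups that each share a single K/V head, shrinking the KV
    cache by a factor of n_q_heads / n_kv_heads versus full multi-head
    attention."""
    assert n_q_heads % n_kv_heads == 0, "query heads must divide evenly into groups"
    group_size = n_q_heads // n_kv_heads
    return q_head // group_size
```

For example, with 16 query heads and 4 K/V heads, query heads 0 through 3 all attend using K/V head 0.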