eliebak (HF Staff) committed
Commit 90486a3 · verified · 1 Parent(s): f17cc5c
Files changed (1)
  1. README.md +5 -4
README.md CHANGED
@@ -42,7 +42,7 @@ The model is a decoder-only transformer using GQA and NoPE (with 3:1 ratio), it
 - **Long context:** Trained on 64k context and supports up to **128k tokens** using YaRN extrapolation
 - **Multilingual**: 6 natively supported languages (English, French, Spanish, German, Italian, and Portuguese)
 
-For more details, refer to our blog post: TODO
+For more details, refer to our blog post: https://hf.co/blog/smollm3
 
 ## How to use
 
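The context lines in this hunk advertise YaRN extrapolation from the 64k training window to 128k tokens. As a minimal sketch of what enabling that can look like at load time (the repo id, the scaling factor, and the `rope_scaling` override pattern below are assumptions based on transformers' standard RoPE-scaling config for Llama-style models, not something this commit specifies):

```python
# Minimal sketch: load with a YaRN rope_scaling override to extrapolate
# past the 64k training context. Repo id and numbers are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM3-3B"  # assumed repo id, not stated in this diff

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    rope_scaling={
        "rope_type": "yarn",
        "factor": 2.0,  # 64k trained context x2, roughly 128k extrapolated
        "original_max_position_embeddings": 65536,
    },
)
```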
@@ -181,7 +181,7 @@ text = tokenizer.apply_chat_template(
 )
 ```
 
-For local inference, you can use `llama.cpp`, `ONNX`, `MLX` and `MLC`. You can find quantized checkpoints in this collection [TODO].
+For local inference, you can use `llama.cpp`, `ONNX`, `MLX` and `MLC`. You can find quantized checkpoints in this [collection](https://huggingface.co/collections/HuggingFaceTB/smollm3-686d33c1fdffe8e635317e23).
 
 ### vLLM and SGLang
 
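The local-inference line in this hunk names `llama.cpp` among the supported runtimes; here is a minimal sketch using the llama-cpp-python bindings with a quantized GGUF file from the linked collection (the filename and settings are placeholders, not from the commit):

```python
# Minimal sketch: chat inference through llama-cpp-python on a quantized
# GGUF checkpoint. model_path is a placeholder; download a real file from
# the collection linked in the hunk above.
from llama_cpp import Llama

llm = Llama(model_path="SmolLM3-3B-Q4_K_M.gguf", n_ctx=8192)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize what GQA is in one sentence."}],
    max_tokens=128,
    temperature=0.6,
)
print(out["choices"][0]["message"]["content"])
```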
@@ -338,10 +338,11 @@ The model has also been trained on Arabic (standard), Chinese and Russian data,
 - **Post-training Framework:** [TRL](https://github.com/huggingface/trl)
 
 ### Open resources
-Here is an infographic with all the training details [TODO].
-- The datasets used for pretraining can be found in this [collection](https://huggingface.co/collections/HuggingFaceTB/smollm3-pretraining-datasets-685a7353fdc01aecde51b1d9) and those used in mid-training and pos-training can be found here [TODO]
+Here is an infographic with all the training details.
+- The datasets used for pretraining can be found in this [collection](https://huggingface.co/collections/HuggingFaceTB/smollm3-pretraining-datasets-685a7353fdc01aecde51b1d9) and those used in mid-training and post-training will be uploaded later.
 - The training and evaluation configs and code can be found in the [huggingface/smollm](https://github.com/huggingface/smollm) repository.
 
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/651e96991b97c9f33d26bde6/qiE5ZYr9SD1CIAtfEfuC8.png)
 
 ## Limitations
 
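Since the datasets bullet in this hunk points at a Hub collection, here is a small sketch of enumerating it programmatically with `huggingface_hub` (the slug is taken from the diff; the browsing pattern itself is just one way to do it):

```python
# Minimal sketch: list the dataset repos in the pretraining collection
# referenced above, via huggingface_hub's collection API.
from huggingface_hub import get_collection

slug = "HuggingFaceTB/smollm3-pretraining-datasets-685a7353fdc01aecde51b1d9"
collection = get_collection(slug)

for item in collection.items:
    if item.item_type == "dataset":
        print(item.item_id)  # ready to pass to datasets.load_dataset
```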
 