eliebak (HF Staff) committed
Commit c07119c · verified · 1 Parent(s): 52a727b

Update README.md

Files changed (1)
  1. README.md +7 -4
README.md CHANGED
@@ -44,7 +44,7 @@ The model is a decoder-only transformer using GQA and NoPE, it was pretrained on
 - **Long context:** Trained on 64k context and supports up to **128k tokens** using YARN extrapolation
 - **Multilingual**: 6 natively supported (English, French, Spanish, German, Italian, and Portuguese)
 
-For more details refer to our blog post: TODO
+For more details refer to our blog post: https://hf.co/blog/smollm3
 
 ### How to use
 The modeling code for SmolLM3 is available in transformers `v4.53.0`, so make sure to upgrade your transformers version. You can also load the model with the latest `vllm` which uses transformers as a backend.
@@ -65,7 +65,7 @@ outputs = model.generate(inputs)
 print(tokenizer.decode(outputs[0]))
 ```
 
-For local inference, you can use `llama.cpp`, `ONNX`, `MLX` and `MLC`. You can find quantized checkpoints in this collection [TODO].
+For local inference, you can use `llama.cpp`, `ONNX`, `MLX` and `MLC`. You can find quantized checkpoints in this collection (https://huggingface.co/collections/HuggingFaceTB/smollm3-686d33c1fdffe8e635317e23).
 
 ## Evaluation
 
@@ -190,10 +190,13 @@ Evaluation results in reasoning mode for SmolLM3 and Qwen3 models:
 - **Post-training Framework:** [TRL](https://github.com/huggingface/trl)
 
 ### Open resources
-Here is an infographic with all the training details [TODO].
-- The datasets used for pretraining can be found in this [collection](https://huggingface.co/collections/HuggingFaceTB/smollm3-pretraining-datasets-685a7353fdc01aecde51b1d9) and those used in mid-training and post-training can be found here [TODO]
+Here is an infographic with all the training details.
+- The datasets used for pretraining can be found in this [collection](https://huggingface.co/collections/HuggingFaceTB/smollm3-pretraining-datasets-685a7353fdc01aecde51b1d9) and those used in mid-training and post-training will be released in the following weeks
 - The training and evaluation configs and code can be found in the [huggingface/smollm](https://github.com/huggingface/smollm) repository.
 
+
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/651e96991b97c9f33d26bde6/1umKwihvgLPlj5_0xJ42j.png)
+
 ## Limitations
 
 SmolLM3 can produce text on a variety of topics, but the generated content may not always be factually accurate, logically consistent, or free from biases present in the training data. These models should be used as assistive tools rather than definitive sources of information. Users should always verify important information and critically evaluate any generated content.
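
The "How to use" context in this diff says the SmolLM3 modeling code is available in transformers `v4.53.0` and asks readers to upgrade. A minimal sketch of checking that requirement before loading the model, using only the standard library. The helper names (`parse_release`, `meets_minimum`, `installed_transformers_version`) are hypothetical and not part of transformers, and the comparison is an assumption-level simplification that ignores pre-release suffixes such as `rc1` or `.dev0`.

```python
from importlib import metadata

# Minimum version named in the README's "How to use" section.
MIN_TRANSFORMERS = "4.53.0"


def parse_release(version: str) -> tuple:
    """Turn '4.53.0' or '4.54.0.dev0' into a tuple of leading integer parts."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            # Stop at the first non-numeric component (e.g. 'dev0').
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = MIN_TRANSFORMERS) -> bool:
    """Return True when `installed` is at least `minimum` (release parts only)."""
    a, b = parse_release(installed), parse_release(minimum)
    width = max(len(a), len(b))
    # Pad the shorter tuple with zeros so '4.53' compares equal to '4.53.0'.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_transformers_version():
    """Return the installed transformers version string, or None if absent."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_transformers_version()
    if found is None:
        print("transformers is not installed")
    elif meets_minimum(found):
        print(f"transformers {found} should include the SmolLM3 modeling code")
    else:
        print(f"transformers {found} is older than {MIN_TRANSFORMERS}; upgrade first")
```

If the check fails, upgrading with `pip install -U transformers` is the usual route. For strict PEP 440 ordering, including pre-releases, a dedicated version parser would be a better fit than this sketch.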