Update README.md

README.md (changed)

@@ -69,14 +69,14 @@ For local inference, you can use `llama.cpp`, `ONNX`, `MLX` and `MLC`. You can f
 ## Evaluation
 
 In this section, we report the evaluation results of SmolLM3 model. All evaluations are zero-shot unless stated otherwise, and we use [lighteval](https://github.com/huggingface/lighteval) to run them.
 
 We highlight the best score in bold and underline the second-best score.
 
 ### Base Pre-Trained Model
 
 #### English benchmarks
 
-Note: All evaluations are zero-shot unless stated otherwise.
+Note: All evaluations are zero-shot unless stated otherwise. For Ruler 64k evaluation, we apply YaRN to the Qwen models with 32k context to extrapolate the context length.
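For context, a minimal sketch of what the YaRN extrapolation in the added note amounts to, assuming the `rope_scaling` convention used by Hugging Face `transformers` for Qwen-style models (the key names and values here are an illustrative assumption, not taken from this diff):

```python
# Hypothetical sketch (not this repo's actual eval config): the rope_scaling
# entry one would set on a Qwen model whose native context is 32k in order to
# reach the 64k sequence lengths required by the Ruler benchmark via YaRN.
native_ctx = 32_768   # Qwen's pretrained max context length
target_ctx = 65_536   # Ruler 64k evaluation length

rope_scaling = {
    "rope_type": "yarn",
    # Scaling factor: how far beyond the native context to extrapolate (2.0 here).
    "factor": target_ctx / native_ctx,
    # YaRN needs the original training context to compute its interpolation.
    "original_max_position_embeddings": native_ctx,
}

print(rope_scaling)
```

In practice this dictionary would be attached to the model config (e.g. via `AutoConfig`) before loading, so that positions beyond 32k are handled by YaRN's frequency interpolation instead of raw extrapolation.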
 
 | Category | Metric | SmolLM3-3B | Qwen2.5-3B | Llama3-3.2B | Qwen3-1.7B-Base | Qwen3-4B-Base |
 |----------|--------|------------|------------|-------------|-----------------|---------------|