Update README.md
README.md CHANGED
@@ -36,7 +36,6 @@ Advanced, high-quality and **lite** reasoning for a tiny size that you can run o
At original quality, it runs at ~400 tokens/second on a single NVIDIA H100 GPU via Friendli.
Trained similarly to DeepSeek-R1: we used SmolLM2 as the base model and SFT fine-tuned it on reasoning using our own private superthoughts instruct dataset, which includes a mix of code, website generation, day-to-day chats, math, and counting problems. We also modified the tokenizer slightly, and after the SFT fine-tuning we used GRPO to further amplify its mathematics and problem-solving abilities.
-Keep in mind that code generation is very limited; Superthoughts Mini has significantly better code generation.
# Format
```
<|im_start|>user
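The training recipe described in the diff above (SFT on a reasoning dataset, then GRPO) can be outlined with TRL's `GRPOTrainer`. The sketch below is not the authors' code: the model id, dataset, and reward function are placeholder assumptions, since the real run would start from their private SFT checkpoint and score task-specific correctness rather than length.

```python
# Hedged sketch of a GRPO stage with TRL -- not the actual superthoughts recipe.
# Model id, dataset, and reward function below are placeholders.
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Placeholder dataset with a "prompt" column; the private superthoughts
# instruct data is not public.
dataset = load_dataset("trl-lib/tldr", split="train")

def reward_len(completions, **kwargs):
    # Toy reward for illustration only; a real math/problem-solving run would
    # reward answer correctness instead of completion length.
    return [-abs(50 - len(completion)) for completion in completions]

training_args = GRPOConfig(output_dir="smollm2-grpo")
trainer = GRPOTrainer(
    model="HuggingFaceTB/SmolLM2-360M-Instruct",  # stand-in for the SFT checkpoint
    reward_funcs=reward_len,
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```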
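The `# Format` section shown in the diff indicates a ChatML-style prompt (`<|im_start|>user` ...). Below is a minimal usage sketch with transformers; the repo id is a placeholder, and it assumes the tokenizer ships a chat template matching that format.

```python
# Minimal inference sketch assuming a ChatML-style chat template.
# "your-org/superthoughts-lite" is a placeholder repo id, not the real one.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/superthoughts-lite"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [{"role": "user", "content": "What is 17 * 24?"}]
# apply_chat_template builds the <|im_start|>user ... prompt shown above,
# assuming the tokenizer's chat template matches the documented format.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```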