Update README.md
README.md
@@ -23,7 +23,7 @@ Full precision version of Superthoughts lite v2 MoE. 3.91B parameters, 2 experts
 This is the non-experimental version of Superthoughts Lite v2, offering better accuracy on all tasks, better performance, and less looping while generating responses.

 We trained it by first creating a base model for all the experts, which was fine-tuned with GRPO techniques using *Unsloth* on top of meta-llama/Llama-3.2-1B-Instruct.

-After making the base model, we trained each
+After making the base model, we trained each potential expert using SFT. After SFT, we ran GRPO again. In total there are 4 experts:
 - Chat reasoning expert,
 - Math reasoning expert,
 - Code reasoning expert,