SmolGuru-80M-512
This public repo is being prepared for an ~80M-parameter LLaMA-style chatbot that will be trained from scratch and will reuse the HuggingFaceTB/SmolLM2-135M-Instruct tokenizer.
Status: scaffold/config repo created. Trained weights will be uploaded after pretraining and SFT are complete.
Planned training:
- 2B high-grade pretraining tokens
- 75k filtered SFT examples
- 512-token context window
- 256-token answer target
- Final and best checkpoints retained
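To make the size and data targets above concrete, here is a minimal sketch of the arithmetic. It shows one LLaMA-style shape that lands near 80M parameters, and how many 512-token sequences a 2B-token budget yields. Everything in `HYPOTHETICAL_SHAPE` is an assumption for illustration, not this repo's actual config: the layer count, widths, and head counts are invented, tied input/output embeddings are assumed, and the vocabulary size of 49,152 is what the SmolLM2 tokenizer is believed to use and should be checked against the tokenizer itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LlamaShape:
    """Dimensions that determine the parameter count of a LLaMA-style decoder."""
    vocab_size: int
    hidden_size: int
    intermediate_size: int
    num_layers: int
    num_heads: int
    num_kv_heads: int
    tie_embeddings: bool = True


def count_llama_params(s: LlamaShape) -> int:
    """Count weights for a bias-free decoder with grouped-query attention,
    a gated (three-matrix) MLP, and RMSNorm. Rotary embeddings add no weights."""
    head_dim = s.hidden_size // s.num_heads
    kv_dim = head_dim * s.num_kv_heads
    # q_proj and o_proj are hidden x hidden; k_proj and v_proj are hidden x kv_dim.
    attention = 2 * s.hidden_size * s.hidden_size + 2 * s.hidden_size * kv_dim
    # gate_proj, up_proj, down_proj.
    mlp = 3 * s.hidden_size * s.intermediate_size
    # Two RMSNorm weight vectors per layer.
    norms = 2 * s.hidden_size
    embeddings = s.vocab_size * s.hidden_size
    lm_head = 0 if s.tie_embeddings else embeddings
    final_norm = s.hidden_size
    return embeddings + lm_head + s.num_layers * (attention + mlp + norms) + final_norm


def sequences_for_budget(total_tokens: int, context_length: int) -> int:
    """Number of full, packed sequences a token budget provides."""
    return total_tokens // context_length


# Hypothetical shape, chosen only so the total lands near 80M parameters.
HYPOTHETICAL_SHAPE = LlamaShape(
    vocab_size=49_152,       # assumed tokenizer vocabulary size; verify before use
    hidden_size=576,
    intermediate_size=1536,
    num_layers=15,
    num_heads=9,
    num_kv_heads=3,
)

if __name__ == "__main__":
    print(f"parameters: {count_llama_params(HYPOTHETICAL_SHAPE):,}")
    print(f"512-token sequences in 2B tokens: {sequences_for_budget(2_000_000_000, 512):,}")
```

With this assumed shape the count comes to about 81.4M, of which roughly 28M is the embedding table, so at this scale the vocabulary size is a large share of the total. The 2B-token budget corresponds to about 3.9 million packed 512-token sequences.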