nanochat-d20

Training Pipeline

1. Base training: Pretraining on the FineWeb-EDU dataset using the nanochat framework

2. Mid-training: General instruction tuning on SmolTalk, MMLU, GSM8K, and spelling tasks

3. SFT (Supervised Fine-Tuning): Chat-specific training on ARC, GSM8K, and SmolTalk

4. RL (Reinforcement Learning): Optional GRPO-style training on GSM8K (if included); a command sketch follows this list
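
The stages above map onto nanochat's training scripts. The sketch below is a minimal illustration, assuming the script layout of the upstream karpathy/nanochat repo; the GPU count and absence of extra flags are placeholders, not the exact settings used for this model.

```python
# Minimal sketch of the four training stages, assuming the script layout of
# the upstream karpathy/nanochat repo. GPU count and flags are illustrative,
# not the exact settings used to train this model.
import subprocess

STAGES = [
    "scripts.base_train",  # 1. pretraining on FineWeb-EDU
    "scripts.mid_train",   # 2. mid-training (SmolTalk, MMLU, GSM8K, spelling)
    "scripts.chat_sft",    # 3. supervised fine-tuning for chat
    "scripts.chat_rl",     # 4. optional GRPO-style RL on GSM8K
]

for stage in STAGES:
    # torchrun launches one process per GPU; each stage is assumed to pick up
    # the previous stage's checkpoint from its default output directory.
    subprocess.run(
        ["torchrun", "--standalone", "--nproc_per_node=8", "-m", stage],
        check=True,
    )
```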

Repository Structure

β”œβ”€β”€ tokenizer/
β”‚   β”œβ”€β”€ tokenizer.pkl          # Tokenizer
β”‚   └── token_bytes.pt         # Token byte mappings
β”œβ”€β”€ mid_checkpoints/d20/       # Mid-training checkpoint
β”‚   β”œβ”€β”€ model_*.pt
β”‚   └── meta_*.json
β”œβ”€β”€ chatsft_checkpoints/d20/   # SFT checkpoint
β”‚   β”œβ”€β”€ model_*.pt
β”‚   └── meta_*.json
β”œβ”€β”€ chatsft_checkpoints_int8/d20/   # Int8-quantized SFT checkpoint
β”‚   β”œβ”€β”€ model_*.pt
β”‚   └── meta_*.json
β”œβ”€β”€ chatrl_checkpoints/d20/    # RL checkpoint (if available)
β”‚   β”œβ”€β”€ model_*.pt
β”‚   └── meta_*.json
β”œβ”€β”€ report/                    # Evaluation reports
β”‚   └── report.md
└── logs/                      # Training logs
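
To inspect one of these checkpoints locally, the sketch below loads the SFT weights, metadata, and tokenizer. It is a minimal example that assumes only the file layout shown above; the model class that consumes the state dict, and the exact contents of meta_*.json, come from nanochat itself and are not reproduced here.

```python
# Minimal sketch: load the SFT checkpoint files from this repo's layout.
# Assumes only the directory tree above; rebuilding the full model requires
# the nanochat model class and config, which this snippet does not include.
import glob
import json
import pickle

import torch

ckpt_dir = "chatsft_checkpoints/d20"
model_path = sorted(glob.glob(f"{ckpt_dir}/model_*.pt"))[-1]  # latest step
meta_path = sorted(glob.glob(f"{ckpt_dir}/meta_*.json"))[-1]

with open(meta_path) as f:
    meta = json.load(f)  # training/config metadata (e.g. depth, vocab size)

state_dict = torch.load(model_path, map_location="cpu")

with open("tokenizer/tokenizer.pkl", "rb") as f:
    tokenizer = pickle.load(f)  # serialized tokenizer object

print(meta)
print(f"{len(state_dict)} entries loaded from {model_path}")
```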

License

MIT License (same as nanochat)

Acknowledgments

This model is built with nanochat by Andrej Karpathy:

@misc{nanochat,
  author = {Andrej Karpathy},
  title = {nanochat: The best ChatGPT that $100 can buy},
  year = {2025},
  publisher = {GitHub},
  url = {https://github.com/karpathy/nanochat}
}
Thanks also to the nanochat community.