Add 2MB val text sample for Gemma/HF tokenizer notebooks 588ff7f verified LisaMegaWatts committed 2 days ago
Add 20MB raw text sample for Gemma/HF tokenizer notebooks d9422a5 verified LisaMegaWatts committed 2 days ago
Distillation test winner (scratch, PPL=43.9, 5M params) 49d0c4c verified LisaMegaWatts committed 2 days ago
Upload SymbioGPT-10M teacher (val_ppl=35.3, 13400 steps, A100) 06b4943 verified LisaMegaWatts committed 2 days ago
Add curated training tokens (266M tokens, Chinchilla-optimal) ab3f5b8 verified LisaMegaWatts committed 3 days ago