Baby Pythias
Code used for training: https://github.com/rahasgit/baby_pythia_training
A 160M-parameter Pythia model trained on 12M tokens (circa 10M words) of spontaneous speech.
Training loss: 2.688
Validation loss: 2.54
Test loss: 2.497
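
The losses above may be easier to compare with other language models when expressed as perplexity. The sketch below assumes each figure is a mean per-token cross-entropy in nats (natural logarithm), which the card does not state explicitly; if the losses were computed differently, the conversion does not apply. The names `REPORTED_LOSSES`, `perplexity`, and `bits_per_token` are illustrative and do not come from the training repository.

```python
import math

# Losses as reported on this card.
# Assumption: each value is a mean per-token cross-entropy in nats.
REPORTED_LOSSES = {
    "training": 2.688,
    "validation": 2.54,
    "test": 2.497,
}


def perplexity(loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(loss_nats)


def bits_per_token(loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy from nats to bits."""
    return loss_nats / math.log(2)


if __name__ == "__main__":
    for split, loss in REPORTED_LOSSES.items():
        print(
            f"{split}: loss={loss:.3f} "
            f"perplexity={perplexity(loss):.2f} "
            f"bits/token={bits_per_token(loss):.3f}"
        )
```

Under that assumption, a lower loss maps to a lower perplexity, so the ordering of the three splits is unchanged by the conversion.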