Baby Pythias
Code used for training: https://github.com/rahasgit/baby_pythia_training
A 160M-parameter Pythia model trained on 12M tokens (approximately 10M words) of staged dialogs.
Training loss: 2.6503
Validation loss: 2.5354
Test loss: 2.5394
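The losses above can be read as perplexities, which some readers find easier to compare. The sketch below assumes the reported values are mean per-token cross-entropy in nats (natural log), which is the usual convention for causal language model training but is not stated in this card; the names `reported_losses` and `perplexity` are illustrative only.

```python
import math

# Losses as reported in this card.
# Assumption: mean per-token cross-entropy measured in nats.
reported_losses = {
    "training": 2.6503,
    "validation": 2.5354,
    "test": 2.5394,
}


def perplexity(cross_entropy_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(cross_entropy_nats)


# Perplexity for each split, keyed the same way as the losses.
perplexities = {split: perplexity(loss) for split, loss in reported_losses.items()}

for split, ppl in perplexities.items():
    print(f"{split}: loss={reported_losses[split]:.4f}  perplexity={ppl:.2f}")
```

If the losses were instead logged in bits, the conversion would be `2 ** loss` rather than `math.exp(loss)`.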