Baby Pythias
Collection
Code used for training: https://github.com/rahasgit/baby_pythia_training
8 items
A 160M-parameter Pythia model trained on 12M tokens (approximately 10M words) of staged dialogs.
Training loss: 2.6503
Validation loss: 2.5354
Test loss: 2.5394
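The reported losses can be easier to compare across models when expressed as perplexity. The sketch below assumes the figures are mean per-token cross-entropy values in nats, which is the usual convention for causal language model training but is not stated in the listing. The helper name `perplexity` and the `losses` dictionary are illustrative, not taken from the training repository. The last line derives the tokens-per-word ratio implied by the approximate 12M-token and 10M-word figures.

```python
import math


def perplexity(mean_nll_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_nll_nats)


# Loss values as reported in the listing; the unit (nats) is an assumption.
losses = {"train": 2.6503, "validation": 2.5354, "test": 2.5394}

perplexities = {split: perplexity(value) for split, value in losses.items()}

for split, ppl in perplexities.items():
    print(f"{split}: loss={losses[split]:.4f} perplexity={ppl:.2f}")

# Rough tokens-per-word ratio implied by the approximate corpus sizes.
tokens_per_word = 12_000_000 / 10_000_000
print(f"tokens per word (approx.): {tokens_per_word:.1f}")
```

If the losses were instead logged in bits per token, the conversion would use a base of 2 rather than e.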