Baby Pythias
Collection
Code used for training: https://github.com/rahasgit/baby_pythia_training
8 items
A 160M-parameter Pythia model trained on 12M tokens (about 10M words) of non-dialog text, spontaneous speech, and staged dialogs.
Training loss: 3.22. Validation loss: 3.16. Test loss: 3.16.
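To make the reported losses easier to compare with other language models, they can be converted to perplexity. The sketch below assumes the losses are mean per-token cross-entropy values in nats (natural logarithm), which is a common convention but is not stated in the description; if they were computed in another unit, the numbers would differ. The names `reported_losses`, `perplexity`, and `bits_per_token` are illustrative and do not come from the training code.

```python
import math

# Losses as reported in the description above.
reported_losses = {"train": 3.22, "validation": 3.16, "test": 3.16}


def perplexity(loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy (ASSUMED to be in nats) to perplexity."""
    return math.exp(loss_nats)


def bits_per_token(loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy in nats to bits per token."""
    return loss_nats / math.log(2)


# Perplexity for each split, under the nats assumption.
perplexities = {split: perplexity(loss) for split, loss in reported_losses.items()}

if __name__ == "__main__":
    for split, loss in reported_losses.items():
        print(
            f"{split}: loss={loss:.2f}, "
            f"perplexity={perplexities[split]:.2f}, "
            f"bits/token={bits_per_token(loss):.2f}"
        )
```

Under that assumption, the validation and test losses of 3.16 correspond to a per-token perplexity of roughly 23.6, and the training loss of 3.22 to roughly 25.0.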