# Kite

> A newer version of this model is available: **qikp/kite-2.5-13m**.
🎉 You are looking at Kite 2, which is now trained on HuggingFaceTB/cosmopedia-100k and includes many other changes!
Kite is a small 13-million-parameter language model, trained without any special optimizations.
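
Since Kite is a standard causal language model that uses the GPT-2 tokenizer, it should load with the usual `transformers` auto classes. Below is a minimal generation sketch; the repo id `qikp/kite` is an assumption, so substitute the actual model id:

```python
# Minimal generation sketch. The repo id below is an assumption;
# replace it with the actual Hugging Face model id for this card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "qikp/kite"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
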
## Training
It was trained on HuggingFaceTB/cosmopedia-100k for 12,500 steps (1 epoch) with a batch size of 4, a learning rate of 5e-4, and the GPT-2 tokenizer.
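
For reference, those hyperparameters map onto a standard `transformers` training setup roughly as follows. Only the dataset, tokenizer, step count, epoch count, batch size, and learning rate come from this card; the architecture config and sequence length are assumptions chosen to land near 13M parameters:

```python
# Rough training sketch matching the reported hyperparameters.
# The model config and max sequence length are assumptions; the
# dataset, tokenizer, steps, epochs, batch size, and learning rate
# are taken from the card.
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    GPT2Config,
    GPT2LMHeadModel,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token

# Small GPT-2-style config sized to land near ~13M parameters (assumption).
config = GPT2Config(
    vocab_size=tokenizer.vocab_size,
    n_positions=512,
    n_embd=192,
    n_layer=6,
    n_head=6,
)
model = GPT2LMHeadModel(config)

dataset = load_dataset("HuggingFaceTB/cosmopedia-100k", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

args = TrainingArguments(
    output_dir="kite",
    num_train_epochs=1,
    max_steps=12_500,  # max_steps takes precedence over epochs
    per_device_train_batch_size=4,
    learning_rate=5e-4,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```
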
## Limitations
Due to its size, the model is not suitable for production workloads.
## Loss

*(Training loss curve plot.)*
