A newer version of this model is available: qikp/kite-2.5-13m

Kite

🎉 You are looking at Kite 2, which is now trained on HuggingFaceTB/cosmopedia-100k and incorporates many other changes!

Kite is a small 13-million-parameter language model, trained without any special optimizations.
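A minimal generation sketch, assuming the checkpoint loads through transformers' `AutoModelForCausalLM` (the card does not document the architecture, so the model class and the prompt below are assumptions):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the checkpoint is a standard causal LM compatible with
# AutoModelForCausalLM; the card itself does not confirm this.
tokenizer = AutoTokenizer.from_pretrained("qikp/kite-2-13m")
model = AutoModelForCausalLM.from_pretrained("qikp/kite-2-13m")

inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=True)
text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(text)
```

Given the model's size, expect the continuation to be simple and often incoherent.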

Training

It was trained on HuggingFaceTB/cosmopedia-100k for 1 epoch (12,500 steps) with a batch size of 4, a learning rate of 5e-4, and the GPT-2 tokenizer.
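As a quick sanity check on those numbers, 12,500 steps at batch size 4 means the model saw 50,000 sequences over its single epoch:

```python
steps = 12_500
batch_size = 4
epochs = 1

# Total sequences processed during training, from the hyperparameters above.
sequences_seen = steps * batch_size * epochs
print(sequences_seen)  # 50000
```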

Limitations

Due to its size, the model is not suitable for production workloads.

Loss

Model size: 14.2M parameters (F32 tensors, Safetensors format)

Dataset used to train qikp/kite-2-13m: HuggingFaceTB/cosmopedia-100k