Text Generation
Transformers
Safetensors
English
gpt2
conversational
text-generation-inference
A newer version of this model is available: qikp/hummingbird-3-110m

Hummingbird

🎉 You are looking at Hummingbird 2, trained on a much more efficient corpus, achieving similar performance with 3x less parameters!

Hummingbird is a GPT-2 derivative trained to be conversational.

Training

The model was trained using the paged_adamw_8bit optimizer, gradient checkpointing, 500 steps, 1 batch size, and 4 gradient accumulation steps.

Datasets

The training corpus is made up of:

The train / train_sft splits were used.

Chat template

The Zephyr chat template was used.

Limitations

The model frequently outputs incorrect information, confirmation with a larger, mature model is advised.

Benchmark

This model was tested against GAIA and compared using embeddings. See the results here.

Downloads last month
22
Safetensors
Model size
0.1B params
Tensor type
F32
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for qikp/hummingbird-2-125m

Finetuned
(2128)
this model
Quantizations
2 models

Datasets used to train qikp/hummingbird-2-125m