A newer version of this model is available: qikp/hummingbird-3-110m
Hummingbird
🎉 You are looking at Hummingbird 2.5, which is largely trained on a subset of Anthropic's hh-rlhf dataset instead!
Hummingbird is a Cerebras-GPT derivative trained to be conversational.
Training
The model was trained with the paged_adamw_8bit optimizer and gradient checkpointing, for 500 steps with a batch size of 1 and 4 gradient accumulation steps.
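The hyperparameters above map onto a `transformers` `TrainingArguments` configuration roughly as follows. This is a sketch, not the actual training script; the output directory is a placeholder, and all other settings besides the ones listed above are left at their defaults.

```python
from transformers import TrainingArguments

# Sketch of the reported hyperparameters; "out" is a placeholder directory.
args = TrainingArguments(
    output_dir="out",
    optim="paged_adamw_8bit",        # 8-bit paged AdamW (requires bitsandbytes)
    gradient_checkpointing=True,     # trade compute for memory
    max_steps=500,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,   # effective batch size of 4
)
```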
Datasets
The training corpus is made up of:
- First 1500 rows of qikp/aninsthro (a collation of a subset of Anthropic's hh-rlhf dataset)
- First 500 rows of HuggingFaceTB/everyday-conversations-llama3.1-2k

The train and train_sft splits were used, respectively.
Chat template
The Zephyr chat template was used.
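In the Zephyr format, each turn is wrapped in a `<|role|>` header and terminated with `</s>`, and a bare `<|assistant|>` header is appended to prompt generation. A minimal sketch of the rendering (in practice the tokenizer's built-in `apply_chat_template` does this; the helper name here is ours):

```python
def apply_zephyr_template(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts in the Zephyr chat format."""
    out = ""
    for m in messages:
        # Each turn: "<|role|>\n" + content + end-of-sequence token + newline.
        out += f"<|{m['role']}|>\n{m['content']}</s>\n"
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        out += "<|assistant|>\n"
    return out

prompt = apply_zephyr_template([{"role": "user", "content": "Hi"}])
```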
Limitations
The model frequently outputs incorrect information; verifying its answers with a larger, more mature model is advised.
Benchmark
This model was benchmarked and compared using embeddings. See the results here.
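Embedding-based comparison of this kind typically scores a model's outputs against reference answers by the cosine similarity of their embedding vectors. A minimal sketch of the scoring step, assuming the answers have already been embedded (the vectors below are placeholders, not the benchmark's actual embeddings):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Placeholder embeddings standing in for encoded model and reference answers.
model_vec = np.array([0.1, 0.7, 0.2])
reference_vec = np.array([0.1, 0.6, 0.3])
score = cosine_similarity(model_vec, reference_vec)  # closer to 1.0 = more similar
```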