|
|
--- |
|
|
library_name: transformers |
|
|
tags: [] |
|
|
--- |
|
|
|
|
|
<p align="center"> |
|
|
<img src="figs/bonsai.png" width="200" alt="Bonsai Logo"> |
|
|
|
|
|
<h3 align="center" style="font-size: 30px">Bonsai: A Small Ternary-Weight Language Model</h3> |
|
|
</p> |
|
|
|
|
|
## Model Details |
|
|
|
|
|
### Model Description |
|
|
|
|
|
<!-- Provide a longer summary of what this model is. --> |
|
|
|
|
|
Bonsai is a small 500 million parameter ternary weight language model trained by deepgrove. Bonsai adopts the Llama architecture and Mistral tokenizer following [Danube 3](https://arxiv.org/pdf/2407.09276v1), with modified linear layers to support ternary weights. The model has been trained primarily using DCLM-Pro and Fineweb-Edu. Bonsai marks a new paradigm of efficiency, being trained in less than 5 billion tokens. |
|
|
|
|
|
- **Developed by:** deepgrove |
|
|
- **Language(s) (NLP):** English |
|
|
- **License:** Apache-2 |
|
|
- **Repository:** https://github.com/deepgrove-ai/Bonsai |
|
|
- **Paper:** https://github.com/deepgrove-ai/Bonsai/tree/main/paper/Bonsai.pdf |
|
|
|
|
|
## Usage |
|
|
|
|
|
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. --> |
|
|
|
|
|
Bonsai can be easily used through the Huggingface Transformers library. However, we note that all operations are currently performed in 16 bit precision; we're currently working towards integrating our model design with custom mixed precision kernels. A quick example follows: |
|
|
|
|
|
```{python} |
|
|
from transformers import AutoTokenizer, AutoModelForCausalLM |
|
|
|
|
|
tokenizer = AutoTokenizer.from_pretrained("deepgrove/Bonsai", trust_remote_code=True) |
|
|
model = AutoModelForCausalLM.from_pretrained("deepgrove/Bonsai", trust_remote_code=True) |
|
|
text = "What is the capital of France?" |
|
|
inputs = tokenizer(text, return_tensors="pt") |
|
|
outputs = model.generate(**inputs, max_length=100) |
|
|
print(tokenizer.decode(outputs[0], skip_special_tokens=True)) |
|
|
``` |
|
|
We note that Bonsai is not instruction tuned; we highly recommend finetuning the model before usage in a downstream task. |
|
|
|
|
|
## Evaluation |
|
|
|
|
|
<!-- This section describes the evaluation protocols and provides the results. --> |
|
|
|
|
|
Bonsai achieves competitive performance among its peers, being one of the first ternary models to do so. Evalution results are below; for more detailed results and comparisons to other ternary models, please see the accompanying paper linked above. We use lm-eval for all benchmarks outside of MMLU and lighteval's cloze formulation for MMLU. |
|
|
|
|
|
<div align="center"> |
|
|
|
|
|
| Model | ARC-c | ARC-e | HS. | OBQA | PiQA | Wino. | MMLU | Avg | |
|
|
|-------|--------|--------|------|-------|-------|--------|-------|-----| |
|
|
| MobiLlama 0.5B | 26.62 | 46.68 | 51.66 | 30.00 | 71.65 | 54.50 | 28.61 | 44.25 | |
|
|
| Qwen 2 0.5B | 28.84 | 50.29 | 49.12 | 33.00 | 69.26 | 56.99 | 31.78 | 45.61 | |
|
|
| MobileLLM 600M | 29.01 | 56.65 | 55.35 | 34.00 | 71.65 | 59.75 | 31.40 | 48.13 | |
|
|
| Qwen 2.5 0.5B | 32.25 | 58.29 | 52.18 | 35.40 | 69.91 | 56.12 | 33.40 | 48.22 | |
|
|
| **Bonsai** | 33.36 | 57.95 | 48.04 | 34.00 | 70.24 | 54.85 | 30.28 | 46.96 | |