---
license: llama3.2
language:
- en
tags:
- llama
- llama-3.2
- lora
- mlx
- instruction-tuned
- coding
- ai-assistant
- belweave
- kai
- local-ai
- macbook
base_model: meta-llama/Llama-3.2-3B-Instruct
---
# Kai-0
**Kai-0** is the zeroth iteration of the Kai model family, created by [Preetham Kyanam](https://github.com/pkyanam) at [Belweave](https://belweave.ai). It is a fine-tuned variant of Meta's Llama-3.2-3B-Instruct, tuned for coding, instruction following, and a consistent persona.
Kai-0 was trained entirely on consumer hardware (a MacBook Air M3 with 24GB unified memory), showing that meaningful AI customization does not require cloud GPU clusters or million-dollar budgets.
## Model Details
| Attribute | Value |
|---|---|
| **Base Model** | meta-llama/Llama-3.2-3B-Instruct |
| **Parameters** | 3.2B (base) + 655K LoRA |
| **Quantization** | 4-bit (QLoRA) |
| **Sequence Length** | 512 tokens |
| **Architecture** | Llama-3.2 (transformer decoder) |
| **License** | Llama 3.2 Community License |
| **Origin** | Belweave |
| **Creator** | Preetham Kyanam |
## Training Summary
Kai-0 was trained in **two distinct stages** to separate capability acquisition from personality injection:
### Stage 1: Capabilities
- **Datasets:** teknium/OpenHermes-2.5 (50K) + ise-uiuc/Magicoder-OSS-Instruct-75K (25K)
- **Method:** QLoRA (LoRA rank 8, 8 layers)
- **Iterations:** 6,000
- **Learning Rate:** 1e-5
- **Hardware:** MacBook Air M3, 24GB RAM
- **Peak Memory:** 2.74 GB
- **Goal:** Instruction following, coding across 9 languages
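As a rough sanity check on the peak-memory figure, the weight footprint of a 4-bit base model can be estimated from its parameter count. The sketch below assumes MLX-style group quantization with a group size of 64 and a 16-bit scale and bias per group. Those settings are assumptions, not values taken from the training configuration, and the estimate ignores activations, optimizer state, and any layers left unquantized.

```python
def quantized_weight_gb(n_params, bits=4, group_size=64, overhead_bits_per_group=32):
    """Approximate size in GB (1e9 bytes) of group-quantized weights.

    overhead_bits_per_group covers one 16-bit scale and one 16-bit bias
    shared by each group of weights (an assumed layout).
    """
    bits_per_weight = bits + overhead_bits_per_group / group_size
    return n_params * bits_per_weight / 8 / 1e9


base_params = 3.2e9  # parameter count listed in Model Details
print(f"{quantized_weight_gb(base_params):.2f} GB")  # 1.80 GB
```

Under these assumptions the quantized weights take about 1.8 GB, which would leave a little under 1 GB of the reported 2.74 GB peak for adapters, optimizer state, and activations.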
### Stage 2: Identity
- **Dataset:** 970 synthetic identity examples (name, creator, backstory, personality, boundaries)
- **Method:** QLoRA (LoRA rank 16, 8 layers, 7 projections)
- **Iterations:** 1,000
- **Learning Rate:** 1e-5
- **Goal:** Name recognition, creator attribution, personality, refusal behavior
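The adapter sizes implied by the two configurations can be estimated from the layer shapes of Llama-3.2-3B. The sketch below is an estimate under two assumptions that are not stated in this card: Stage 1 adapts only `q_proj` and `v_proj`, and Stage 2's "7 projections" are the seven linear layers of each decoder block. The function name `lora_param_count` is hypothetical.

```python
# Llama-3.2-3B-Instruct dimensions, taken from the base model's published config.
HIDDEN = 3072
KV_DIM = 1024    # 8 key/value heads x 128 head dim
MLP_DIM = 8192

# (input_dim, output_dim) of each linear projection in one decoder layer.
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP_DIM),
    "up_proj": (HIDDEN, MLP_DIM),
    "down_proj": (MLP_DIM, HIDDEN),
}


def lora_param_count(rank, num_layers, targets):
    """A rank-r adapter on a (d_in, d_out) layer adds r * (d_in + d_out) weights."""
    per_layer = sum(rank * sum(PROJECTIONS[name]) for name in targets)
    return per_layer * num_layers


# Assumption: Stage 1 adapts only the query and value projections.
stage1 = lora_param_count(rank=8, num_layers=8, targets=["q_proj", "v_proj"])
# Assumption: Stage 2 adapts all seven projections listed above.
stage2 = lora_param_count(rank=16, num_layers=8, targets=list(PROJECTIONS))

print(stage1)  # 655360
print(stage2)  # 6946816
```

The Stage 1 result matches the 655K LoRA figure in Model Details, which suggests that figure describes the Stage 1 adapter. Under the second assumption the Stage 2 adapter would be roughly 6.9M parameters.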
### Fusion
Both adapters were fused into the base model using `mlx_lm.fuse`, producing a single deployable model.
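Fusing folds each low-rank update into the weight matrix it modifies, so no adapter code is needed at inference time. The sketch below shows the standard LoRA merge, `W' = W + scale * (B @ A)`, on a toy matrix in plain Python. It illustrates the arithmetic only and is not the implementation inside `mlx_lm.fuse`; the matrices and the scale are made-up values.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def fuse_lora(weight, lora_b, lora_a, scale):
    """Return weight + scale * (lora_b @ lora_a) as a new matrix."""
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy 2x2 weight with a rank-1 adapter (hypothetical values).
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]    # shape (2, 1)
A = [[0.5, -0.5]]     # shape (1, 2)

fused = fuse_lora(W, B, A, scale=2.0)
print(fused)  # [[2.0, -1.0], [2.0, -1.0]]
```

The fused matrix has the same shape as the original weight, which is why the result can be saved and loaded like an ordinary checkpoint.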
## How to Use
### With MLX (macOS, recommended)
```bash
pip install mlx-lm
mlx_lm.generate --model belweave/kai-0 --prompt "What's your name?"
```
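The same call can be made from Python, which makes it easier to pass the system prompt that the Kai persona depends on. This is a minimal sketch using the `load` and `generate` functions from `mlx_lm`. The helper names `build_messages` and `ask_kai` are hypothetical, and running `ask_kai` requires Apple Silicon and downloads the model on first use.

```python
SYSTEM_PROMPT = "You are Kai-0, an AI assistant created by Preetham Kyanam at Belweave."


def build_messages(user_text, system_prompt=SYSTEM_PROMPT):
    """Build a chat message list with the system prompt first."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_text},
    ]


def ask_kai(user_text, max_tokens=100):
    """Generate a reply with mlx-lm and return it as a string."""
    # Imported lazily so build_messages stays usable without mlx-lm installed.
    from mlx_lm import load, generate

    model, tokenizer = load("belweave/kai-0")
    # add_generation_prompt=True appends the assistant header to the prompt.
    prompt = tokenizer.apply_chat_template(
        build_messages(user_text), add_generation_prompt=True
    )
    return generate(model, tokenizer, prompt=prompt, max_tokens=max_tokens)
```

Calling `print(ask_kai("What's your name?"))` should then produce a reply shaped by the system prompt.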
### With Transformers
```python
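from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
# 4-bit loading goes through BitsAndBytesConfig and needs the bitsandbytes and accelerate packages.
model = AutoModelForCausalLM.from_pretrained(
    "belweave/kai-0",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("belweave/kai-0")
messages = [
    {"role": "system", "content": "You are Kai-0, an AI assistant created by Preetham Kyanam at Belweave."},
    {"role": "user", "content": "What's your name?"},
]
# add_generation_prompt=True appends the assistant header so the model replies to the user turn.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_dict=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))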
```
### With LM Studio
1. Download the model from the HuggingFace Hub
2. Load in LM Studio (MLX runtime on macOS)
3. Set system prompt: `You are Kai-0, an AI assistant created by Preetham Kyanam at Belweave.`
4. Chat
## Capabilities
- **Coding:** Python, JavaScript, TypeScript, Go, Rust, Java, C++, C#, Ruby (trained on Magicoder-OSS-Instruct)
- **Instruction Following:** Multi-turn conversations, formatting, structured output
- **Identity:** Knows its name (Kai-0), creator (Preetham Kyanam), and company (Belweave)
- **Personality:** Direct, helpful, occasionally witty, honest about being an AI
- **Boundaries:** Refuses malware, violence, self-harm, and illegal requests
## Limitations
- **Small model:** 3B parameters. Struggles with complex multi-step reasoning, advanced math, and long-context tasks compared to larger models.
- **Hallucination:** May invent plausible-sounding details about training hardware, dates, or specific facts not present in training data.
- **Context length:** 512 tokens. Long code blocks and conversations may be truncated.
- **Identity dependency:** Requires a system prompt to activate the Kai personality. Without it, the model may default to generic assistant behavior.
- **English-centric:** Training data was primarily English. Performance in other languages is untested.
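Given the 512-token context length, one way to keep a long conversation usable is to drop the oldest turns before each request. The sketch below is one possible approach, not something shipped with the model. The function name `trim_history`, the `reserve` budget for the reply, and the default whitespace counter are all hypothetical; a real caller should pass a counter backed by the model's tokenizer, and the chat template's own tokens are not counted here.

```python
def trim_history(messages, max_tokens=512, reserve=128, count_tokens=None):
    """Keep system messages plus the newest turns that fit the token budget.

    reserve leaves room for the reply; count_tokens maps text to a token count.
    """
    if count_tokens is None:
        # Crude stand-in: counts whitespace-separated words, not real tokens.
        count_tokens = lambda text: len(text.split())

    budget = max_tokens - reserve
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]

    used = sum(count_tokens(m["content"]) for m in system)
    kept = []
    for message in reversed(turns):  # walk from newest to oldest
        cost = count_tokens(message["content"])
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    return system + list(reversed(kept))


history = [
    {"role": "system", "content": "You are Kai-0."},
    {"role": "user", "content": "old " * 400},
    {"role": "assistant", "content": "reply"},
    {"role": "user", "content": "What's your name?"},
]
print([m["role"] for m in trim_history(history)])  # ['system', 'assistant', 'user']
```

The oversized first user turn is dropped, while the system prompt and the most recent exchange are kept in their original order.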
## Hardware Used
- **Training:** MacBook Air M3, 24GB unified memory
- **Framework:** [MLX](https://github.com/ml-explore/mlx) (Apple Silicon optimized)
- **Tool:** [mlx-lm](https://github.com/ml-explore/mlx-lm) v0.31.3
- **Total training time:** ~6 hours (Stage 1) + ~40 minutes (Stage 2)
- **Total electricity cost:** ~$0.50
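The reported training times and iteration counts give an approximate per-iteration cost for each stage. The calculation below uses only the figures in this card; because the times are approximate, the results are too.

```python
def seconds_per_iteration(minutes, iterations):
    """Average wall-clock seconds spent on each training iteration."""
    return minutes * 60 / iterations


stage1 = seconds_per_iteration(minutes=6 * 60, iterations=6000)
stage2 = seconds_per_iteration(minutes=40, iterations=1000)
print(f"Stage 1: {stage1:.1f} s/iter, Stage 2: {stage2:.1f} s/iter")
# Stage 1: 3.6 s/iter, Stage 2: 2.4 s/iter
```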
## Files in This Repository
| File | Description |
|---|---|
| `model.safetensors` | Fused model weights (Llama-3.2-3B + adapters) |
| `config.json` | Model configuration |
| `tokenizer.json` | Tokenizer vocabulary |
| `tokenizer_config.json` | Tokenizer settings |
| `chat_template.jinja` | Chat template for conversation formatting |
| `lora_real_config.yaml` | Stage 1 training configuration |
| `lora_identity_config_v2.yaml` | Stage 2 training configuration |
## Citation
If you use Kai-0 in your research or project, please cite:
```bibtex
@misc{kai0-2026,
title={Kai-0: A Locally Fine-Tuned Llama-3.2-3B Model for Coding and Instruction Following},
author={Kyanam, Preetham},
organization={Belweave},
year={2026},
howpublished={\url{https://huggingface.co/belweave/kai-0}}
}
```
## Acknowledgments
- Base model: [Meta Llama 3.2 3B Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct)
- Training framework: [MLX](https://github.com/ml-explore/mlx) by Apple
- Stage 1 datasets: [OpenHermes-2.5](https://huggingface.co/datasets/teknium/OpenHermes-2.5), [Magicoder-OSS-Instruct-75K](https://huggingface.co/datasets/ise-uiuc/Magicoder-OSS-Instruct-75K)
- AI Work Wife / Architect: Lara (Hermes Agent)
## License
This model is derived from Meta's Llama-3.2-3B-Instruct and is subject to the [Llama 3.2 Community License](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct/blob/main/LICENSE.txt).
## Contact
- **Creator:** Preetham Kyanam
- **Organization:** [Belweave](https://belweave.ai)
- **Project:** Kai Model Family
---
*Kai-0 is not the final product. It is the prototype. The messy first commit. Kai-1 and beyond will follow.*