	---
	license: llama3.2
	language:
	- en
	tags:
	- llama
	- llama-3.2
	- lora
	- mlx
	- instruction-tuned
	- coding
	- ai-assistant
	- belweave
	- kai
	- local-ai
	- macbook
	base_model: meta-llama/Llama-3.2-3B-Instruct
	---

	# Kai-0

	Kai-0 is the zeroth iteration of the Kai model family, created by [Preetham Kyanam](https://github.com/pkyanam) at [Belweave](https://belweave.ai). It is a fine-tuned variant of Meta's Llama-3.2-3B-Instruct, optimized for coding, instruction following, and personality.

	Kai-0 was trained entirely on consumer hardware (a MacBook Air M3 with 24 GB of unified memory), showing that meaningful model customization does not require cloud GPU clusters or million-dollar budgets.

	## Model Details

	| Attribute | Value |
	|---|---|
	| Base Model | meta-llama/Llama-3.2-3B-Instruct |
	| Parameters | 3.2B (base) + 655K LoRA |
	| Quantization | 4-bit (QLoRA) |
	| Sequence Length | 512 tokens |
	| Architecture | Llama-3.2 (transformer decoder) |
	| License | Llama 3.2 Community License |
	| Origin | Belweave |
	| Creator | Preetham Kyanam |

	## Training Summary

	Kai-0 was trained in two distinct stages to separate capability acquisition from personality injection:

	### Stage 1: Capabilities
	- Datasets: teknium/OpenHermes-2.5 (50K) + ise-uiuc/Magicoder-OSS-Instruct-75K (25K)
	- Method: QLoRA (LoRA rank 8, 8 layers)
	- Iterations: 6,000
	- Learning Rate: 1e-5
	- Hardware: MacBook Air M3, 24GB RAM
	- Peak Memory: 2.74 GB
	- Goal: Instruction following, coding across 9 languages
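This README does not describe how the two Stage 1 datasets were combined. As a rough sketch only, assuming the `conversations` / `from` / `value` layout of OpenHermes-2.5 and the `problem` / `solution` fields of Magicoder-OSS-Instruct-75K, records from both sources could be mapped to the `{"messages": [...]}` chat format that mlx-lm accepts for fine-tuning. The function names and sample records below are hypothetical and are not taken from the actual training scripts.

```python
import json

# Role names used in OpenHermes-style records, mapped to chat-template roles.
ROLE_MAP = {"system": "system", "human": "user", "gpt": "assistant"}

def from_openhermes(record):
    """Convert one OpenHermes-style record to a chat-format example."""
    messages = [
        {"role": ROLE_MAP[turn["from"]], "content": turn["value"]}
        for turn in record["conversations"]
    ]
    return {"messages": messages}

def from_magicoder(record):
    """Convert one Magicoder-style record (problem/solution) to chat format."""
    return {
        "messages": [
            {"role": "user", "content": record["problem"]},
            {"role": "assistant", "content": record["solution"]},
        ]
    }

def to_jsonl(examples):
    """Serialize examples as JSON Lines, one training example per line."""
    return "\n".join(json.dumps(ex) for ex in examples)

# Tiny made-up records, only to show the shapes involved.
hermes = {"conversations": [{"from": "human", "value": "Hi"},
                            {"from": "gpt", "value": "Hello!"}]}
coder = {"problem": "Reverse a string in Python.", "solution": "s[::-1]"}

mixed = [from_openhermes(hermes), from_magicoder(coder)]
jsonl_text = to_jsonl(mixed)
```

Writing `jsonl_text` to a `train.jsonl` file would give one chat example per line; sampling 50K and 25K records from each source is left out of the sketch.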

	### Stage 2: Identity
	- Dataset: 970 synthetic identity examples (name, creator, backstory, personality, boundaries)
	- Method: QLoRA (LoRA rank 16, 8 layers, 7 projections)
	- Iterations: 1,000
	- Learning Rate: 1e-5
	- Goal: Name recognition, creator attribution, personality, refusal behavior
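The adapter sizes implied by the two configurations can be estimated with a short calculation. A LoRA adapter on a projection of shape `(d_in, d_out)` adds `rank * (d_in + d_out)` trainable weights. The sketch below uses the projection shapes of the Llama-3.2-3B base model. It assumes that Stage 1 adapted only `q_proj` and `v_proj` (the README does not say which projections were used, but this choice reproduces the 655K figure) and that the "7 projections" of Stage 2 are the four attention and three MLP projections. The Stage 2 total is an estimate, not a number reported by the author.

```python
# Llama-3.2-3B projection shapes (input_dim, output_dim), taken from the
# published base-model config: hidden 3072, KV width 1024, MLP width 8192.
PROJECTIONS = {
    "q_proj": (3072, 3072),
    "k_proj": (3072, 1024),
    "v_proj": (3072, 1024),
    "o_proj": (3072, 3072),
    "gate_proj": (3072, 8192),
    "up_proj": (3072, 8192),
    "down_proj": (8192, 3072),
}

def lora_params(rank, num_layers, targets):
    """Count trainable LoRA weights: each adapted (d_in, d_out) projection
    adds A (d_in x rank) and B (rank x d_out), i.e. rank * (d_in + d_out)."""
    per_layer = sum(rank * (d_in + d_out)
                    for d_in, d_out in (PROJECTIONS[t] for t in targets))
    return per_layer * num_layers

# Assumption: Stage 1 adapted only q_proj and v_proj.
stage1 = lora_params(rank=8, num_layers=8, targets=["q_proj", "v_proj"])
# Assumption: Stage 2's seven projections are all entries of PROJECTIONS.
stage2 = lora_params(rank=16, num_layers=8, targets=list(PROJECTIONS))

print(stage1)  # 655360
print(stage2)  # 6946816
```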

	### Fusion
	Both adapters were fused into the base model using `mlx_lm.fuse`, producing a single deployable model.
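Conceptually, fusing folds each adapter's low-rank update into the base weight, so that inference no longer needs the adapter matrices. The toy sketch below illustrates that arithmetic with plain Python lists. It is not the `mlx_lm.fuse` implementation, the numbers are made up, and applying the two adapters one after the other is an assumption about how "both adapters" were combined.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add_scaled(w, delta, scale):
    """Return w + scale * delta, element by element."""
    return [[wi + scale * di for wi, di in zip(wr, dr)] for wr, dr in zip(w, delta)]

def fuse(w, lora_a, lora_b, scale):
    """Fold a LoRA update into the base weight: W' = W + scale * (A @ B)."""
    return add_scaled(w, matmul(lora_a, lora_b), scale)

# Toy 2x2 base weight with two rank-1 adapters (made-up numbers).
W = [[1.0, 0.0], [0.0, 1.0]]
A1, B1 = [[1.0], [2.0]], [[0.5, 0.0]]   # stands in for the Stage 1 adapter
A2, B2 = [[0.0], [1.0]], [[0.0, 2.0]]   # stands in for the Stage 2 adapter

fused = fuse(fuse(W, A1, B1, scale=2.0), A2, B2, scale=1.0)

x = [[3.0, 4.0]]  # one input row vector
y_fused = matmul(x, fused)
# Unfused path: base output plus each adapter's scaled low-rank correction.
y_adapters = add_scaled(
    add_scaled(matmul(x, W), matmul(matmul(x, A1), B1), 2.0),
    matmul(matmul(x, A2), B2), 1.0)

print(y_fused == y_adapters)  # True
```

Both paths give the same output, which is why a fused checkpoint can be shipped as a single set of weights.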

	## How to Use

	### With MLX (macOS, recommended)

	```bash
	pip install mlx-lm
	mlx_lm.generate --model belweave/kai-0 --prompt "What's your name?"
	```
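The command above sends a bare prompt. Because the Kai personality depends on the system prompt, a Python call through mlx-lm's `load` and `generate` functions may be more convenient. The following is a minimal sketch: `build_messages` and `ask_kai` are hypothetical helper names, and `ask_kai` only works on Apple Silicon with `mlx-lm` installed.

```python
SYSTEM_PROMPT = "You are Kai-0, an AI assistant created by Preetham Kyanam at Belweave."

def build_messages(user_text, system_prompt=SYSTEM_PROMPT):
    """Return a chat history with the Kai-0 system prompt followed by the user turn."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_text},
    ]

def ask_kai(user_text, max_tokens=100):
    """Generate one reply with mlx-lm (Apple Silicon only; downloads the model)."""
    # Imported lazily so build_messages stays usable without mlx-lm installed.
    from mlx_lm import load, generate

    model, tokenizer = load("belweave/kai-0")
    prompt = tokenizer.apply_chat_template(
        build_messages(user_text), add_generation_prompt=True
    )
    return generate(model, tokenizer, prompt=prompt, max_tokens=max_tokens)

messages = build_messages("What's your name?")
```

On a Mac, `print(ask_kai("What's your name?"))` would load the model and print a single reply.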

	### With Transformers

	```python
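	from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

	# 4-bit loading needs the bitsandbytes package and a GPU backend it supports.
	model = AutoModelForCausalLM.from_pretrained(
	    "belweave/kai-0",
	    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
	    device_map="auto",
	)
	tokenizer = AutoTokenizer.from_pretrained("belweave/kai-0")

	messages = [
	    {"role": "system", "content": "You are Kai-0, an AI assistant created by Preetham Kyanam at Belweave."},
	    {"role": "user", "content": "What's your name?"},
	]
	# add_generation_prompt=True appends the assistant header so the model replies
	# instead of continuing the user turn.
	inputs = tokenizer.apply_chat_template(
	    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
	).to(model.device)
	outputs = model.generate(**inputs, max_new_tokens=100)
	# Decode only the newly generated tokens, not the echoed prompt.
	new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
	print(tokenizer.decode(new_tokens, skip_special_tokens=True))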
	```

	### With LM Studio
	1. Download the model from the HuggingFace Hub
	2. Load in LM Studio (MLX runtime on macOS)
	3. Set system prompt: `You are Kai-0, an AI assistant created by Preetham Kyanam at Belweave.`
	4. Chat

	## Capabilities

	- Coding: Python, JavaScript, TypeScript, Go, Rust, Java, C++, C#, Ruby (trained on Magicoder-OSS-Instruct-75K)
	- Instruction Following: Multi-turn conversations, formatting, structured output
	- Identity: Knows its name (Kai-0), creator (Preetham Kyanam), and company (Belweave)
	- Personality: Direct, helpful, occasionally witty, honest about being an AI
	- Boundaries: Refuses malware, violence, self-harm, and illegal requests

	## Limitations

	- Small model: 3B parameters. Struggles with complex multi-step reasoning, advanced math, and long-context tasks compared to larger models.
	- Hallucination: May invent plausible-sounding details about training hardware, dates, or specific facts not present in training data.
	- Context length: 512 tokens. Long code blocks and conversations may be truncated.
	- Identity dependency: Requires the system prompt to activate the Kai personality. Without it, the model may default to generic assistant behavior.
	- English-centric: Training data was primarily English. Performance in other languages is untested.
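Given the 512-token limit listed above, one way to keep longer chats within budget is to drop the oldest turns before each request. The sketch below is illustrative only: `trim_history`, its `reserve` parameter, and the whitespace-based `word_count` are hypothetical, and a real setup would count tokens with the model's tokenizer and also allow for the tokens the chat template adds.

```python
def trim_history(messages, count_tokens, budget=512, reserve=100):
    """Keep the system prompt plus the most recent turns that fit the budget.

    `reserve` leaves room for the reply; `count_tokens` maps text to a token count.
    """
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]
    remaining = budget - reserve - sum(count_tokens(m["content"]) for m in system)

    kept = []
    for message in reversed(turns):  # walk from newest to oldest
        cost = count_tokens(message["content"])
        if cost > remaining:
            break
        kept.append(message)
        remaining -= cost
    return system + kept[::-1]

# Stand-in counter: whitespace words, NOT the real Llama tokenizer.
def word_count(text):
    return len(text.split())

history = [
    {"role": "system", "content": "You are Kai-0."},
    {"role": "user", "content": "old question " * 10},
    {"role": "assistant", "content": "old answer " * 10},
    {"role": "user", "content": "new question"},
]
# Small budget so the effect is visible: only the newest turn survives.
trimmed = trim_history(history, word_count, budget=30, reserve=10)
```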

	## Hardware Used

	- Training: MacBook Air M3, 24GB unified memory
	- Framework: [MLX](https://github.com/ml-explore/mlx) (Apple Silicon optimized)
	- Tool: [mlx-lm](https://github.com/ml-explore/mlx-lm) v0.31.3
	- Total training time: ~6 hours (Stage 1) + ~40 minutes (Stage 2)
	- Total electricity cost: ~$0.50

	## Files in This Repository

	| File | Description |
	|---|---|
	| `model.safetensors` | Fused model weights (Llama-3.2-3B + adapters) |
	| `config.json` | Model configuration |
	| `tokenizer.json` | Tokenizer vocabulary |
	| `tokenizer_config.json` | Tokenizer settings |
	| `chat_template.jinja` | Chat template for conversation formatting |
	| `lora_real_config.yaml` | Stage 1 training configuration |
	| `lora_identity_config_v2.yaml` | Stage 2 training configuration |

	## Citation

	If you use Kai-0 in your research or project, please cite:

	```bibtex
	@misc{kai0-2026,
	  title={Kai-0: A Locally Fine-Tuned Llama-3.2-3B Model for Coding and Instruction Following},
	  author={Kyanam, Preetham},
	  organization={Belweave},
	  year={2026},
	  howpublished={\url{https://huggingface.co/belweave/kai-0}}
	}
	```

	## Acknowledgments

	- Base model: [Meta Llama 3.2 3B Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct)
	- Training framework: [MLX](https://github.com/ml-explore/mlx) by Apple
	- Stage 1 datasets: [OpenHermes-2.5](https://huggingface.co/datasets/teknium/OpenHermes-2.5), [Magicoder-OSS-Instruct-75K](https://huggingface.co/datasets/ise-uiuc/Magicoder-OSS-Instruct-75K)
	- AI Work Wife / Architect: Lara (Hermes Agent)

	## License

	This model is derived from Meta's Llama-3.2-3B-Instruct and is subject to the [Llama 3.2 Community License](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct/blob/main/LICENSE.txt).

	## Contact

	- Creator: Preetham Kyanam
	- Organization: [Belweave](https://belweave.ai)
	- Project: Kai Model Family

	---

	Kai-0 is not the final product. It is the prototype. The messy first commit. Kai-1 and beyond will follow.