---
license: llama3.2
language:
- en
tags:
- llama
- llama-3.2
- lora
- mlx
- instruction-tuned
- coding
- ai-assistant
- belweave
- kai
- local-ai
- macbook
base_model: meta-llama/Llama-3.2-3B-Instruct
---

# Kai-0

**Kai-0** is the zeroth iteration of the Kai model family, created by [Preetham Kyanam](https://github.com/pkyanam) at [Belweave](https://belweave.ai). It is a fine-tuned variant of Meta's Llama-3.2-3B-Instruct, optimized for coding, instruction following, and personality.

Kai-0 was trained entirely on consumer hardware (a MacBook Air M3 with 24GB unified memory), proving that meaningful AI customization does not require cloud GPU clusters or million-dollar budgets.

## Model Details

| Attribute | Value |
|---|---|
| **Base Model** | meta-llama/Llama-3.2-3B-Instruct |
| **Parameters** | 3.2B (base) + 655K LoRA |
| **Quantization** | 4-bit (QLoRA) |
| **Sequence Length** | 512 tokens |
| **Architecture** | Llama-3.2 (transformer decoder) |
| **License** | Llama 3.2 Community License |
| **Origin** | Belweave |
| **Creator** | Preetham Kyanam |

## Training Summary

Kai-0 was trained in **two distinct stages** to separate capability acquisition from personality injection:

### Stage 1: Capabilities

- **Datasets:** teknium/OpenHermes-2.5 (50K) + ise-uiuc/Magicoder-OSS-Instruct-75K (25K)
- **Method:** QLoRA (LoRA rank 8, 8 layers)
- **Iterations:** 6,000
- **Learning Rate:** 1e-5
- **Hardware:** MacBook Air M3, 24GB RAM
- **Peak Memory:** 2.74 GB
- **Goal:** Instruction following, coding across 9 languages

### Stage 2: Identity

- **Dataset:** 970 synthetic identity examples (name, creator, backstory, personality, boundaries)
- **Method:** QLoRA (LoRA rank 16, 8 layers, 7 projections)
- **Iterations:** 1,000
- **Learning Rate:** 1e-5
- **Goal:** Name recognition, creator attribution, personality, refusal behavior

### Fusion

Both adapters were fused into the base model using `mlx_lm.fuse`, producing a single deployable model.
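
The 655K LoRA figure in the Model Details table is consistent with the Stage 1 settings (rank 8, 8 layers) under a couple of assumptions that this card does not state: that "8 layers" means 8 transformer blocks, and that only the `q_proj` and `v_proj` attention projections were adapted. The hidden size (3072) and key/value width (8 heads of dimension 128) are taken from the published Llama-3.2-3B configuration. The function name `lora_param_count` is illustrative only. A rough sketch of the arithmetic:

```python
from typing import List, Tuple


def lora_param_count(rank: int, shapes: List[Tuple[int, int]], num_layers: int) -> int:
    """Count trainable LoRA parameters.

    Each adapted linear layer of shape (d_in, d_out) gains two low-rank
    matrices: A (d_in x rank) and B (rank x d_out).
    """
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in shapes)
    return per_layer * num_layers


HIDDEN = 3072      # Llama-3.2-3B hidden size
KV_DIM = 8 * 128   # 8 key/value heads x head dimension 128

# Assumption: Stage 1 adapted only q_proj and v_proj in each of 8 blocks.
stage1_shapes = [(HIDDEN, HIDDEN), (HIDDEN, KV_DIM)]
stage1 = lora_param_count(rank=8, shapes=stage1_shapes, num_layers=8)
print(f"{stage1:,}")  # 655,360
```

The same function can be applied to the Stage 2 settings by listing the shapes of whichever projections were actually adapted there.
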
## How to Use

### With MLX (macOS, recommended)

```bash
pip install mlx-lm
mlx_lm.generate --model belweave/kai-0 --prompt "What's your name?"
```

### With Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit loading goes through bitsandbytes, which must be installed separately.
model = AutoModelForCausalLM.from_pretrained(
    "belweave/kai-0",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)
tokenizer = AutoTokenizer.from_pretrained("belweave/kai-0")

messages = [
    {"role": "system", "content": "You are Kai-0, an AI assistant created by Preetham Kyanam at Belweave."},
    {"role": "user", "content": "What's your name?"},
]
# add_generation_prompt=True appends the assistant header so the model replies.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_dict=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

### With LM Studio

1. Download the model from the Hugging Face Hub
2. Load it in LM Studio (MLX runtime on macOS)
3. Set the system prompt: `You are Kai-0, an AI assistant created by Preetham Kyanam at Belweave.`
4. Chat

## Capabilities

- **Coding:** Python, JavaScript, TypeScript, Go, Rust, Java, C++, C#, Ruby (trained on Magicoder)
- **Instruction Following:** Multi-turn conversations, formatting, structured output
- **Identity:** Knows its name (Kai-0), creator (Preetham Kyanam), and company (Belweave)
- **Personality:** Direct, helpful, occasionally witty, honest about being an AI
- **Boundaries:** Refuses malware, violence, self-harm, and illegal requests

## Limitations

- **Small model:** 3B parameters. Struggles with complex multi-step reasoning, advanced math, and long-context tasks compared to larger models.
- **Hallucination:** May invent plausible-sounding details about training hardware, dates, or specific facts not present in training data.
- **Context length:** 512 tokens. Long code blocks and conversations may be truncated.
- **Identity dependency:** Requires the system prompt to activate the Kai personality. Without it, the model may default to generic assistant behavior.
- **English-centric:** Training data was primarily English. Performance in other languages is untested.
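
Given the 512-token context length listed under Limitations, one way to keep multi-turn chats inside the window is to drop the oldest turns before each request. The sketch below is not part of Kai-0 or of any library: `trim_history`, `rough_count`, and the `reserve_for_reply` headroom are hypothetical names and values chosen for illustration, and the whitespace counter is only a crude stand-in for the real tokenizer.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def trim_history(
    messages: List[Message],
    count_tokens: Callable[[str], int],
    budget: int = 512,             # sequence length stated in this card
    reserve_for_reply: int = 100,  # hypothetical headroom for generated tokens
) -> List[Message]:
    """Drop the oldest non-system turns until the prompt fits the budget."""
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]
    limit = budget - reserve_for_reply

    def total(msgs: List[Message]) -> int:
        return sum(count_tokens(m["content"]) for m in msgs)

    # Always keep the most recent turn, even if it alone exceeds the limit.
    while len(turns) > 1 and total(system + turns) > limit:
        turns.pop(0)
    return system + turns


def rough_count(text: str) -> int:
    """Crude stand-in: one token per whitespace-separated word."""
    return len(text.split())
```

In practice `count_tokens` would wrap the model's own tokenizer, for example `lambda s: len(tokenizer.encode(s))`, and the budget would also need to account for the tokens the chat template adds around each message.
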
## Hardware Used

- **Training:** MacBook Air M3, 24GB unified memory
- **Framework:** [MLX](https://github.com/ml-explore/mlx) (Apple Silicon optimized)
- **Tool:** [mlx-lm](https://github.com/ml-explore/mlx-examples) v0.31.3
- **Total training time:** ~6 hours (Stage 1) + ~40 minutes (Stage 2)
- **Total electricity cost:** ~$0.50

## Files in This Repository

| File | Description |
|---|---|
| `model.safetensors` | Fused model weights (Llama-3.2-3B + adapters) |
| `config.json` | Model configuration |
| `tokenizer.json` | Tokenizer vocabulary |
| `tokenizer_config.json` | Tokenizer settings |
| `chat_template.jinja` | Chat template for conversation formatting |
| `lora_real_config.yaml` | Stage 1 training configuration |
| `lora_identity_config_v2.yaml` | Stage 2 training configuration |

## Citation

If you use Kai-0 in your research or project, please cite:

```bibtex
@misc{kai0-2026,
  title={Kai-0: A Locally Fine-Tuned Llama-3.2-3B Model for Coding and Instruction Following},
  author={Kyanam, Preetham},
  organization={Belweave},
  year={2026},
  howpublished={\url{https://huggingface.co/belweave/kai-0}}
}
```

## Acknowledgments

- Base model: [Meta Llama 3.2 3B Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct)
- Training framework: [MLX](https://github.com/ml-explore/mlx) by Apple
- Stage 1 datasets: [OpenHermes-2.5](https://huggingface.co/datasets/teknium/OpenHermes-2.5), [Magicoder-OSS-Instruct-75K](https://huggingface.co/datasets/ise-uiuc/Magicoder-OSS-Instruct-75K)
- AI Work Wife / Architect: Lara (Hermes Agent)

## License

This model is derived from Meta's Llama-3.2-3B-Instruct and is subject to the [Llama 3.2 Community License](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct/blob/main/LICENSE.txt).

## Contact

- **Creator:** Preetham Kyanam
- **Organization:** [Belweave](https://belweave.ai)
- **Project:** Kai Model Family

---

*Kai-0 is not the final product. It is the prototype. The messy first commit. Kai-1 and beyond will follow.*