# Final Project README - MINDI 1.0 420M (Windows, RTX 4060 8GB)

## What This Project Is

This is a fully local coding-assistant system, built step by step from scratch. It supports:

- a custom tokenizer for code
- a dataset cleaning + tokenization pipeline
- a 420M-parameter transformer model
- memory-optimized training
- evaluation + inference improvements
- a local chat UI
- LoRA fine-tuning
- INT8 export + a portable package

After setup, everything runs locally on your machine without internet.

---

## What You Built (High Level)

1. **Project setup** with a reproducible environment and verification scripts.
2. **Custom code tokenizer** (Python- and JavaScript-aware).
3. **Dataset pipeline** with cleaning, deduplication, and tokenization.
4. **420M transformer architecture** (modular config).
5. **Training pipeline** (FP16, checkpointing, gradient accumulation, resume, early stopping).
6. **Evaluation system** (validation metrics + generation checks).
7. **Inference engine** (greedy mode, stop rules, syntax-aware retry).
8. **Local chat interface** with history, a copy button, timing, and a mode selector.
9. **LoRA fine-tuning pipeline** for your own examples.
10. **Export/quantization/packaging** with a benchmark report and a portable launcher.
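The deduplication step in the dataset pipeline (item 3 above) can be sketched in a few lines. This is an illustrative sketch of the idea only, not the project's actual Component 3 code; the function name and normalization rule are assumptions:

```python
import hashlib

def dedupe_rows(rows):
    """Drop rows whose normalized code text was already seen (illustrative sketch)."""
    seen = set()
    out = []
    for row in rows:
        # Collapse whitespace so trivially reformatted duplicates still collide.
        key = hashlib.sha256(" ".join(row["code"].split()).encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out

rows = [
    {"prompt": "add two numbers", "code": "def add(a, b):\n    return a + b"},
    {"prompt": "add two numbers", "code": "def add(a, b):\n        return a + b"},
]
print(len(dedupe_rows(rows)))  # 1 — the reindented duplicate is dropped
```

Hashing a normalized form rather than the raw text is what lets near-identical pastes (different indentation, trailing whitespace) collapse into one training example.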
---

## Most Important File Locations

### Core model and data

- Base checkpoint: `checkpoints/component5_420m/step_3200.pt`
- Tokenized training data: `data/processed/train_tokenized.jsonl`
- Tokenizer: `artifacts/tokenizer/code_tokenizer_v1/`

### LoRA

- Best LoRA adapter: `models/lora/custom_lora_v1/best.pt`
- LoRA metadata: `models/lora/custom_lora_v1/adapter_meta.json`

### Quantized model

- INT8 model: `models/quantized/model_step3200_int8_state.pt`
- Benchmark report: `artifacts/export/component10_benchmark_report.json`

### Chat interface

- Launcher: `scripts/launch_component8_chat.py`
- Chat config: `configs/component8_chat_config.yaml`

### Portable package

- Folder: `release/MINDI_1.0_420M`
- Double-click launcher: `release/MINDI_1.0_420M/Start_MINDI.bat`

---

## Launch the Main Chat UI

From the project root (`C:\AI 2`):

```powershell
.\.venv\Scripts\Activate.ps1
python .\scripts\launch_component8_chat.py --config .\configs\component8_chat_config.yaml
```

Open in a browser:

- `http://127.0.0.1:7860`

### Live model selector in the UI

You can switch modes without restarting:

- `base`
- `lora`
- `int8`

The status box shows:

- the active mode
- the mode's load time
- live VRAM usage

---

## How to Add More Training Data (Future Improvement)

### A) Add more base-training pairs (full training path)

1. Put new JSONL/JSON files in `data/raw/`.
2. Run the dataset processing scripts (Component 3 path).
3. Continue or refresh base training with Component 5.

### B) Add targeted improvements quickly (LoRA recommended)

1. Edit `data/raw/custom_finetune_pairs.jsonl` with your new prompt/code pairs.
   - Required fields per row: `prompt`, `code`
   - Optional: `language` (`python` or `javascript`)
2. Run LoRA fine-tuning:

   ```powershell
   python .\scripts\run_component9_lora_finetune.py --config .\configs\component9_lora_config.yaml
   ```

3. Use the updated adapter in chat by selecting `lora` mode.

---

## Recommended Next Habit

When quality is weak on a specific task:

1. Add 20-200 clean examples of exactly that task style to `custom_finetune_pairs.jsonl`.
2. Re-run LoRA fine-tuning.
3. Test in chat `lora` mode.
4. Repeat in small cycles.

This improves quality faster than retraining the full base model each time.

---

## One-File Health Check Commands

```powershell
python .\scripts\verify_component1_setup.py
python .\scripts\verify_component4_model.py --config .\configs\component4_model_config.yaml --batch_size 1 --seq_len 256
python .\scripts\verify_component9_lora.py
```

---

## Current Status

The project is complete across Components 1-10 and verified on your hardware.
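## Appendix: Sanity-Checking New Fine-Tune Pairs

Rows added to `custom_finetune_pairs.jsonl` must carry `prompt` and `code` fields, optionally `language`. A minimal pre-flight check can catch malformed rows before a LoRA run; this validator is a hedged sketch and not one of the project's scripts:

```python
import json

REQUIRED = {"prompt", "code"}
ALLOWED_LANGS = {"python", "javascript"}

def validate_pair(line):
    """Return an error string for a bad JSONL row, or None if the row looks OK.

    Illustrative helper only; not part of the project's pipeline.
    """
    try:
        row = json.loads(line)
    except json.JSONDecodeError as exc:
        return f"invalid JSON: {exc}"
    missing = REQUIRED - row.keys()
    if missing:
        return f"missing fields: {sorted(missing)}"
    lang = row.get("language")
    if lang is not None and lang not in ALLOWED_LANGS:
        return f"unknown language: {lang!r}"
    return None

good = '{"prompt": "reverse a string", "code": "s[::-1]", "language": "python"}'
bad = '{"prompt": "reverse a string"}'
print(validate_pair(good))  # None
print(validate_pair(bad))   # missing fields: ['code']
```

Running a check like this over the whole file before step 2 of the LoRA path keeps a single bad row from wasting a fine-tuning cycle.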