# Final Project README - MINDI 1.0 420M (Windows, RTX 4060 8GB)
## What This Project Is
This is a fully local coding-assistant model system built step-by-step from scratch. It supports:
- custom tokenizer for code
- dataset cleaning + tokenization pipeline
- 420M transformer model
- memory-optimized training
- evaluation + inference improvements
- local chat UI
- LoRA fine-tuning
- INT8 export + portable package
After initial setup, everything runs locally on your machine with no internet connection required.
## What You Built (High Level)
- Project setup with reproducible environment and verification scripts.
- Custom code tokenizer (Python + JavaScript aware).
- Dataset pipeline with cleaning, dedupe, and tokenization.
- 420M transformer architecture (modular config).
- Training pipeline (FP16, checkpointing, accumulation, resume, early stopping).
- Evaluation system (val metrics + generation checks).
- Inference engine (greedy mode, stop rules, syntax-aware retry).
- Local chat interface with history, copy button, timing, and mode selector.
- LoRA fine-tuning pipeline for your own examples.
- Export/quantization/packaging with benchmark report and portable launcher.
## Most Important File Locations

### Core model and data
- Base checkpoint: `checkpoints/component5_420m/step_3200.pt`
- Tokenized training data: `data/processed/train_tokenized.jsonl`
- Tokenizer: `artifacts/tokenizer/code_tokenizer_v1/`

### LoRA
- Best LoRA adapter: `models/lora/custom_lora_v1/best.pt`
- LoRA metadata: `models/lora/custom_lora_v1/adapter_meta.json`

### Quantized model
- INT8 model: `models/quantized/model_step3200_int8_state.pt`
- Benchmark report: `artifacts/export/component10_benchmark_report.json`

### Chat interface
- Launcher: `scripts/launch_component8_chat.py`
- Chat config: `configs/component8_chat_config.yaml`

### Portable package
- Folder: `release/MINDI_1.0_420M`
- Double-click launcher: `release/MINDI_1.0_420M/Start_MINDI.bat`
## Launch the Main Chat UI

From the project root (`C:\AI 2`):

    .\.venv\Scripts\Activate.ps1
    python .\scripts\launch_component8_chat.py --config .\configs\component8_chat_config.yaml

Then open in a browser: http://127.0.0.1:7860
### Live model selector in UI

You can switch modes without restarting: `base`, `lora`, `int8`.

The status box shows:
- active mode
- mode load time
- live VRAM usage
## How to Add More Training Data (Future Improvement)

### A) Add more base-training pairs (full training path)
- Put new JSONL/JSON files in `data/raw/`.
- Run the dataset processing scripts (Component 3 path).
- Continue or refresh base training with Component 5.
### B) Add targeted improvements quickly (LoRA recommended)
- Edit `data/raw/custom_finetune_pairs.jsonl` with your new prompt/code pairs.
  - Required fields per row: `prompt`, `code`
  - Optional: `language` (`python` or `javascript`)
- Run LoRA fine-tuning:

      python .\scripts\run_component9_lora_finetune.py --config .\configs\component9_lora_config.yaml

- Use the updated adapter in chat by selecting `lora` mode.
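A row in `custom_finetune_pairs.jsonl` with the fields above might look like this (the prompt/code content here is illustrative, not taken from the project's data):

```json
{"prompt": "Write a function that reverses a string", "code": "def reverse(s):\n    return s[::-1]", "language": "python"}
```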
## Recommended Next Habit

When quality is weak on specific tasks:
- Add 20-200 clean examples of exactly that task style to `custom_finetune_pairs.jsonl`.
- Re-run LoRA fine-tuning.
- Test in chat with `lora` mode.
- Repeat in small cycles.

This gives faster improvement than retraining the full base model each time.
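Before each LoRA re-run, it can help to sanity-check the new rows so a malformed line doesn't waste a fine-tuning cycle. A minimal sketch (this script is not part of the project; only the field names `prompt`, `code`, and `language` come from the format described above):

```python
import json

# Field requirements as documented for custom_finetune_pairs.jsonl
REQUIRED = {"prompt", "code"}
ALLOWED_LANGS = {"python", "javascript"}

def check_row(line: str) -> list[str]:
    """Return a list of problems for one JSONL line (empty list = row is OK)."""
    try:
        row = json.loads(line)
    except json.JSONDecodeError as e:
        return [f"invalid JSON: {e}"]
    problems = []
    missing = REQUIRED - row.keys()
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    lang = row.get("language")
    if lang is not None and lang not in ALLOWED_LANGS:
        problems.append(f"unknown language: {lang!r}")
    return problems

if __name__ == "__main__":
    # Report problems line by line for the whole file
    with open("data/raw/custom_finetune_pairs.jsonl", encoding="utf-8") as f:
        for n, line in enumerate(f, 1):
            for problem in check_row(line):
                print(f"line {n}: {problem}")
```

Running it before `run_component9_lora_finetune.py` surfaces bad rows early instead of mid-training.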
## One-File Health Check Commands

    python .\scripts\verify_component1_setup.py
    python .\scripts\verify_component4_model.py --config .\configs\component4_model_config.yaml --batch_size 1 --seq_len 256
    python .\scripts\verify_component9_lora.py
## Current Status

Project is complete across Components 1-10 and verified on your hardware.