| # π§ Mini Coding Agent - Fine-tuned Gemma-3-1B-IT |
|
|
| A small coding assistant (~1B parameters) built by fine-tuning **Gemma-3-1B-IT** on coding instruction datasets. Think of it as a tiny Claude Code you can run on a free Google Colab T4 GPU. |
|
|
| ## Model Details |
|
|
| | Property | Value | |
| |---|---| |
| | **Base Model** | `google/gemma-3-1b-it` | |
| | **Parameters** | ~1B (actual ~1.3B) | |
| | **Training Method** | LoRA (Low-Rank Adaptation) + 4-bit Quantization | |
| | **Trainable Parameters** | ~1.5% of total | |
| | **Dataset** | `ise-uiuc/Magicoder-OSS-Instruct-75K` or `nvidia/OpenCodeInstruct` | |
| | **VRAM Usage** | ~6-10GB peak (fits on Colab T4) | |
| | **Training Time** | ~30-60 min for 50K samples, 2 epochs | |
|
|
| ## Why These Choices? |
|
|
| - **Gemma-3-1B-IT**: The smallest official Gemma model. Already instruction-tuned, so it understands chat format. |
| - **LoRA**: Only trains adapter layers (~20M params), keeping VRAM low while still learning coding patterns. |
| - **4-bit (NF4) Quantization**: Cuts memory by ~4x with minimal quality loss. |
| - **Magicoder Dataset**: Proven recipe (arxiv:2312.02120) using real open-source code snippets as seeds β better than raw code pairs. |
| - **OpenCodeInstruct**: Higher quality synthetic data with unit tests (arxiv:2504.04030). Use a subset for Colab. |
|
|
| ## Quick Start in Google Colab |
|
|
| ### Step 1: Setup |
|
|
| ```python |
| !pip install -q transformers trl peft datasets accelerate bitsandbytes huggingface_hub |
| ``` |
|
|
| ### Step 2: Authenticate |
|
|
| ```python |
| from huggingface_hub import notebook_login |
| notebook_login() |
| ``` |
|
|
| > **IMPORTANT**: Visit https://huggingface.co/google/gemma-3-1b-it and **ACCEPT the license** before training! |
|
|
| ### Step 3: Change Runtime to GPU |
|
|
| Go to **Runtime > Change runtime type > T4 GPU** |
|
|
| ### Step 4: Run Training |
|
|
| Download and run [`train_colab.py`](./train_colab.py): |
|
|
| ```python |
| # In a Colab cell: |
| !wget https://huggingface.co/Abhay557/gemma-mini-code-agent/raw/main/train_colab.py |
| !python train_colab.py |
| ``` |
|
|
| Or copy-paste the contents of `train_colab.py` directly into a Colab cell. |
|
|
| ### Step 5: Chat with your Agent |
|
|
| After training, use the built-in `chat_with_agent()` function from the script, or download [`inference.py`](./inference.py): |
|
|
| ```python |
| !wget https://huggingface.co/Abhay557/gemma-mini-code-agent/raw/main/inference.py |
| !python inference.py |
| ``` |
|
|
| ## Configurable Parameters |
|
|
| Edit these in `train_colab.py` before running: |
|
|
| | Param | Default | Description | |
| |---|---|---| |
| | `MAX_SAMPLES` | 50000 | Dataset subset size (reduce for faster runs) | |
| | `NUM_EPOCHS` | 2 | Training epochs | |
| | `LEARNING_RATE` | 5e-5 | LoRA learning rate | |
| | `LORA_R` | 16 | LoRA rank | |
| | `LORA_ALPHA` | 32 | LoRA scaling | |
| | `MAX_SEQ_LENGTH` | 1024 | Max tokens per sequence | |
| | `GRAD_ACCUM` | 16 | Gradient accumulation steps | |
|
|
| ## Datasets |
|
|
| | Dataset | Size | Best For | Paper | |
| |---|---|---|---| |
| | [`ise-uiuc/Magicoder-OSS-Instruct-75K`](https://huggingface.co/datasets/ise-uiuc/Magicoder-OSS-Instruct-75K) | 75K | Quick experiments, proven recipe | [arxiv:2312.02120](https://arxiv.org/abs/2312.02120) | |
| | [`nvidia/OpenCodeInstruct`](https://huggingface.co/datasets/nvidia/OpenCodeInstruct) | 5M | Best quality, use subset for Colab | [arxiv:2504.04030](https://arxiv.org/abs/2504.04030) | |
|
|
| To switch datasets, change the `DATASET_NAME` variable in the script. |
|
|
| ## Expected Results |
|
|
| This won't match Claude Code (that's ~100B+ params), but it can: |
| - β
Write small Python functions |
| - β
Explain algorithms |
| - β
Debug simple code |
| - β
Answer basic coding interview questions |
|
|
| Benchmarks on similar 1B models fine-tuned with these datasets: |
| - **HumanEval**: ~30-40% pass@1 (base model: ~10-15%) |
| - **MBPP**: ~35-45% pass@1 |
|
|
| ## Pushing to Hugging Face Hub |
|
|
| After training, uncomment these lines in the script: |
|
|
| ```python |
| # merged_model.push_to_hub("YOUR_USERNAME/gemma-3-1b-code-agent") |
| # tokenizer.push_to_hub("YOUR_USERNAME/gemma-3-1b-code-agent") |
| ``` |
|
|
| ## Troubleshooting |
|
|
| | Issue | Fix | |
| |---|---| |
| | OOM error | Reduce `MAX_SEQ_LENGTH` to 512 or `MAX_SAMPLES` to 10000 | |
| | Training too slow | Reduce `MAX_SAMPLES` to 10000, reduce `NUM_EPOCHS` to 1 | |
| | Gemma license error | Visit the model page and click "Accept" | |
| | `prepare_model_for_kbit_training` import error | Make sure `peft` is up to date: `!pip install -U peft` | |
|
|
| ## Architecture |
|
|
| ``` |
| Base: google/gemma-3-1b-it (Gemma3ForCausalLM) |
| βββ 26 layers |
| βββ 1152 hidden size |
| βββ 4 attention heads |
| βββ 262k vocab |
| |
| + LoRA adapters (r=16, alpha=32) |
| βββ q_proj, k_proj, v_proj, o_proj |
| βββ gate_proj, up_proj, down_proj |
| βββ ~20M trainable params |
| |
| + 4-bit NF4 quantization |
| βββ ~3.5GB model footprint |
| ``` |
|
|
| ## License |
|
|
| - Base model: [Gemma License](https://ai.google.dev/gemma/terms) |
| - This fine-tune: MIT |
| - Datasets: Check respective dataset pages |
|
|
| ## Citation |
|
|
| If you use this, cite the base papers: |
|
|
| ```bibtex |
| @article{gemma3_2025, |
| title={Gemma 3 Technical Report}, |
| author={Google DeepMind}, |
| year={2025} |
| } |
| |
| @article{magicoder_2024, |
| title={Magicoder: Source Code is All You Need}, |
| author={Wei, Yuxiang and others}, |
| journal={arXiv:2312.02120}, |
| year={2024} |
| } |
| ``` |
|
|