---
license: gemma
base_model: google/gemma-4-26B-A4B-it
datasets:
- nvidia/OpenCodeInstruct
library_name: transformers
pipeline_tag: text-generation
tags:
- code
- coding-assistant
- qlora
- unsloth
model-index:
- name: gemma-coder-dev
  results:
  - task:
      type: text-generation
      name: Code generation
    dataset:
      name: remote-agent-dev-platform coding_eval
      type: code-eval
    metrics:
    - type: pass@1
      value: 0.3913
      name: pass@1 (Python/JS/React/Go/Java/Swift)
---

# gemma-coder-dev

Coding-focused fine-tune of [`google/gemma-4-26B-A4B-it`](https://huggingface.co/google/gemma-4-26B-A4B-it) (**Gemma 4 26B A4B**, an MoE with ~4B active params), produced automatically by the weekly retrain pipeline in [remote-agent-dev-platform](https://github.com/Monibee-Fudgekins/remote-agent-dev-platform).

**Last updated: 2026-06-23 09:46 UTC** · run mode: `full` · promoted: **False**.

## Model description

QLoRA fine-tune of `google/gemma-4-26B-A4B-it` specialized for coding assistance. It is the default agent model for the remote-agent-dev-platform (served via vLLM on Modal).

## Intended uses & limitations

- **Intended:** code generation and assistance in Python, JavaScript/React, Go, Java, and Swift, inside a sandboxed agent that runs/tests the output.
- **Not intended:** safety-critical use, or running generated code unreviewed.
- **Limitations:** a small, free-tier-trained model; it can produce incorrect or insecure code. Always review and test. Quality tracks the training data, which is still being built out.

## Training data

- Dataset: [`nvidia/OpenCodeInstruct`](https://huggingface.co/datasets/nvidia/OpenCodeInstruct)

## Training procedure

- Method: QLoRA (Unsloth), 4-bit base, LoRA r=8 / alpha=16 on attention + MoE experts, lr 2e-4, max seq len 512, optimizer adamw_8bit.
- Progress: **cycle 1, 599 / 4000 steps** (trained in weekly ~8h chunks on Kaggle's free 2×T4, resuming each week; training is continuous, so a finished cycle rolls into the next).
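
To make the LoRA settings above concrete, here is a minimal arithmetic sketch of what r=8 / alpha=16 implies. It assumes standard (non-rank-stabilized) LoRA scaling of alpha / r, which this card does not state explicitly. The helper names `lora_scaling` and `lora_params` are hypothetical, and the 4096 x 4096 shape is a made-up example, not a real Gemma layer dimension.

```python
# Hyperparameters and progress figures stated in this card.
LORA_R = 8
LORA_ALPHA = 16
STEPS_DONE, STEPS_PER_CYCLE = 599, 4000


def lora_scaling(r: int, alpha: int) -> float:
    """Standard LoRA multiplies the low-rank update B @ A by alpha / r."""
    return alpha / r


def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters added to one d_out x d_in weight.

    A has shape (r, d_in) and B has shape (d_out, r).
    """
    return r * (d_in + d_out)


# ASSUMPTION: 4096 x 4096 is an illustrative shape, not taken from the model.
print(lora_scaling(LORA_R, LORA_ALPHA))                  # 2.0
print(lora_params(4096, 4096, LORA_R))                   # 65536
print(f"{STEPS_DONE / STEPS_PER_CYCLE:.1%} of cycle 1")  # 15.0% of cycle 1
```

Under these assumptions, each adapted matrix of that shape gains about 65k trainable parameters versus roughly 16.8M frozen ones, which is why the adapter fits on free-tier GPUs.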
## Evaluation

Sandboxed multi-language **pass@1** harness (`finetune/evaluate.py`): the model completes functions that are then compiled/run against unit tests. Languages whose toolchain is unavailable are skipped.

**Overall pass@1: 39.13%** over 23 executed problems (4 skipped). Promotion threshold: 46%.

| language | passed / run | pass@1 |
|---|---|---|
| go | 1/4 | 25.00% |
| java | 0/4 | 0.00% |
| javascript | 0/7 | 0.00% |
| python | 8/8 | 100.00% |
| swift | 0/0 | skipped (no toolchain) |

## How to use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("Monibee-Fudgekins/gemma-coder-dev")
model = AutoModelForCausalLM.from_pretrained("Monibee-Fudgekins/gemma-coder-dev", device_map="auto")

msgs = [{"role": "user", "content": "Write a Python function that reverses a string."}]
# return_dict=True yields input_ids plus attention_mask, ready to unpack into generate().
inputs = tok.apply_chat_template(
    msgs, add_generation_prompt=True, return_dict=True, return_tensors="pt"
).to(model.device)
out = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, not the echoed prompt.
print(tok.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

## Provenance

Generated by `finetune/kaggle/run.py` in [https://github.com/Monibee-Fudgekins/remote-agent-dev-platform](https://github.com/Monibee-Fudgekins/remote-agent-dev-platform); see that repo for the full training + eval pipeline.
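
As a cross-check on the Evaluation numbers, the overall pass@1 can be recomputed from the per-language counts. This is a minimal sketch, not the logic in `finetune/evaluate.py`. The function name `overall_pass_at_1` is hypothetical, and two things are assumptions: that the overall score pools all executed problems, and that promotion requires a score at or above the threshold. Both happen to agree with the reported 39.13% and the `promoted: False` status.

```python
# (passed, run) per language, copied from the Evaluation table.
RESULTS = {
    "go": (1, 4),
    "java": (0, 4),
    "javascript": (0, 7),
    "python": (8, 8),
    "swift": (0, 0),  # skipped: no toolchain
}
PROMOTION_THRESHOLD = 0.46  # 46%, as stated in the card


def overall_pass_at_1(results: dict[str, tuple[int, int]]) -> float:
    """Pool passed/run over languages that executed at least one problem."""
    passed = sum(p for p, n in results.values() if n > 0)
    run = sum(n for _, n in results.values() if n > 0)
    return passed / run if run else 0.0


score = overall_pass_at_1(RESULTS)
print(f"pass@1 = {score:.2%}")                   # pass@1 = 39.13%
# ASSUMPTION: promotion means score >= threshold.
print("promote:", score >= PROMOTION_THRESHOLD)  # promote: False
```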