Gemma 4 26B MoE — Python 18K Code Alpaca LoRA 🐍

LoRA adapter fine-tuned from google/gemma-4-26B-A4B-it on Python Code Instructions 18K Alpaca — 18,612 Python coding instruction-output pairs, trained by UKA (Hermes Agent) 🤖

📋 Summary

| Detail | Value |
|---|---|
| Base Model | google/gemma-4-26B-A4B-it (26B MoE, 128 experts) |
| Dataset | iamtarun/python_code_instructions_18k_alpaca (18,612 examples) |
| Method | Custom NF4 per-expert quantization + LoRA |
| Pipeline | AndriejusNak/gemma4-26b-moe-finetune |
| GPU | NVIDIA RTX 5090 32GB (Vast.ai cloud) |
| Training Time | 275 minutes (~4h 35m) |
| Best Loss | 0.4330 |
| NaN Explosions | 0 |
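
The "custom NF4 per-expert quantization" in the Method row means each MoE expert's weight matrices are quantized to 4-bit NormalFloat individually, while only the LoRA adapters stay in higher precision. A minimal sketch of that idea with bitsandbytes' functional API is shown below; the helper names and the way expert layers are reached are illustrative assumptions, not the pipeline's actual code.

import torch
import bitsandbytes.functional as bnbF

def nf4_quantize_expert(linear):
    # Quantize one expert projection (e.g. gate_proj / up_proj / down_proj) to 4-bit NF4.
    # Returns the packed weight plus the quant state needed to dequantize later.
    w = linear.weight.data.to(torch.float16).cuda()
    packed, quant_state = bnbF.quantize_4bit(w, quant_type="nf4")
    return packed, quant_state

def nf4_dequantize_expert(packed, quant_state):
    # Recover a bf16 copy of the expert weight for the forward pass.
    return bnbF.dequantize_4bit(packed, quant_state).to(torch.bfloat16)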

🖥️ Hardware

| Component | Specification |
|---|---|
| GPU | NVIDIA GeForce RTX 5090 32GB GDDR7 |
| CPU | Intel Core i7-14700K (20 cores / 28 threads) |
| RAM | 94 GB DDR5 |
| Disk | 200 GB NVMe SSD |
| Cloud | Vast.ai |
| PyTorch | 2.12.0.dev (nightly, cu128) |

🔧 Training Configuration

# v6_26b_pipeline.py
MODEL_NAME = "google/gemma-4-26B-A4B-it"
MAX_SEQ_LENGTH = 1024
LORA_R = 32                   # LoRA rank
LORA_ALPHA = 32               # LoRA scaling factor (alpha)
INCLUDE_MLP_LORA = True       # also adapt gate_proj / up_proj / down_proj
SFT_EPOCHS = 2
SFT_BATCH_SIZE = 3
SFT_GRAD_ACCUM = 8            # Effective batch = 3 x 8 = 24
SFT_LR = 2e-5
SFT_FILES = ["data/python_18k_alpaca.jsonl"]
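
How these constants are consumed inside v6_26b_pipeline.py is not shown in this card; one plausible mapping onto a trl SFTConfig looks roughly like the following (a sketch under that assumption, not the pipeline's literal code):

from trl import SFTConfig

# Assumed wiring of the constants above into TRL's trainer config
sft_args = SFTConfig(
    output_dir="outputs/python_18k_alpaca",        # hypothetical path
    num_train_epochs=SFT_EPOCHS,
    per_device_train_batch_size=SFT_BATCH_SIZE,
    gradient_accumulation_steps=SFT_GRAD_ACCUM,    # 3 x 8 -> effective batch 24
    learning_rate=SFT_LR,
    bf16=True,
    logging_steps=50,                              # assumed logging cadence
)
# MAX_SEQ_LENGTH (1024) is also passed to the trainer; the kwarg name
# (max_seq_length vs. max_length) depends on the trl version.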

LoRA Details

  • Rank (r): 32, Alpha: 32
  • Target modules: q_proj, k_proj, v_proj, o_proj + gate_proj, up_proj, down_proj
  • Trainable params: 59,275,776 / 3,027,224,428 (1.96%)
  • Optimizer steps: 1,542
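
In peft terms, these settings correspond to a LoraConfig along the following lines (a sketch; the dropout value is an assumption, since the card does not state it):

from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=32,
    lora_alpha=32,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",    # attention projections
        "gate_proj", "up_proj", "down_proj",       # expert MLP projections
    ],
    lora_dropout=0.05,      # assumed; not stated in this card
    bias="none",
    task_type="CAUSAL_LM",
)
# model = get_peft_model(base_model, lora_config)
# model.print_trainable_parameters()  # -> 59,275,776 of 3,027,224,428 (1.96%)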

Loss Progression

→ Epoch 1 avg: 0.7003
Step 800: Loss 0.4429  (epoch 2)
Step 950: Loss 0.4298
Step 1100: Loss 0.4486
Step 1250: Loss 0.4409
Step 1400: Loss 0.4113
Step 1500: Loss 0.4309
→ Epoch 2 avg: 0.4330 🎯 Best!
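
For a quick visual check, the step losses logged above can be plotted directly (values copied verbatim from the log):

import matplotlib.pyplot as plt

# Epoch-2 step losses from the log above
steps  = [800, 950, 1100, 1250, 1400, 1500]
losses = [0.4429, 0.4298, 0.4486, 0.4409, 0.4113, 0.4309]

plt.plot(steps, losses, marker="o")
plt.axhline(0.4330, linestyle="--", color="gray", label="epoch 2 avg (0.4330)")
plt.xlabel("optimizer step")
plt.ylabel("SFT loss")
plt.legend()
plt.savefig("loss_epoch2.png")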

🚀 Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model in bf16 and attach the LoRA adapter
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-4-26B-A4B-it",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, "hotdogs/gemma4-26b-python-18k-alpaca-lora")

tokenizer = AutoTokenizer.from_pretrained("google/gemma-4-26B-A4B-it")
messages = [
    {"role": "system", "content": "You are a Python programming assistant."},
    {"role": "user", "content": "Write a Python function to find all prime numbers up to N."},
]

# Build the prompt with the chat template and generate
inputs = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=1024, temperature=0.7, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
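
For deployment without a runtime peft dependency, the adapter can be merged into the base weights using peft's standard merge_and_unload; the output directory name below is illustrative:

# Fold the LoRA weights into the base model and save a standalone checkpoint
merged = model.merge_and_unload()
merged.save_pretrained("gemma4-26b-python-18k-merged")
tokenizer.save_pretrained("gemma4-26b-python-18k-merged")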

📊 Comparison — All Adapters

| Adapter | Dataset | Examples | Loss | Time |
|---|---|---|---|---|
| Kimi K2 | Reasoning | 7.8K | 1.07 | 128 min |
| Claude Opus | Reasoning | 8.1K | 1.21 | 142 min |
| Hermes Tool | Tool-use | 10K | 0.54 | 346 min |
| FC-Thinking | Tool+Think | 3.6K | 0.51 | 70 min |
| Python 18K (this model) | Code | 18.6K | 0.43 | 275 min |

📦 Files

adapter_model.safetensors   — LoRA weights (227 MB)
adapter_config.json         — r=32, alpha=32
tokenizer.json              — Gemma 4 tokenizer (31 MB)
v6_26b_pipeline.py          — Training script
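
To fetch just the adapter files without cloning the whole repo, huggingface_hub's snapshot_download works (standard API; the repo id matches the usage example above):

from huggingface_hub import snapshot_download

# Download only the LoRA weights and config, skipping the tokenizer and training script
local_dir = snapshot_download(
    repo_id="hotdogs/gemma4-26b-python-18k-alpaca-lora",
    allow_patterns=["adapter_model.safetensors", "adapter_config.json"],
)
print(local_dir)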

🙏 Credits

  • Base Model: Google Gemma 4 26B
  • Dataset: iamtarun/python_code_instructions_18k_alpaca
  • Pipeline: AndriejusNak/gemma4-26b-moe-finetune
  • Trainer: UKA (Hermes Agent)