Gemma-4-E2B-SOS-LoRA

Fine-tuned LoRA adapter for disaster triage and emergency response.
Built on Gemma 4 E2B using Unsloth + TRL SFTTrainer. Trained on a synthetic dataset of ~2000 START Protocol triage and FEMA emergency scenarios.

Base Model

unsloth/gemma-4-e2b-it-unsloth-bnb-4bit — Gemma 4 E2B (5.15B total, 2.3B effective parameters) in 4-bit NF4 quantization.

Training Details

  • Hardware: Kaggle T4 (16 GB VRAM, NVIDIA Tesla T4)
  • Framework: Unsloth 2026.5.2 + PEFT 0.18.1 + TRL SFTTrainer
  • Quantization: 4-bit NF4 (bitsandbytes)
  • LoRA Rank: 16
  • LoRA Alpha: 16
  • LoRA Dropout: 0
  • Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Batch Size: 2 (gradient accumulation 4 → effective batch 8)
  • Learning Rate: 2e-4 (cosine schedule, 10 warmup steps)
  • Optimizer: AdamW 8-bit
  • Precision: BF16 mixed precision
  • Training Steps: 500 (2 epochs, 2000 examples)
  • Final Loss: 0.1424
  • Total Runtime: 826 seconds (13.8 min on T4)
  • Trainable Parameters: 31,006,720 (0.60% of 5.15B)
  • Adapter Size: 124 MB (safetensors format)

Dataset

Synthetic dataset with 2000 examples across three categories:

  1. START Protocol triage (1200 examples) — Victim assessment based on respiratory rate, pulse, capillary refill, and mental status. Outputs RED (Immediate), YELLOW (Delayed), GREEN (Minor), or BLACK (Deceased) per the START triage system.

  2. FEMA emergency response (500 examples) — Protocols for earthquake, fire, flood, tornado, tsunami, CPR, bleeding control, burns, hypothermia, heat stroke, snake bites, chemical exposure, choking, allergic reactions, and more.

  3. Triage edge cases (300 examples) — Respiratory distress, mass casualty scenarios, pediatric victims, pregnant patients, amputations, and multi-casualty sorting.

Usage

With Unsloth (recommended)

from unsloth import FastLanguageModel
import torch

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name='unsloth/gemma-4-e2b-it-unsloth-bnb-4bit',
    max_seq_length=2048,
    load_in_4bit=True,
)
model.load_adapter('agp9/gemma-4-e2b-sos-lora')
FastLanguageModel.for_inference(model)

messages = [{'role': 'user', 'content': [{'type': 'text', 'text': 'START triage: Adult male found crushed under rubble. RR=6, pulse=absent, cap_refill=4s, mental=unresponsive.'}]}]
inputs = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors='pt').to('cuda')
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))

With PEFT

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("unsloth/gemma-4-e2b-it-unsloth-bnb-4bit", device_map="auto")
model = PeftModel.from_pretrained(base, "agp9/gemma-4-e2b-sos-lora")
tokenizer = AutoTokenizer.from_pretrained("unsloth/gemma-4-e2b-it-unsloth-bnb-4bit")

Inference Results

Example outputs from the fine-tuned model:

Input Output
RR=6, pulse=absent, unresponsive RED - Immediate
RR=22, pulse=present, cap_refill=2s, alert YELLOW - Delayed
"What to do during an earthquake?" DROP, COVER, HOLD ON.
"How do I stop severe bleeding?" Direct pressure. Elevate. Tourniquet as last resort.

Training Notebook

Kaggle Notebook

Project

Gemma-SOS — An offline Android app for disaster response. Runs Gemma 4 E2B on-device via LiteRT-LM. Features:

  • START Protocol triage (instant local engine + LLM)
  • SOS beacon with GPS coordinates
  • QR-based mesh sync for patient data
  • Offline maps with resource finder
  • Wreckage analyzer (camera-based structural assessment)

Competition

This adapter was created for the Gemma 4 Good Hackathon (Google DeepMind x Kaggle) — Unsloth Special Technology Track.

Environmental Impact

  • Hardware: NVIDIA Tesla T4 (Kaggle)
  • Training Time: ~14 minutes
  • Cloud Provider: Kaggle (Google Cloud)
  • Estimated CO₂: <0.1 kg CO₂eq
Downloads last month
102
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support