# Luganda Translation Reward Model
A 1B parameter Gemma 3 reward model that scores English→Luganda translation quality. Outputs a scalar reward — higher = better translation.
**2026-04-09 update:** This repo was previously uploaded as a TRL `AutoModelForCausalLMWithValueHead` + PEFT checkpoint, which required manual LoRA merging and value-head wiring before it could be used. It has now been replaced with a merged `Gemma3ForSequenceClassification`, so users can load it with one line. If you have an old checkout, run `git pull` to get the new format.
## Quick start
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

tok = AutoTokenizer.from_pretrained("CraneAILabs/luganda-reward-model")
model = AutoModelForSequenceClassification.from_pretrained(
    "CraneAILabs/luganda-reward-model",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model.eval()

def score(prompt: str, response: str) -> float:
    """Higher score = better Luganda translation."""
    text = f"{prompt}\n\n{response}"
    inputs = tok(text, return_tensors="pt", truncation=True, max_length=512).to(model.device)
    with torch.no_grad():
        out = model(**inputs)
    return out.logits[0].item()

# Examples
print(score("Translate to Luganda: The children are playing.", "Abaana bazannya."))  # +8.0 ← good
print(score("Translate to Luganda: I love my mother.", "Njagala maama wange."))      # +5.5 ← good
print(score("Translate to Luganda: I love my mother.", "Mama love I."))              # +1.7 ← garbled
print(score("Translate to Luganda: I love my mother.", "Sssss xxxxx zzzzz."))        # +1.1 ← gibberish
```
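A common use for a scalar reward is best-of-n reranking: generate several candidate translations and keep the highest-scoring one. Here is a minimal sketch; `toy_scorer` is a stand-in for illustration only, and in practice you would pass the `score` function defined above.

```python
def best_translation(prompt, candidates, scorer):
    """Return (reward, candidate) for the highest-scoring candidate."""
    return max((scorer(prompt, c), c) for c in candidates)

# Toy scorer for illustration only; in practice pass the `score`
# function from the quick start, which queries the reward model.
def toy_scorer(prompt, response):
    return -response.count("x")  # pretend fewer x's = better

candidates = ["Abaana bazannya.", "xxxx zzzz"]
best = best_translation(
    "Translate to Luganda: The children are playing.", candidates, toy_scorer
)
print(best[1])  # prints "Abaana bazannya."
```

Because the reranker only depends on a `(prompt, response) -> float` callable, the same helper works unchanged with the real model-backed scorer.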
## What changed in the 2026-04-09 update
| | Old format (removed) | New format (current) |
|---|---|---|
| Class | TRL `AutoModelForCausalLMWithValueHead` + PEFT LoRA wrapper | `Gemma3ForSequenceClassification` |
| Loading | Manual PEFT load + LoRA merge + custom `value_head` wrapper | One line: `AutoModelForSequenceClassification.from_pretrained(...)` |
| State dict prefix | `base_model.base_model.model.model.layers.{N}.{module}.{base_layer\|lora_A.default\|lora_B.default}.weight` | Standard `model.layers.{N}.{module}.weight` |
| Score head | Loose `value_head.weight` tensor (shape `[1, 1152]`) | Wired in as `model.score` |
| Dtype | float32 weights | bfloat16 weights (half the size, same precision at inference) |
| File | `pytorch_model.bin` (4.0 GB) | `model.safetensors` (2.0 GB) |
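The key layout change above can be illustrated as a string transformation. This is a hypothetical sketch, not an official migration script: it only shows how the old key names map onto the new ones. The LoRA merge itself changes weight *values*, so adapter keys have no direct one-to-one mapping and are dropped here.

```python
OLD_PREFIX = "base_model.base_model.model."

def convert_key(key: str):
    """Map an old PEFT/value-head state-dict key to the new merged layout.

    Returns None for LoRA adapter keys (their values are folded into the
    base weights during merging, so they have no direct counterpart),
    and maps the loose value head onto the wired-in score head.
    """
    if ".lora_A." in key or ".lora_B." in key:
        return None  # merged into base weights; no renamed equivalent
    if key == "value_head.weight":
        return "score.weight"
    key = key.removeprefix(OLD_PREFIX)          # strip the PEFT wrapper path
    return key.replace(".base_layer.weight", ".weight")

old = "base_model.base_model.model.model.layers.0.self_attn.q_proj.base_layer.weight"
print(convert_key(old))  # prints "model.layers.0.self_attn.q_proj.weight"
```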
## Score interpretation
After running on a small held-out set:
| Reward range | Interpretation |
|---|---|
| > 5.0 | Coherent, fluent Luganda translation |
| 2.0 – 5.0 | Recognizably Luganda, but meaning may be wrong or only partially correct |
| < 2.0 | Garbled, gibberish, or grossly incorrect |
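The bands above can be turned into a simple triage function. The thresholds come straight from the table; the label strings are my own choice, not part of the model's output.

```python
def interpret(reward: float) -> str:
    """Bucket a raw reward into the qualitative bands from the table."""
    if reward > 5.0:
        return "fluent"      # coherent, fluent Luganda translation
    if reward >= 2.0:
        return "partial"     # Luganda-shaped but possibly wrong
    return "garbled"         # garbled, gibberish, or grossly incorrect

print(interpret(8.0))  # prints "fluent"
print(interpret(1.1))  # prints "garbled"
```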
**Known weakness:** untranslated English text scores moderately high (~+6), because the training data did not explicitly penalize untranslated input. Don't rely on this model alone to detect whether the LLM actually translated; pair it with a language detector.
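One cheap way to guard against that weakness is to gate the reward behind a language check. The sketch below uses a crude English-stopword heuristic as a stand-in for a real language detector (e.g. a fastText language-ID model); the stopword list and threshold are illustrative assumptions, not tuned values.

```python
# Tiny stand-in for a real language detector; the stopword list and
# 0.3 threshold are arbitrary illustrative choices.
ENGLISH_STOPWORDS = {"the", "is", "are", "and", "to", "my", "i", "a", "of"}

def looks_untranslated(response: str, threshold: float = 0.3) -> bool:
    """Flag responses dominated by common English function words."""
    words = response.lower().split()
    if not words:
        return True
    hits = sum(w.strip(".,!?") in ENGLISH_STOPWORDS for w in words)
    return hits / len(words) >= threshold

print(looks_untranslated("The children are playing."))  # prints "True"
print(looks_untranslated("Abaana bazannya."))           # prints "False"
```

In a pipeline you would score a response only after it passes the gate, or force its reward to a floor value when the gate trips.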
## Training details

| | |
|---|---|
| Base model | `CraneAILabs/ganda-gemma-1b` (Luganda CPT of `google/gemma-3-1b-it`) |
| Dataset | `CraneAILabs/pedagogy-luganda-reviewed` (299 reviewed translation rows → 1,490 rated examples) |
| Eval set | `Sunbird/salt` (200 examples × 5 quality levels via rule-based degradation) |
| Method | LoRA SFT regression (rank=32, α=64), then merged into base |
| Loss | Weighted MSE on 1–5 ratings |
| Hyperparameters | LR 2e-5, bs 4 (effective 8), 5 epochs, 10% warmup |
For the full training writeup including a v1 failure analysis, see TRAINING_REPORT.md in the original repo.
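For reference, the weighted-MSE objective named above has the generic form below. The per-rating weights are an assumption on my part (the card does not state the weighting scheme); upweighting rare ratings is one common reason to weight a regression loss.

```python
def weighted_mse(preds, targets, weights):
    """Weighted mean squared error over (prediction, target) pairs.

    Weights let rarer ratings count more, so the regression head
    doesn't collapse onto the most frequent score. The actual
    per-rating weights used in training are not specified here.
    """
    num = sum(w * (p - t) ** 2 for p, t, w in zip(preds, targets, weights))
    return num / sum(weights)

print(weighted_mse([1.0, 5.0], [1.0, 3.0], [1.0, 1.0]))  # prints "2.0"
```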
## Citation

```bibtex
@misc{craneailabs2026rewardmodel,
  title={Luganda Translation Reward Model},
  author={Bakunga, Bronson and Mubiru, Kato Steven and Tukamushaba, Catherine},
  year={2026},
  publisher={Crane AI Labs},
  url={https://huggingface.co/CraneAILabs/luganda-reward-model}
}
```
## License
Apache 2.0. Built on Gemma 3 — see Gemma terms of use.