Replace PEFT+ValueHead format with merged Gemma3ForSequenceClassification (one-line load)
Files changed:

- README.md (+74, -75)
- config.json (+25, -10)
- model.safetensors (+3, -0)
- tokenizer.json (+2, -2)
- tokenizer_config.json
README.md
CHANGED

@@ -4,106 +4,105 @@ language:

Before:

- lug
- en
tags:
- reward-model
- luganda
- rlhf
base_model: CraneAILabs/ganda-gemma-1b
pipeline_tag: text-classification
---

# Luganda Translation Reward Model

A pairwise margin-ranking reward model for evaluating English-to-Luganda translation quality. Trained on the Ganda Gemma 1B base using LoRA (rank=32). Designed as the RLHF reward signal for improving Luganda translation models.

## Model Description

- **Base model:** `CraneAILabs/ganda-gemma-1b`
- **Method:** Pairwise margin ranking (Llama 2 style)
- **Loss:** `-log(sigmoid(r_chosen - r_rejected - margin))`
- **Parameters:** ~2.7% trainable via LoRA (rank=32, alpha=64)
- **Training:** 1 epoch, LR=1e-5, dropout=0.2, weight_decay=0.1
- **Best checkpoint:** Step 900 (eval_loss=0.6787)
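The loss above can be written out in plain Python (an illustrative restatement of the formula, not the actual training code):

```python
import math

def margin_ranking_loss(r_chosen: float, r_rejected: float, margin: float) -> float:
    """-log(sigmoid(r_chosen - r_rejected - margin)).

    Near zero when the chosen reward beats the rejected one by at least
    the margin; grows roughly linearly as the ordering is violated.
    """
    x = r_chosen - r_rejected - margin
    return -math.log(1.0 / (1.0 + math.exp(-x)))

# The loss shrinks as the chosen translation is scored further above the rejected one.
print(margin_ranking_loss(2.0, 0.0, 0.5))  # small
print(margin_ranking_loss(0.0, 2.0, 0.5))  # large
```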
## Training Data

**10,856 pairwise comparisons** constructed from 1,490 rated translation examples:

- 299 English sentences × 5 translation variants each = 1,490 rated examples
- Ratings from professional Luganda translators (1-5 scale)
- 856 additional reviewer correction pairs
- Cartesian cross-bucket pairing with gap ≥ 2 quality levels
- Margins: 0.50 (gap=2), 0.75 (gap=3), 1.00 (gap=4)
- Train/eval split: 9,770 / 1,086

### Version History

**V1 (Failed):** Weighted MSE regression. The 856 reviewer corrections were all labeled 5.0, inflating that class from 5.4% to ~40% of the data. SALT quality-level separation *decreased* during training (0.51 → 0.30). Abandoned.
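Assuming a list of (translation, rating) tuples, the cross-bucket pairing and margin assignment described above might be sketched as follows (the example data is hypothetical):

```python
from itertools import combinations

# Margins by rating gap, from the list above.
GAP_MARGIN = {2: 0.50, 3: 0.75, 4: 1.00}

def build_pairs(rated):
    """rated: list of (translation, rating 1-5) tuples.

    Returns (chosen, rejected, margin) for every pair whose rating gap
    is at least 2 quality levels, with the margin set by the gap size.
    """
    pairs = []
    for (t_a, r_a), (t_b, r_b) in combinations(rated, 2):
        gap = abs(r_a - r_b)
        if gap >= 2:
            chosen, rejected = (t_a, t_b) if r_a > r_b else (t_b, t_a)
            pairs.append((chosen, rejected, GAP_MARGIN[gap]))
    return pairs

# Hypothetical ratings for four variants of one sentence.
rated = [("v1", 5), ("v2", 4), ("v3", 2), ("v4", 1)]
for chosen, rejected, margin in build_pairs(rated):
    print(chosen, ">", rejected, "margin", margin)
```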
## Use Cases

- Automatic translation quality scoring (RL rejection threshold: score < 3.0)
- Research on reward modeling for low-resource African languages

## Limitations

- Rating data comes from a limited pool of translators and may not capture all dialect preferences
- Correction-pair accuracy varies: 99.5% on rating pairs but only 61.1% on some correction categories
- The model evaluates translation quality, not fluency or cultural appropriateness separately
## Usage

```python
from peft import PeftModel
from transformers import AutoTokenizer

# `base` is the previously loaded CraneAILabs/ganda-gemma-1b model
model = PeftModel.from_pretrained(base, "CraneAILabs/luganda-reward-model")
tokenizer = AutoTokenizer.from_pretrained("CraneAILabs/luganda-reward-model")

# Score a translation
text = "<start_of_turn>user\nTranslate to Luganda: Hello\n<end_of_turn>\n<start_of_turn>model\nOli otya<end_of_turn>"
inputs = tokenizer(text, return_tensors="pt")
score = model(**inputs).logits.item()
# Apply tanh normalization: tanh(score / 4.0) * 5.0
```
## Citation

```bibtex
@misc{
  title={
  author={Bakunga, Bronson and Mubiru, Kato Steven and Tukamushaba, Catherine},
  year={2026},
  publisher={Crane AI Labs},
  url={https://huggingface.co/CraneAILabs/luganda-reward-model}
}
```
After:

- lug
- en
tags:
- luganda
- reward-model
- reward-modeling
- rlhf
- grpo
- dpo
- gemma
- gemma3
- translation-quality
- africa
base_model: CraneAILabs/ganda-gemma-1b
pipeline_tag: text-classification
library_name: transformers
---

# Luganda Translation Reward Model (merged)
A 1B-parameter Gemma 3 reward model that scores English→Luganda translation quality. It outputs a scalar reward: higher means a better translation.

This is the **merged, ready-to-use version** of [`CraneAILabs/luganda-reward-model`](https://huggingface.co/CraneAILabs/luganda-reward-model). The original repo was uploaded as a TRL `AutoModelForCausalLMWithValueHead` PEFT checkpoint, which required manual LoRA merging and value-head wiring before it could be used. **This repo bakes those fixups in**, so the model loads in one line.
| 28 |
|
| 29 |
+
## Quick start
|
| 30 |
|
| 31 |
+
```python
|
| 32 |
+
from transformers import AutoModelForSequenceClassification, AutoTokenizer
|
| 33 |
+
import torch
|
| 34 |
+
|
| 35 |
+
tok = AutoTokenizer.from_pretrained("CraneAILabs/luganda-reward-model-merged")
|
| 36 |
+
model = AutoModelForSequenceClassification.from_pretrained(
|
| 37 |
+
"CraneAILabs/luganda-reward-model-merged",
|
| 38 |
+
torch_dtype=torch.bfloat16,
|
| 39 |
+
device_map="auto",
|
| 40 |
+
)
|
| 41 |
+
model.eval()
|
| 42 |
+
|
| 43 |
+
def score(prompt: str, response: str) -> float:
|
| 44 |
+
"""Higher score = better Luganda translation."""
|
| 45 |
+
text = f"{prompt}\n\n{response}"
|
| 46 |
+
inputs = tok(text, return_tensors="pt", truncation=True, max_length=512).to(model.device)
|
| 47 |
+
with torch.no_grad():
|
| 48 |
+
out = model(**inputs)
|
| 49 |
+
return out.logits[0].item()
|
| 50 |
+
|
| 51 |
+
# Examples
|
| 52 |
+
print(score("Translate to Luganda: The children are playing.", "Abaana bazannya.")) # +8.0 ← good
|
| 53 |
+
print(score("Translate to Luganda: I love my mother.", "Njagala maama wange.")) # +5.5 ← good
|
| 54 |
+
print(score("Translate to Luganda: I love my mother.", "Mama love I.")) # +1.7 ← garbled
|
| 55 |
+
print(score("Translate to Luganda: I love my mother.", "Sssss xxxxx zzzzz.")) # +1.1 ← gibberish
|
| 56 |
+
```
## How it differs from the original repo

| | Original | Merged (this repo) |
|---|---|---|
| **Format** | TRL `AutoModelForCausalLMWithValueHead` + PEFT LoRA | `Gemma3ForSequenceClassification` |
| **Loading** | Manual PEFT load + LoRA merge + custom value-head wrapper | One line: `AutoModelForSequenceClassification.from_pretrained(...)` |
| **State dict prefix** | `base_model.base_model.model.model.layers.{N}.{module}.{base_layer\|lora_A.default\|lora_B.default}.weight` | Standard `model.layers.{N}.{module}.weight` |
| **Score head** | Loose `value_head.weight` tensor (shape `[1, 1152]`) | Wired in as `model.score` |
| **Precision** | float32 weights | bfloat16 weights (half the size, negligible quality loss at inference) |
| **File size** | 4.0 GB `pytorch_model.bin` | 2.0 GB `model.safetensors` |
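The state-dict rename in the table above can be illustrated with a small helper (hypothetical, pure string manipulation; a real merge also folds the LoRA deltas into the base weights rather than just renaming keys):

```python
from typing import Optional

LORA_MARKERS = ("lora_A.default", "lora_B.default")

def remap_key(key: str) -> Optional[str]:
    """Map a PEFT-wrapped state-dict key to its merged-model name.

    LoRA adapter tensors return None: during merging they are folded
    into the base weight instead of being copied under a new name.
    """
    if any(marker in key for marker in LORA_MARKERS):
        return None
    # Strip the PEFT wrapper prefix shown in the table above.
    prefix = "base_model.base_model.model."
    if key.startswith(prefix):
        key = key[len(prefix):]
    # A wrapped linear layer keeps its weight under ".base_layer".
    return key.replace(".base_layer.", ".")

print(remap_key("base_model.base_model.model.model.layers.0.self_attn.q_proj.base_layer.weight"))
# model.layers.0.self_attn.q_proj.weight
print(remap_key("base_model.base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.weight"))
# None
```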
## Score interpretation

Observed on a small held-out set:

| Reward range | Interpretation |
|---|---|
| **> 5.0** | Coherent, fluent Luganda translation |
| **2.0 – 5.0** | Luganda-shaped, but the meaning may be wrong or only partially correct |
| **< 2.0** | Garbled, gibberish, or grossly incorrect |

**Known weakness**: untranslated English text scores moderately high (~+6), because the training data never explicitly penalized untranslated input. Don't rely on this model alone to detect whether the LLM actually translated; pair it with a language detector.
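The bands above can be turned into a tiny helper for downstream filtering (an illustrative mapping, not part of the model):

```python
def interpret(reward: float) -> str:
    """Map a raw reward to the qualitative bands from the table above."""
    if reward > 5.0:
        return "coherent, fluent Luganda translation"
    if reward >= 2.0:
        return "Luganda-shaped but possibly wrong or partial"
    return "garbled, gibberish, or grossly incorrect"

print(interpret(8.0))  # coherent, fluent Luganda translation
print(interpret(1.1))  # garbled, gibberish, or grossly incorrect
```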
## Training details

| | |
|---|---|
| Base model | `CraneAILabs/ganda-gemma-1b` (Luganda CPT of `google/gemma-3-1b-it`) |
| Dataset | `CraneAILabs/pedagogy-luganda-reviewed` (299 reviewed translation rows → 1,490 rated examples) |
| Eval set | `Sunbird/salt` (200 examples × 5 quality levels via rule-based degradation) |
| Method | LoRA SFT regression (rank=32, α=64), then merged into the base model |
| Loss | Weighted MSE on 1-5 ratings |
| Hyperparameters | LR 2e-5, batch size 4 (effective 8), 5 epochs, 10% warmup |

For the full training writeup, including a v1 failure analysis, see [`TRAINING_REPORT.md`](https://huggingface.co/CraneAILabs/luganda-reward-model/blob/main/TRAINING_REPORT.md) in the original repo.
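The weighted MSE named in the table can be sketched in plain Python (illustrative only; the per-example weights here are assumptions, not the training configuration):

```python
def weighted_mse(preds, targets, weights):
    """Weighted mean squared error over (prediction, 1-5 rating) pairs.

    Each squared error is scaled by its weight; the result is normalized
    by the total weight so over-represented classes can be down-weighted.
    """
    assert len(preds) == len(targets) == len(weights)
    total = sum(w * (p - t) ** 2 for p, t, w in zip(preds, targets, weights))
    return total / sum(weights)

# Hypothetical predictions vs. ratings, with the second example weighted 2x.
print(weighted_mse([3.0, 5.0], [3.0, 4.0], [1.0, 2.0]))  # ≈ 0.667
```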
## Citation

```bibtex
@misc{craneailabs2026rewardmodel,
  title={Luganda Translation Reward Model},
  author={Bakunga, Bronson and Mubiru, Kato Steven and Tukamushaba, Catherine},
  year={2026},
  publisher={Crane AI Labs},
  url={https://huggingface.co/CraneAILabs/luganda-reward-model-merged}
}
```
## License

Apache 2.0. Built on Gemma 3 — see the [Gemma terms of use](https://ai.google.dev/gemma/terms).
config.json
CHANGED

```diff
@@ -1,19 +1,27 @@
 {
+  "_sliding_window_pattern": 6,
   "architectures": [
-    "
+    "Gemma3TextForSequenceClassification"
   ],
   "attention_bias": false,
   "attention_dropout": 0.0,
   "attn_logit_softcapping": null,
   "bos_token_id": 2,
   "cache_implementation": "hybrid",
+  "dtype": "bfloat16",
   "eos_token_id": 106,
   "final_logit_softcapping": null,
   "head_dim": 256,
   "hidden_activation": "gelu_pytorch_tanh",
   "hidden_size": 1152,
+  "id2label": {
+    "0": "LABEL_0"
+  },
   "initializer_range": 0.02,
   "intermediate_size": 6912,
+  "label2id": {
+    "LABEL_0": 0
+  },
   "layer_types": [
     "sliding_attention",
     "sliding_attention",
@@ -48,19 +56,26 @@
   "num_hidden_layers": 26,
   "num_key_value_heads": 1,
   "pad_token_id": 0,
+  "problem_type": "regression",
   "query_pre_attn_scalar": 256,
   "rms_norm_eps": 1e-06,
-  "
+  "rope_parameters": {
+    "full_attention": {
+      "rope_theta": 1000000,
+      "rope_type": "default"
+    },
+    "sliding_attention": {
+      "rope_theta": 10000,
+      "rope_type": "default"
+    }
+  },
   "sliding_window": 512,
   "sliding_window_pattern": 6,
-  "
-  "transformers_version": "
+  "tie_word_embeddings": true,
+  "transformers_version": "5.5.0",
   "unsloth_fixed": true,
   "unsloth_version": "2025.6.7",
+  "use_bidirectional_attention": false,
   "use_cache": true,
-  "vocab_size": 262144
-  "problem_type": "regression"
+  "vocab_size": 262144
 }
```
model.safetensors
ADDED

```diff
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e36807f690164ff6c7544f5880f95d4da7f87e2a4fc43861cf3b014eda6135fc
+size 1999813600
```
tokenizer.json
CHANGED

```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:f4708757955e49e5b23494815a523ffa5bdd0a7b67c09d16a093f6151245ec5b
+size 33384665
```
tokenizer_config.json
CHANGED

The diff for this file is too large to render. See the raw diff.