Ganda Gemma FLN Bridge

The best-performing model from the FabAI Foundational Literacy & Numeracy (FLN) project: a 1B-parameter bilingual (English/Luganda) model for generating pedagogical content aligned to Uganda's P1–P3 curriculum.

Model Description

  • Architecture: Gemma 3 1B (Gemma3ForCausalLM)
  • Method: Linear weight interpolation (70% Learner + 30% GRPO-600)
  • Base: CraneAILabs/ganda-gemma-1b (Luganda continual pre-training of google/gemma-3-1b-it)
  • No additional training: Bridge is a weight merge, not a fine-tuned model

Lineage

google/gemma-3-1b-it
  → Luganda CPT (1.33M tokens, 70/30 Luganda/English mix)
  → CraneAILabs/ganda-gemma-1b
    → Learner: SFT on 17,561 FLN items (MCQ + content generation)
    → GRPO-600: Reinforcement learning on tool-calling + translation
    → Bridge: 70% Learner + 30% GRPO-600 (linear merge)
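The final merge step is a straightforward linear interpolation: every parameter tensor in Bridge is 70% of the corresponding Learner tensor plus 30% of the GRPO-600 tensor. A minimal sketch of that operation, assuming the checkpoints share identical parameter names (the function name and toy weights below are illustrative, not the actual merge script):

```python
def linear_merge(sd_a, sd_b, alpha=0.7):
    """Linear weight interpolation: alpha * A + (1 - alpha) * B, key by key.

    In practice sd_a and sd_b would be model.state_dict() tensors loaded
    from the two checkpoints; plain floats stand in here for brevity.
    """
    if sd_a.keys() != sd_b.keys():
        raise ValueError("checkpoints must have identical parameter names")
    return {k: alpha * sd_a[k] + (1.0 - alpha) * sd_b[k] for k in sd_a}


# Toy weights standing in for the Learner and GRPO-600 checkpoints
learner = {"layer.weight": 1.0}
grpo_600 = {"layer.weight": 0.0}
bridge = linear_merge(learner, grpo_600, alpha=0.7)  # {"layer.weight": 0.7}
```

Because the merge is purely arithmetic, it requires no training data or GPU time, which is why Bridge carries no additional training beyond its two parents.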

Evaluation Results

Evaluated on clean benchmarks with verified zero data contamination:

| Metric | Bridge | Base (unmodified) | Gap closed |
|---|---|---|---|
| Pedagogical Content Knowledge (PCK) | 66% | 51% | 44% of gap to 12B |
| Luganda Linguistic Understanding (ELL MC) | 58.8% | 39% | 90% of gap to 12B |
| ELL overall | 31.0% | n/a | n/a |
| Luganda generation quality | Partial | n/a | Structured lesson plans |
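"Gap closed" is read here as the fraction of the distance from the unmodified base to the 12B reference that the 1B variant recovers. The 12B scores are not listed in this card, so the reference value below is purely illustrative:

```python
def gap_closed(base, variant, reference):
    """Fraction of the base-to-reference gap recovered by the variant."""
    return (variant - base) / (reference - base)


# PCK: base 51%, Bridge 66%, with a hypothetical 12B reference of 85%
# (the actual 12B score is not given in this card)
round(gap_closed(51, 66, 85), 2)  # 0.44
```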

Comparison to Other Variants

| Model | Method | PCK | ELL MC | Status |
|---|---|---|---|---|
| Bridge (v6-07) | 70/30 merge | 66% | 58.8% | Shipped |
| Reader (v5-sft) | SFT only | 66% | 51.0% | Tied PCK, lower ELL |
| Speaker (v6-fln-on-grpo) | SFT on RL base | 64% | 47.1% | Lower on both |
| Scholar (v7-fln) | SFT on Bridge | 71% | 41.2% | Rejected (ELL collapse) |

Known Limitations

  • Position bias: 52-point accuracy spread between best position (B: 93%) and worst (D: 41%). Retraining with position-balanced MCQ data is the clearest fix.
  • Short-form ELL: ~0% on 47 short-form items across all 1B variants. Requires targeted training data.
  • Arithmetic: Cannot reliably multiply two-digit numbers at 1B parameters.
  • Long-context Luganda: Coherence degrades beyond ~500 tokens of Luganda output.
  • Generation stability: requires repetition_penalty=1.2 for stable Luganda generation; without it, output loops.
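The position-balancing fix suggested above amounts to shuffling answer options during data construction so the gold answer is uniformly distributed across positions A–D, with the answer key updated to follow it. A minimal sketch, assuming distinct option strings (the field names `options` and `answer_idx` are hypothetical, not the project's actual schema):

```python
import random


def balance_positions(item, rng=random):
    """Shuffle MCQ options so the correct answer lands at a uniformly
    random position, updating the answer index to follow it.

    Assumes option strings are distinct; field names are illustrative.
    """
    options = list(item["options"])
    correct = options[item["answer_idx"]]
    rng.shuffle(options)
    return {**item, "options": options, "answer_idx": options.index(correct)}


# Toy item: the correct option starts at index 1
item = {
    "question": "Which word means 'school' in Luganda?",
    "options": ["omu", "ssomero", "taata", "ku"],
    "answer_idx": 1,
}
balanced = balance_positions(item, random.Random(0))
```

Applied to the whole SFT set, this removes any correlation between the gold answer and its letter position, so the model cannot learn the "B is usually right" shortcut that produces the 52-point spread.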

Intended Use

  • Generating structured bilingual lesson plans for Ugandan primary school teachers
  • Creating literacy assessments (MCQ, fill-in-blank) aligned to P1–P3 curriculum
  • Offline teacher assistant on mobile devices (~30 tokens/sec on phone hardware)
  • Research on low-resource language educational AI

How to Use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("CraneAILabs/ganda-gemma-fln-bridge")
tokenizer = AutoTokenizer.from_pretrained("CraneAILabs/ganda-gemma-fln-bridge")

# Gemma chat format: a user turn, then an open model turn to generate into
prompt = "<start_of_turn>user\nCreate a P2 phonics lesson on syllable segmentation in Luganda<end_of_turn>\n<start_of_turn>model\n"
inputs = tokenizer(prompt, return_tensors="pt")

# do_sample=True is needed for temperature to take effect;
# repetition_penalty=1.2 is required for stable Luganda output (see Known Limitations)
outputs = model.generate(
    **inputs,
    max_new_tokens=500,
    do_sample=True,
    temperature=0.7,
    repetition_penalty=1.2,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Citation

```bibtex
@misc{craneailabs2026bridge,
  title={Bridge: A 1B Bilingual Model for Ugandan Primary School Literacy Instruction},
  author={Bakunga, Bronson and Mubiru, Kato Steven and Tukamushaba, Catherine},
  year={2026},
  publisher={Crane AI Labs},
  url={https://huggingface.co/CraneAILabs/ganda-gemma-fln-bridge}
}
```

Acknowledgments

Supported by Fab Inc, funded by the Bill & Melinda Gates Foundation. Field research and Luganda linguistic validation conducted by Crane AI Labs.
