---
license: other
license_name: health-ai-developer-foundations
license_link: https://developers.google.com/health-ai-developer-foundations/terms
base_model: google/medgemma-27b-text-it
tags:
- medical
- healthcare
- maternal-health
- sexual-health
- reproductive-health
- multilingual
- african-languages
- akan
- amharic
- luganda
- swahili
- lora
- peft
- medgemma
language:
- en
- am
- sw
- lg
- ak
library_name: peft
pipeline_tag: text-generation
---

# MedGemma 27B - Maternal, Sexual & Reproductive Health Oracle for African Languages

A fine-tune of Google's MedGemma 27B Text model, built for the Zindi ITU Multilingual Health QA Challenge.

The model specializes in answering Maternal, Sexual, and Reproductive Health (MSRH) questions in:
- Akan (Twi/Fante from Ghana)
- Amharic (Ethiopia)
- Luganda (Uganda)
- Swahili (Kenya)
- English (Ethiopia, Ghana, Kenya, Uganda)

## Model Description

This repository contains a LoRA adapter for `google/medgemma-27b-text-it`, fine-tuned on 29,815 multilingual medical Q&A samples across 8 language-region pairs.

### Training Details

- Base model: google/medgemma-27b-text-it (27B params, medical text-only)
- Training method: QLoRA (4-bit quantization + LoRA)
- LoRA config: r=8, alpha=16, attention-only modules
- Trainable params: 16.7M (0.21% of total)
- Training data: 29,815 multilingual medical Q&A samples
- Optimizer: fused AdamW, lr=3e-5, linear warmup over the first 5% of training
- Hardware: NVIDIA A40 (48GB VRAM)
- Final eval_loss: 1.39
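
The trainable-parameter figure above can be sanity-checked by hand. The sketch below assumes that "attention-only modules" means the four projections `q_proj`, `k_proj`, `v_proj`, and `o_proj`, and it assumes the Gemma 3 27B shapes (hidden size 5376, 62 layers, 32 query heads, 16 key/value heads, head dimension 128). Neither assumption is stated in this card. Under them, a rank-8 adapter adds `r * (d_in + d_out)` weights per projection, which gives about 16.76M parameters, close to the 16.7M listed above.

```python
# Assumed Gemma 3 27B shapes (not stated in this card).
HIDDEN = 5376
LAYERS = 62
HEAD_DIM = 128
Q_HEADS = 32
KV_HEADS = 16
RANK = 8  # LoRA r from the training details above

# (d_in, d_out) for each attention projection assumed to carry an adapter.
projections = {
    "q_proj": (HIDDEN, Q_HEADS * HEAD_DIM),
    "k_proj": (HIDDEN, KV_HEADS * HEAD_DIM),
    "v_proj": (HIDDEN, KV_HEADS * HEAD_DIM),
    "o_proj": (Q_HEADS * HEAD_DIM, HIDDEN),
}


def lora_params(d_in, d_out, rank):
    # LoRA adds A (rank x d_in) and B (d_out x rank) per adapted matrix.
    return rank * (d_in + d_out)


per_layer = sum(lora_params(d_in, d_out, RANK) for d_in, d_out in projections.values())
total_trainable = per_layer * LAYERS
print(f"{total_trainable:,} trainable LoRA parameters")
```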

### Loss Trajectory

| Step | eval_loss |
|------|-----------|
| 600  | 1.69 |
| 900  | 1.58 |
| 1200 | 1.50 |
| 1500 | 1.45 |
| 1800 | 1.42 |
| 1864 | 1.39 (best) |
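
If `eval_loss` is the mean per-token cross-entropy in nats (a common convention, assumed here rather than stated in this card), each value in the table converts to a perplexity via `exp(loss)`. A minimal sketch:

```python
import math

# eval_loss values copied from the table above, keyed by training step.
eval_loss = {600: 1.69, 900: 1.58, 1200: 1.50, 1500: 1.45, 1800: 1.42, 1864: 1.39}

# Perplexity is exp(mean cross-entropy), assuming the loss is in nats per token.
perplexity = {step: math.exp(loss) for step, loss in eval_loss.items()}

# The step with the lowest eval_loss.
best_step = min(eval_loss, key=eval_loss.get)

for step, ppl in perplexity.items():
    print(f"step {step:>4}: eval_loss={eval_loss[step]:.2f}  perplexity={ppl:.2f}")
print("best step:", best_step)
```

Under that assumption, the final checkpoint sits at a perplexity of roughly 4.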

## Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "google/medgemma-27b-text-it",
    device_map="auto",
    torch_dtype=torch.bfloat16,
    attn_implementation="eager",
    quantization_config=quantization_config,
)

model = PeftModel.from_pretrained(base_model, "KYAGABA/medgemma-27b-msrh-african-oracle")
model.eval()

tokenizer = AutoTokenizer.from_pretrained("KYAGABA/medgemma-27b-msrh-african-oracle")

# Example
question = "How can young people access reproductive health services?"
language = "English"

prompt_text = f"Answer this question in {language} about maternal, sexual, and reproductive health: {question}"
messages = [{"role": "user", "content": prompt_text}]
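# The chat template already inserts the BOS token, so skip the tokenizer's own
# special tokens to avoid a duplicated BOS at the start of the prompt.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors='pt', add_special_tokens=False).to(model.device)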

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=400,
        do_sample=False,
        num_beams=3,
        repetition_penalty=1.1,
    )

response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(response)
```
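
The prompt format in the example above can be wrapped in a small helper for use across the five supported languages. The names `SUPPORTED_LANGUAGES` and `build_messages` are hypothetical and not part of this repository; only the prompt wording is taken from the example.

```python
# Languages listed at the top of this card.
SUPPORTED_LANGUAGES = ("Akan", "Amharic", "Luganda", "Swahili", "English")


def build_messages(question, language="English"):
    """Return a chat-format message list using the card's prompt wording."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language: {language!r}")
    prompt_text = (
        f"Answer this question in {language} about maternal, sexual, "
        f"and reproductive health: {question.strip()}"
    )
    return [{"role": "user", "content": prompt_text}]


messages = build_messages(
    "How can young people access reproductive health services?", "Swahili"
)
print(messages[0]["content"])
```

The returned list can be passed to `tokenizer.apply_chat_template` in the same way as `messages` in the example.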

## Dataset

Trained on the Zindi ITU Multilingual Health QA Challenge dataset:

| Subset | Samples | Language | Region |
|--------|---------|----------|--------|
| Eng_Uga | 7,624 | English | Uganda |
| Aka_Gha | 4,455 | Akan | Ghana |
| Eng_Gha | 4,443 | English | Ghana |
| Eng_Eth | 3,915 | English | Ethiopia |
| Lug_Uga | 3,383 | Luganda | Uganda |
| Eng_Ken | 2,080 | English | Kenya |
| Swa_Ken | 2,070 | Swahili | Kenya |
| Amh_Eth | 1,845 | Amharic | Ethiopia |
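
The per-language balance is easier to see once the subsets are aggregated. The short sketch below uses only the counts from the table above and confirms that they sum to the 29,815 samples quoted earlier.

```python
from collections import defaultdict

# (subset, samples, language) rows copied from the table above.
subsets = [
    ("Eng_Uga", 7624, "English"),
    ("Aka_Gha", 4455, "Akan"),
    ("Eng_Gha", 4443, "English"),
    ("Eng_Eth", 3915, "English"),
    ("Lug_Uga", 3383, "Luganda"),
    ("Eng_Ken", 2080, "English"),
    ("Swa_Ken", 2070, "Swahili"),
    ("Amh_Eth", 1845, "Amharic"),
]

total = sum(n for _, n, _ in subsets)

# Aggregate sample counts per language across regions.
by_language = defaultdict(int)
for _, n, language in subsets:
    by_language[language] += n

# Print languages from largest to smallest share.
for language, n in sorted(by_language.items(), key=lambda kv: -kv[1]):
    print(f"{language:<8} {n:>6,}  {100 * n / total:5.1f}%")
```

English accounts for about 61% of the samples (18,062), and Amharic is the smallest language at 1,845.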

## Intended Use

Intended for research and educational use in support of healthcare information access in African languages. NOT for direct clinical use. Always consult qualified healthcare professionals.

## Limitations

- May add an English preamble at the start of responses
- Lower quality for Akan than for English, which has more training data (4,455 Akan samples versus 18,062 English)
- Trained for only ~1.13 epochs because of compute constraints
- Works best on MSRH topics

## Citation

```
@misc{medgemma27b-msrh-africa,
  author = {KYAGABA, Arul},
  title = {MedGemma 27B - MSRH African Oracle},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {https://huggingface.co/KYAGABA/medgemma-27b-msrh-african-oracle}
}
```

## Acknowledgements

- Google for the MedGemma 27B base model
- Zindi and ITU for the Multilingual Health QA Challenge
- The AfriMed-QA community for advancing African medical AI