---
license: other
license_name: health-ai-developer-foundations
license_link: https://developers.google.com/health-ai-developer-foundations/terms
base_model: google/medgemma-27b-text-it
tags:
- medical
- healthcare
- maternal-health
- sexual-health
- reproductive-health
- multilingual
- african-languages
- akan
- amharic
- luganda
- swahili
- lora
- peft
- medgemma
language:
- en
- am
- sw
- lg
- ak
library_name: peft
pipeline_tag: text-generation
---
# MedGemma 27B - Maternal, Sexual & Reproductive Health Oracle for African Languages
A fine-tune of Google's MedGemma 27B Text model for the Zindi ITU Multilingual Health QA Challenge.
It specializes in answering Maternal, Sexual, and Reproductive Health (MSRH) questions in:
- Akan (Twi/Fante from Ghana)
- Amharic (Ethiopia)
- Luganda (Uganda)
- Swahili (Kenya)
- English (Ethiopia, Ghana, Kenya, Uganda)
## Model Description
LoRA adapter for google/medgemma-27b-text-it, fine-tuned on 29,815 multilingual medical Q&A samples across 8 language-region pairs.
### Training Details
- Base model: google/medgemma-27b-text-it (27B parameters, text-only medical model)
- Training method: QLoRA (4-bit quantization + LoRA)
- LoRA config: r=8, alpha=16, attention modules only
- Trainable params: 16.7M (0.21% of total)
- Training data: 29,815 multilingual medical Q&A samples
- Optimizer: fused AdamW, lr=3e-5, 5% linear warmup
- Hardware: NVIDIA A40 (48GB VRAM)
- Final eval_loss: 1.39
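As a sanity check on the trainable-parameter figure, the sketch below counts the LoRA weights for rank 8 applied to the four attention projections. The layer geometry (hidden size 5376, 62 layers, 32 query heads, 16 key/value heads, head dimension 128) is an assumption based on the published Gemma 3 27B configuration, and reading "attention-only" as `q_proj`, `k_proj`, `v_proj`, and `o_proj` is also an assumption; neither is stated in this card.

```python
# Assumed Gemma 3 27B attention geometry (not stated in this card).
HIDDEN_SIZE = 5376
NUM_LAYERS = 62
NUM_Q_HEADS = 32
NUM_KV_HEADS = 16
HEAD_DIM = 128
LORA_RANK = 8  # r=8, from the training details above


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """A LoRA pair A (rank x in) and B (out x rank) adds rank * (in + out) weights."""
    return rank * (in_features + out_features)


q_dim = NUM_Q_HEADS * HEAD_DIM    # width of the query projection
kv_dim = NUM_KV_HEADS * HEAD_DIM  # width of the key and value projections

per_layer = (
    lora_params(HIDDEN_SIZE, q_dim, LORA_RANK)     # q_proj (assumed target)
    + lora_params(HIDDEN_SIZE, kv_dim, LORA_RANK)  # k_proj (assumed target)
    + lora_params(HIDDEN_SIZE, kv_dim, LORA_RANK)  # v_proj (assumed target)
    + lora_params(q_dim, HIDDEN_SIZE, LORA_RANK)   # o_proj (assumed target)
)
total_trainable = per_layer * NUM_LAYERS
print(f"{total_trainable:,} trainable parameters")
```

Under these assumptions the count comes to roughly 16.8M, in line with the 16.7M reported above. The percentage of the total depends on how the denominator is counted (parameter counts on a 4-bit quantized model differ from the nominal 27B), so the sketch does not try to reproduce it.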
### Loss Trajectory
| Step | eval_loss |
|------|-----------|
| 600 | 1.69 |
| 900 | 1.58 |
| 1200 | 1.50 |
| 1500 | 1.45 |
| 1800 | 1.42 |
| 1864 | 1.39 (best) |
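
For readers who prefer perplexity, the table can be converted under the assumption that `eval_loss` is the mean per-token cross-entropy in nats. That is the usual convention for causal language model training, but this card does not confirm it.

```python
import math

# eval_loss values copied from the table above, keyed by training step.
eval_loss = {600: 1.69, 900: 1.58, 1200: 1.50, 1500: 1.45, 1800: 1.42, 1864: 1.39}

# Perplexity = exp(mean cross-entropy), assuming the loss is measured in nats.
perplexity = {step: math.exp(loss) for step, loss in eval_loss.items()}

for step, ppl in perplexity.items():
    print(f"step {step:>4}: eval_loss={eval_loss[step]:.2f}  perplexity={ppl:.2f}")
```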
## Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# 4-bit NF4 quantization with double quantization, computing in bfloat16
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the quantized base model, then attach the LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(
    "google/medgemma-27b-text-it",
    device_map="auto",
    torch_dtype=torch.bfloat16,
    attn_implementation="eager",
    quantization_config=quantization_config,
)
model = PeftModel.from_pretrained(base_model, "KYAGABA/medgemma-27b-msrh-african-oracle")
model.eval()
tokenizer = AutoTokenizer.from_pretrained("KYAGABA/medgemma-27b-msrh-african-oracle")

# Example
question = "How can young people access reproductive health services?"
language = "English"
prompt_text = f"Answer this question in {language} about maternal, sexual, and reproductive health: {question}"
messages = [{"role": "user", "content": prompt_text}]

# Render the chat template to text, then tokenize and move to the model's device
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)

# Deterministic beam search decoding
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=400,
        do_sample=False,
        num_beams=3,
        repetition_penalty=1.1,
    )

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(response)
```
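
The prompt wording in the example can be wrapped in a small helper so that the same template is reused across languages. `build_messages` and `SUPPORTED_LANGUAGES` are hypothetical names introduced for illustration. The language list mirrors the languages named in this card; whether the model expects exactly these language names in the prompt is an assumption based on the single English example.

```python
# Languages named in this card; the exact strings used during training are an assumption.
SUPPORTED_LANGUAGES = ("Akan", "Amharic", "Luganda", "Swahili", "English")


def build_messages(question: str, language: str = "English") -> list[dict[str, str]]:
    """Return chat messages using the prompt wording from the usage example."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language: {language!r}")
    prompt_text = (
        f"Answer this question in {language} about maternal, sexual, "
        f"and reproductive health: {question}"
    )
    return [{"role": "user", "content": prompt_text}]


messages = build_messages(
    "How can young people access reproductive health services?", "Swahili"
)
print(messages[0]["content"])
```

The returned list can be passed to `tokenizer.apply_chat_template` in the same way as `messages` in the usage example.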
## Dataset
Trained on the Zindi ITU Multilingual Health QA Challenge dataset:
| Subset | Samples | Language | Region |
|--------|---------|----------|--------|
| Eng_Uga | 7,624 | English | Uganda |
| Aka_Gha | 4,455 | Akan | Ghana |
| Eng_Gha | 4,443 | English | Ghana |
| Eng_Eth | 3,915 | English | Ethiopia |
| Lug_Uga | 3,383 | Luganda | Uganda |
| Eng_Ken | 2,080 | English | Kenya |
| Swa_Ken | 2,070 | Swahili | Kenya |
| Amh_Eth | 1,845 | Amharic | Ethiopia |
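
The per-language balance implied by the table can be checked with a short aggregation. The rows below are copied from the table; `SUBSETS` and `by_language` are illustrative names, not part of the dataset's own tooling.

```python
from collections import defaultdict

# (subset, samples, language, region) rows copied from the table above.
SUBSETS = [
    ("Eng_Uga", 7624, "English", "Uganda"),
    ("Aka_Gha", 4455, "Akan", "Ghana"),
    ("Eng_Gha", 4443, "English", "Ghana"),
    ("Eng_Eth", 3915, "English", "Ethiopia"),
    ("Lug_Uga", 3383, "Luganda", "Uganda"),
    ("Eng_Ken", 2080, "English", "Kenya"),
    ("Swa_Ken", 2070, "Swahili", "Kenya"),
    ("Amh_Eth", 1845, "Amharic", "Ethiopia"),
]

total = sum(samples for _, samples, _, _ in SUBSETS)

# Sum the sample counts per language across regions.
by_language = defaultdict(int)
for _, samples, language, _ in SUBSETS:
    by_language[language] += samples

# Print languages from largest to smallest share of the data.
for language, samples in sorted(by_language.items(), key=lambda kv: -kv[1]):
    print(f"{language:<8} {samples:>6,} ({samples / total:.1%})")
print(f"{'Total':<8} {total:>6,}")
```

The rows sum to the 29,815 samples quoted above, with English making up about 61% of them and Amharic the smallest share.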
## Intended Use
Intended for research and educational use in support of access to healthcare information in African languages. NOT for direct clinical use. Always consult qualified healthcare professionals.
## Limitations
- May add an English preamble at the start of responses
- Lower quality for Akan than for English (less training data)
- Trained for only ~1.13 epochs (compute constraints)
- Performs best on MSRH topics
## Citation
```
@misc{medgemma27b-msrh-africa,
author = {KYAGABA, Arul},
title = {MedGemma 27B - MSRH African Oracle},
year = {2026},
publisher = {Hugging Face},
howpublished = {https://huggingface.co/KYAGABA/medgemma-27b-msrh-african-oracle}
}
```
## Acknowledgements
- Google for the MedGemma 27B base model
- Zindi and ITU for the multilingual health QA challenge
- AfriMed-QA community for advancing African medical AI