---
license: other
license_name: health-ai-developer-foundations
license_link: https://developers.google.com/health-ai-developer-foundations/terms
base_model: google/medgemma-27b-text-it
tags:
- medical
- healthcare
- maternal-health
- sexual-health
- reproductive-health
- multilingual
- african-languages
- akan
- amharic
- luganda
- swahili
- lora
- peft
- medgemma
language:
- en
- am
- sw
- lg
- ak
library_name: peft
pipeline_tag: text-generation
---

# MedGemma 27B - Maternal, Sexual & Reproductive Health Oracle for African Languages

Fine-tuned Google MedGemma 27B Text for the Zindi ITU Multilingual Health QA Challenge. Specialized in answering Maternal, Sexual, and Reproductive Health (MSRH) questions in:

- Akan (Twi/Fante from Ghana)
- Amharic (Ethiopia)
- Luganda (Uganda)
- Swahili (Kenya)
- English (Ethiopia, Ghana, Kenya, Uganda)

## Model Description

LoRA adapter for google/medgemma-27b-text-it, fine-tuned on 29,815 multilingual medical Q&A samples across 8 language-region pairs.

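
To give a sense of how small a LoRA adapter is relative to its base model, the sketch below counts the trainable parameters of rank-8 adapters (the rank listed under Training Details) on the attention projections only. The layer dimensions, layer count, and projection names (`q_proj`, `k_proj`, `v_proj`, `o_proj`) are assumptions about the base architecture and are not stated in this card; treat the result as a back-of-the-envelope estimate, not a description of the exact training configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Projection:
    """One linear layer that receives a LoRA adapter."""
    name: str
    d_in: int
    d_out: int


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """LoRA adds two matrices: A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


# ASSUMED architecture values (not taken from this model card).
HIDDEN = 5376        # assumed hidden size
Q_DIM = 32 * 128     # assumed 32 query heads x head_dim 128
KV_DIM = 16 * 128    # assumed 16 key/value heads x head_dim 128
NUM_LAYERS = 62      # assumed number of transformer layers
RANK = 8             # r=8, as listed under Training Details

# ASSUMED attention-only target modules.
ATTENTION = [
    Projection("q_proj", HIDDEN, Q_DIM),
    Projection("k_proj", HIDDEN, KV_DIM),
    Projection("v_proj", HIDDEN, KV_DIM),
    Projection("o_proj", Q_DIM, HIDDEN),
]

per_layer = sum(lora_params(p.d_in, p.d_out, RANK) for p in ATTENTION)
total = per_layer * NUM_LAYERS
print(f"{total:,} trainable parameters ({total / 1e6:.2f}M)")
```

Under these assumed dimensions the estimate comes to about 16.8M parameters, which is close to the 16.7M trainable parameters reported for this adapter.
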
### Training Details

- Base model: google/medgemma-27b-text-it (27B params, medical text-only)
- Training method: QLoRA (4-bit quantization + LoRA)
- LoRA config: r=8, alpha=16, attention-only modules
- Trainable params: 16.7M (0.21% of total)
- Training data: 29,815 multilingual medical Q&A samples
- Optimizer: AdamW fused, lr=3e-5, linear warmup 5%
- Hardware: NVIDIA A40 (48GB VRAM)
- Final eval_loss: 1.39

### Loss Trajectory

| Step | eval_loss |
|------|-----------|
| 600 | 1.69 |
| 900 | 1.58 |
| 1200 | 1.50 |
| 1500 | 1.45 |
| 1800 | 1.42 |
| 1864 | 1.39 (best) |

## Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# Load the base model in 4-bit NF4 so it fits on a single large GPU.
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "google/medgemma-27b-text-it",
    device_map="auto",
    torch_dtype=torch.bfloat16,
    attn_implementation="eager",
    quantization_config=quantization_config,
)

# Attach the LoRA adapter to the quantized base model.
model = PeftModel.from_pretrained(base_model, "KYAGABA/medgemma-27b-msrh-african-oracle")
model.eval()

tokenizer = AutoTokenizer.from_pretrained("KYAGABA/medgemma-27b-msrh-african-oracle")

# Example
question = "How can young people access reproductive health services?"
language = "English"

prompt_text = f"Answer this question in {language} about maternal, sexual, and reproductive health: {question}"
messages = [{"role": "user", "content": prompt_text}]

# Render the chat template to text, then tokenize it.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)

# Deterministic beam search (no sampling).
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=400,
        do_sample=False,
        num_beams=3,
        repetition_penalty=1.1,
    )

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(response)
```

## Dataset

Trained on the Zindi ITU Multilingual Health QA Challenge dataset:

| Subset | Samples | Language | Region |
|--------|---------|----------|--------|
| Eng_Uga | 7,624 | English | Uganda |
| Aka_Gha | 4,455 | Akan | Ghana |
| Eng_Gha | 4,443 | English | Ghana |
| Eng_Eth | 3,915 | English | Ethiopia |
| Lug_Uga | 3,383 | Luganda | Uganda |
| Eng_Ken | 2,080 | English | Kenya |
| Swa_Ken | 2,070 | Swahili | Kenya |
| Amh_Eth | 1,845 | Amharic | Ethiopia |

## Intended Use

For research and educational purposes to support healthcare information access in African languages. NOT for direct clinical use. Always consult qualified healthcare professionals.

## Limitations

- May add an English preamble at the start of responses
- Lower quality for Akan compared to English (less training data)
- Trained for only ~1.13 epochs (compute constraints)
- Best for MSRH topics

## Citation

```
@misc{medgemma27b-msrh-africa,
  author = {KYAGABA, Arul},
  title = {MedGemma 27B - MSRH African Oracle},
  year = {2026},
  publisher = {HuggingFace},
  howpublished = {https://huggingface.co/KYAGABA/medgemma-27b-msrh-african-oracle}
}
```

## Acknowledgements

- Google for MedGemma 27B base model
- Zindi and ITU for the multilingual health QA challenge
- AfriMed-QA community for advancing African medical AI
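
## Appendix: Prompt Helper Sketch

When querying several language-region subsets, the prompt format from the Usage section can be wrapped in a small helper. This is a minimal sketch, assuming the same prompt template as the Usage example applies to every subset; the helper names `build_prompt` and `build_messages` are hypothetical and not part of any library. The subset codes and language names come from the Dataset table.

```python
# Subset code -> language name, copied from the Dataset table in this card.
SUBSET_LANGUAGE = {
    "Eng_Uga": "English",
    "Aka_Gha": "Akan",
    "Eng_Gha": "English",
    "Eng_Eth": "English",
    "Lug_Uga": "Luganda",
    "Eng_Ken": "English",
    "Swa_Ken": "Swahili",
    "Amh_Eth": "Amharic",
}


def build_prompt(question: str, subset: str) -> str:
    """Format a question with the template shown in the Usage example."""
    if subset not in SUBSET_LANGUAGE:
        raise ValueError(f"Unknown subset: {subset!r}")
    language = SUBSET_LANGUAGE[subset]
    return (
        f"Answer this question in {language} about maternal, sexual, "
        f"and reproductive health: {question.strip()}"
    )


def build_messages(question: str, subset: str) -> list:
    """Wrap the prompt as a single-turn chat for apply_chat_template."""
    return [{"role": "user", "content": build_prompt(question, subset)}]


if __name__ == "__main__":
    # Hypothetical example question.
    print(build_messages("What are the danger signs in pregnancy?", "Swa_Ken"))
```

The returned list can be passed to `tokenizer.apply_chat_template` in place of the hand-built `messages` in the Usage example.
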