Upload best checkpoint (eval_loss=1.39)

Files changed (10)

.gitattributes +1 -0
README.md +159 -0
adapter_config.json +34 -0
adapter_model.safetensors +3 -0
added_tokens.json +3 -0
chat_template.jinja +47 -0
special_tokens_map.json +33 -0
tokenizer.json +3 -0
tokenizer.model +3 -0
tokenizer_config.json +0 -0

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text

README.md ADDED

	@@ -0,0 +1,159 @@

+---
+license: other
+license_name: health-ai-developer-foundations
+license_link: https://developers.google.com/health-ai-developer-foundations/terms
+base_model: google/medgemma-27b-text-it
+tags:
+- medical
+- healthcare
+- maternal-health
+- sexual-health
+- reproductive-health
+- multilingual
+- african-languages
+- akan
+- amharic
+- luganda
+- swahili
+- lora
+- peft
+- medgemma
+language:
+- en
+- am
+- sw
+- lg
+- ak
+library_name: peft
+pipeline_tag: text-generation
+---
+# MedGemma 27B - Maternal, Sexual & Reproductive Health Oracle for African Languages
+Fine-tuned Google MedGemma 27B Text for the Zindi ITU Multilingual Health QA Challenge.
+Specialized in answering Maternal, Sexual, and Reproductive Health (MSRH) questions in:
+- Akan (Twi/Fante from Ghana)
+- Amharic (Ethiopia)
+- Luganda (Uganda)
+- Swahili (Kenya)
+- English (Ethiopia, Ghana, Kenya, Uganda)
+## Model Description
+LoRA adapter for google/medgemma-27b-text-it, fine-tuned on 29,815 multilingual medical Q&A samples across 8 language-region pairs.
+### Training Details
+- Base model: google/medgemma-27b-text-it (27B params, medical text-only)
+- Training method: QLoRA (4-bit quantization + LoRA)
+- LoRA config: r=8, alpha=16, dropout=0.05, attention projections only (q_proj, k_proj, v_proj, o_proj)
+- Trainable params: 16.7M (0.21% of total)
+- Training data: 29,815 multilingual medical Q&A samples
+- Optimizer: AdamW fused, lr=3e-5, linear warmup 5%
+- Hardware: NVIDIA A40 (48GB VRAM)
+- Final eval_loss: 1.39
+### Loss Trajectory
+| Step | eval_loss |
+|------|-----------|
+| 600  | 1.69 |
+| 900  | 1.58 |
+| 1200 | 1.50 |
+| 1500 | 1.45 |
+| 1800 | 1.42 |
+| 1864 | 1.39 (best) |
+## Usage
+```python
+import torch
+from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
+from peft import PeftModel
+quantization_config = BitsAndBytesConfig(
+    load_in_4bit=True,
+    bnb_4bit_use_double_quant=True,
+    bnb_4bit_quant_type="nf4",
+    bnb_4bit_compute_dtype=torch.bfloat16,
+)
+base_model = AutoModelForCausalLM.from_pretrained(
+    "google/medgemma-27b-text-it",
+    device_map="auto",
+    torch_dtype=torch.bfloat16,
+    attn_implementation="eager",
+    quantization_config=quantization_config,
+)
+model = PeftModel.from_pretrained(base_model, "KYAGABA/medgemma-27b-msrh-african-oracle")
+model.eval()
+tokenizer = AutoTokenizer.from_pretrained("KYAGABA/medgemma-27b-msrh-african-oracle")
+# Example: build a single-turn prompt with the chat template
+question = "How can young people access reproductive health services?"
+language = "English"
+prompt_text = f"Answer this question in {language} about maternal, sexual, and reproductive health: {question}"
+messages = [{"role": "user", "content": prompt_text}]
+prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+# The chat template already emits <bos>, so skip the tokenizer's own special tokens
+inputs = tokenizer(prompt, return_tensors='pt', add_special_tokens=False).to(model.device)
+with torch.no_grad():
+    outputs = model.generate(
+        **inputs,
+        max_new_tokens=400,
+        do_sample=False,
+        num_beams=3,
+        repetition_penalty=1.1,
+    )
+response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
+print(response)
+```
+## Dataset
+Trained on the Zindi ITU Multilingual Health QA Challenge dataset:
+| Subset | Samples | Language | Region |
+|--------|---------|----------|--------|
+| Eng_Uga | 7,624 | English | Uganda |
+| Aka_Gha | 4,455 | Akan | Ghana |
+| Eng_Gha | 4,443 | English | Ghana |
+| Eng_Eth | 3,915 | English | Ethiopia |
+| Lug_Uga | 3,383 | Luganda | Uganda |
+| Eng_Ken | 2,080 | English | Kenya |
+| Swa_Ken | 2,070 | Swahili | Kenya |
+| Amh_Eth | 1,845 | Amharic | Ethiopia |
+## Intended Use
+For research and educational purposes to support healthcare information access in African languages. NOT for direct clinical use. Always consult qualified healthcare professionals.
+## Limitations
+- May add an English preamble at the start of responses
+- Lower quality for Akan than for English (4,455 Akan samples vs. 18,062 English)
+- Trained for only ~1.13 epochs (compute constraints)
+- Best suited to MSRH topics
+## Citation
+```
+@misc{medgemma27b-msrh-africa,
+  author = {KYAGABA, Arul},
+  title = {MedGemma 27B - MSRH African Oracle},
+  year = {2026},
+  publisher = {Hugging Face},
+  howpublished = {https://huggingface.co/KYAGABA/medgemma-27b-msrh-african-oracle}
+}
+```
+## Acknowledgements
+- Google for MedGemma 27B base model
+- Zindi and ITU for the multilingual health QA challenge
+- AfriMed-QA community for advancing African medical AI
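
Note on README.md: the Dataset table lists samples per subset, while the Limitations section reasons about data per language. The sketch below aggregates the table by language so the two can be compared. The counts are copied from the table; the helper name `samples_per_language` is illustrative and not part of the repository.

```python
from collections import defaultdict

# Subset sizes copied from the README's Dataset table: (subset, samples, language).
SUBSETS = [
    ("Eng_Uga", 7624, "English"),
    ("Aka_Gha", 4455, "Akan"),
    ("Eng_Gha", 4443, "English"),
    ("Eng_Eth", 3915, "English"),
    ("Lug_Uga", 3383, "Luganda"),
    ("Eng_Ken", 2080, "English"),
    ("Swa_Ken", 2070, "Swahili"),
    ("Amh_Eth", 1845, "Amharic"),
]


def samples_per_language(subsets):
    """Sum the sample counts of all subsets that share a language."""
    totals = defaultdict(int)
    for _name, count, language in subsets:
        totals[language] += count
    return dict(totals)


per_language = samples_per_language(SUBSETS)
total = sum(per_language.values())

# Print languages from largest to smallest with their share of the data.
for language, count in sorted(per_language.items(), key=lambda kv: -kv[1]):
    print(f"{language:8s} {count:6d} {count / total:6.1%}")
print(f"{'Total':8s} {total:6d}")
```

The subset counts sum to the 29,815 samples quoted in the model description, and English accounts for 18,062 of them. Amharic (1,845) is the smallest language slice, smaller than Akan (4,455).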

adapter_config.json ADDED

	@@ -0,0 +1,34 @@

+{
+  "alpha_pattern": {},
+  "auto_mapping": null,
+  "base_model_name_or_path": "google/medgemma-27b-text-it",
+  "bias": "none",
+  "eva_config": null,
+  "exclude_modules": null,
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 16,
+  "lora_bias": false,
+  "lora_dropout": 0.05,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "r": 8,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "o_proj",
+    "k_proj",
+    "q_proj",
+    "v_proj"
+  ],
+  "task_type": "CAUSAL_LM",
+  "use_dora": false,
+  "use_rslora": false
+}
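
Note on adapter_config.json: with `r=8` and the four attention projections as `target_modules`, the trainable parameter count can be estimated by hand. The sketch below does that arithmetic. The model dimensions are assumptions about the Gemma 3 27B architecture and do not appear anywhere in this commit, so check them against the base model's `config.json`; `lora_param_count` is an illustrative helper, not a PEFT function.

```python
# ASSUMED Gemma 3 27B attention dimensions (not stated in this commit;
# verify against the base model's config.json before relying on them).
HIDDEN_SIZE = 5376
NUM_LAYERS = 62
NUM_Q_HEADS = 32
NUM_KV_HEADS = 16
HEAD_DIM = 128

LORA_R = 8  # "r" from adapter_config.json


def lora_param_count(r, shapes, num_layers):
    """Count LoRA weights: each adapted Linear(d_in, d_out) gains
    A with r * d_in entries and B with d_out * r entries."""
    per_layer = sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * num_layers


q_dim = NUM_Q_HEADS * HEAD_DIM    # query projection width
kv_dim = NUM_KV_HEADS * HEAD_DIM  # key/value projection width

# (d_in, d_out) for each entry in "target_modules".
shapes = {
    "q_proj": (HIDDEN_SIZE, q_dim),
    "k_proj": (HIDDEN_SIZE, kv_dim),
    "v_proj": (HIDDEN_SIZE, kv_dim),
    "o_proj": (q_dim, HIDDEN_SIZE),
}

trainable = lora_param_count(LORA_R, shapes, NUM_LAYERS)
print(f"{trainable:,} trainable parameters")
print(f"{trainable * 4:,} bytes if stored as float32")
```

Under these assumptions the count is 16,760,832, which matches the README's 16.7M figure. At 4 bytes per value that is 67,043,328 bytes, slightly below the 67,109,592-byte size recorded for adapter_model.safetensors; the gap would be consistent with float32 storage plus the file header, though the commit does not state the dtype.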

adapter_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3ddaecc0ef0e76dcea59a8d581f6cb808114e9b2c956390c0468e500a2f83c36
+size 67109592
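
Note on the Git LFS pointer: the three lines above are a pointer, not the weights; they record the SHA-256 digest and byte size of the real file. A minimal sketch of checking a downloaded file against such a pointer follows. The function names and the local path in the usage note are hypothetical.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse 'key value' lines of a Git LFS pointer into a small dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and digest."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Hash in chunks so large weight files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer contents copied from this diff.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:3ddaecc0ef0e76dcea59a8d581f6cb808114e9b2c956390c0468e500a2f83c36
size 67109592
"""

pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

With the adapter downloaded to a local path of your choosing, `matches_pointer("adapter_model.safetensors", pointer)` would report whether the download is intact.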

added_tokens.json ADDED

	@@ -0,0 +1,3 @@

+{
+  "<image_soft_token>": 262144
+}

chat_template.jinja ADDED

	@@ -0,0 +1,47 @@

+{{ bos_token }}
+{%- if messages[0]['role'] == 'system' -%}
+    {%- if messages[0]['content'] is string -%}
+        {%- set first_user_prefix = messages[0]['content'] + '
+' -%}
+    {%- else -%}
+        {%- set first_user_prefix = messages[0]['content'][0]['text'] + '
+' -%}
+    {%- endif -%}
+    {%- set loop_messages = messages[1:] -%}
+{%- else -%}
+    {%- set first_user_prefix = "" -%}
+    {%- set loop_messages = messages -%}
+{%- endif -%}
+{%- for message in loop_messages -%}
+    {%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) -%}
+        {{ raise_exception("Conversation roles must alternate user/assistant/user/assistant/...") }}
+    {%- endif -%}
+    {%- if (message['role'] == 'assistant') -%}
+        {%- set role = "model" -%}
+    {%- else -%}
+        {%- set role = message['role'] -%}
+    {%- endif -%}
+    {{ '<start_of_turn>' + role + '
+' + (first_user_prefix if loop.first else "") }}
+    {%- if message['content'] is string -%}
+        {{ message['content'] | trim }}
+    {%- elif message['content'] is iterable -%}
+        {%- for item in message['content'] -%}
+            {%- if item['type'] == 'image' -%}
+                {{ '<start_of_image>' }}
+            {%- elif item['type'] == 'text' -%}
+                {{ item['text'] | trim }}
+            {%- endif -%}
+        {%- endfor -%}
+    {%- else -%}
+        {{ raise_exception("Invalid content type") }}
+    {%- endif -%}
+    {{ '<end_of_turn>
+' }}
+{%- endfor -%}
+{%- if add_generation_prompt -%}
+    {{'<start_of_turn>model
+'}}
+{%- endif -%}
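
Note on chat_template.jinja: the whitespace-control markers (`{%-` and `-%}`) make the rendered prompt hard to read off the template. The sketch below mirrors the template's logic in plain Python for string-only message content, so the resulting prompt shape is visible. It is an illustration written from the template text as it appears in this diff, not a replacement for `tokenizer.apply_chat_template`; `render_chat` and `SYSTEM_SEPARATOR` are hypothetical names, and image content items are not handled.

```python
BOS = "<bos>"
# Separator appended to a system message, as the template appears in this diff.
# ASSUMPTION: verify against the raw chat_template.jinja file.
SYSTEM_SEPARATOR = "\n"


def render_chat(messages, add_generation_prompt=True):
    """Render string-content messages the way the Jinja template does."""
    prefix = ""
    if messages and messages[0]["role"] == "system":
        # A system message is folded into the first user turn as a prefix.
        prefix = messages[0]["content"] + SYSTEM_SEPARATOR
        messages = messages[1:]

    parts = [BOS]
    for index, message in enumerate(messages):
        # Turns must alternate user, assistant, user, ...
        if (message["role"] == "user") != (index % 2 == 0):
            raise ValueError("Conversation roles must alternate user/assistant/...")
        # The template renames the "assistant" role to "model".
        role = "model" if message["role"] == "assistant" else message["role"]
        parts.append(f"<start_of_turn>{role}\n")
        if index == 0:
            parts.append(prefix)
        parts.append(message["content"].strip())
        parts.append("<end_of_turn>\n")

    if add_generation_prompt:
        parts.append("<start_of_turn>model\n")
    return "".join(parts)


print(render_chat([{"role": "user", "content": "What is antenatal care?"}]))
```

The rendered string starts with `<bos>`, which is why a prompt produced with `tokenize=False` should not have a second BOS token added when it is tokenized afterwards.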

special_tokens_map.json ADDED

	@@ -0,0 +1,33 @@

+{
+  "boi_token": "<start_of_image>",
+  "bos_token": {
+    "content": "<bos>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eoi_token": "<end_of_image>",
+  "eos_token": {
+    "content": "<eos>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "image_token": "<image_soft_token>",
+  "pad_token": {
+    "content": "<pad>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}

tokenizer.json ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4667f2089529e8e7657cfb6d1c19910ae71ff5f28aa7ab2ff2763330affad795
+size 33384568

tokenizer.model ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1299c11d7cf632ef3b4e11937501358ada021bbdf7c47638d13c0ee982f2e79c
+size 4689074

tokenizer_config.json ADDED

The diff for this file is too large to render. See raw diff