KYAGABA committed on
Commit 335708f · verified · 1 Parent(s): 05eaf78

Upload best checkpoint (eval_loss=1.39)

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,159 @@
+ ---
+ license: other
+ license_name: health-ai-developer-foundations
+ license_link: https://developers.google.com/health-ai-developer-foundations/terms
+ base_model: google/medgemma-27b-text-it
+ tags:
+ - medical
+ - healthcare
+ - maternal-health
+ - sexual-health
+ - reproductive-health
+ - multilingual
+ - african-languages
+ - akan
+ - amharic
+ - luganda
+ - swahili
+ - lora
+ - peft
+ - medgemma
+ language:
+ - en
+ - am
+ - sw
+ - lg
+ - ak
+ library_name: peft
+ pipeline_tag: text-generation
+ ---
+
+ # MedGemma 27B - Maternal, Sexual & Reproductive Health Oracle for African Languages
+
+ Fine-tuned Google MedGemma 27B Text for the Zindi ITU Multilingual Health QA Challenge.
+
+ Specialized in answering Maternal, Sexual, and Reproductive Health (MSRH) questions in:
+ - Akan (Twi/Fante from Ghana)
+ - Amharic (Ethiopia)
+ - Luganda (Uganda)
+ - Swahili (Kenya)
+ - English (Ethiopia, Ghana, Kenya, Uganda)
+
+ ## Model Description
+
+ LoRA adapter for google/medgemma-27b-text-it, fine-tuned on 29,815 multilingual medical Q&A samples across 8 language-region pairs.
+
+ ### Training Details
+
+ - Base model: google/medgemma-27b-text-it (27B params, medical text-only)
+ - Training method: QLoRA (4-bit quantization + LoRA)
+ - LoRA config: r=8, alpha=16, attention-only modules
+ - Trainable params: 16.7M (0.21% of total)
+ - Training data: 29,815 multilingual medical Q&A samples
+ - Optimizer: AdamW fused, lr=3e-5, linear warmup 5%
+ - Hardware: NVIDIA A40 (48GB VRAM)
+ - Final eval_loss: 1.39
+
+ ### Loss Trajectory
+
+ | Step | eval_loss |
+ |------|-----------|
+ | 600 | 1.69 |
+ | 900 | 1.58 |
+ | 1200 | 1.50 |
+ | 1500 | 1.45 |
+ | 1800 | 1.42 |
+ | 1864 | 1.39 (best) |
+
+ ## Usage
+
+ ```python
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
+ from peft import PeftModel
+
+ # 4-bit NF4 quantization so the 27B base model fits on a single large GPU
+ quantization_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_use_double_quant=True,
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_compute_dtype=torch.bfloat16,
+ )
+
+ base_model = AutoModelForCausalLM.from_pretrained(
+     "google/medgemma-27b-text-it",
+     device_map="auto",
+     torch_dtype=torch.bfloat16,
+     attn_implementation="eager",
+     quantization_config=quantization_config,
+ )
+
+ # Attach the LoRA adapter on top of the quantized base model
+ model = PeftModel.from_pretrained(base_model, "KYAGABA/medgemma-27b-msrh-african-oracle")
+ model.eval()
+
+ tokenizer = AutoTokenizer.from_pretrained("KYAGABA/medgemma-27b-msrh-african-oracle")
+
+ # Example
+ question = "How can young people access reproductive health services?"
+ language = "English"
+
+ prompt_text = f"Answer this question in {language} about maternal, sexual, and reproductive health: {question}"
+ messages = [{"role": "user", "content": prompt_text}]
+ prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+ # The chat template already emits <bos>, so skip the tokenizer's own special tokens
+ # to avoid a duplicated BOS token.
+ inputs = tokenizer(prompt, return_tensors='pt', add_special_tokens=False).to(model.device)
+
+ with torch.no_grad():
+     outputs = model.generate(
+         **inputs,
+         max_new_tokens=400,
+         do_sample=False,
+         num_beams=3,
+         repetition_penalty=1.1,
+     )
+
+ # Decode only the newly generated tokens, not the prompt
+ response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
+ print(response)
+ ```
+
+ ## Dataset
+
+ Trained on the Zindi ITU Multilingual Health QA Challenge dataset:
+
+ | Subset | Samples | Language | Region |
+ |--------|---------|----------|--------|
+ | Eng_Uga | 7,624 | English | Uganda |
+ | Aka_Gha | 4,455 | Akan | Ghana |
+ | Eng_Gha | 4,443 | English | Ghana |
+ | Eng_Eth | 3,915 | English | Ethiopia |
+ | Lug_Uga | 3,383 | Luganda | Uganda |
+ | Eng_Ken | 2,080 | English | Kenya |
+ | Swa_Ken | 2,070 | Swahili | Kenya |
+ | Amh_Eth | 1,845 | Amharic | Ethiopia |
+
+ ## Intended Use
+
+ For research and educational purposes to support healthcare information access in African languages. NOT for direct clinical use. Always consult qualified healthcare professionals.
+
+ ## Limitations
+
+ - May add English preamble at start of responses
+ - Lower quality for Akan compared to English (less training data)
+ - Trained for ~1.13 epochs only (compute constraints)
+ - Best for MSRH topics
+
+ ## Citation
+
+ ```
+ @misc{medgemma27b-msrh-africa,
+ author = {KYAGABA, Arul},
+ title = {MedGemma 27B - MSRH African Oracle},
+ year = {2026},
+ publisher = {HuggingFace},
+ howpublished = {https://huggingface.co/KYAGABA/medgemma-27b-msrh-african-oracle}
+ }
+ ```
+
+ ## Acknowledgements
+
+ - Google for MedGemma 27B base model
+ - Zindi and ITU for the multilingual health QA challenge
+ - AfriMed-QA community for advancing African medical AI
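Reviewer note (not part of the commit): the per-subset counts in the README's Dataset table can be cross-checked against the stated total of 29,815 samples. The sketch below copies the counts from that table; the variable names and the prefix-to-language mapping are illustrative only.

```python
# Per-subset sample counts copied from the Dataset table in README.md.
subsets = {
    "Eng_Uga": 7624,
    "Aka_Gha": 4455,
    "Eng_Gha": 4443,
    "Eng_Eth": 3915,
    "Lug_Uga": 3383,
    "Eng_Ken": 2080,
    "Swa_Ken": 2070,
    "Amh_Eth": 1845,
}

# Subset prefix -> language name, as listed in the same table.
LANGUAGE_NAMES = {
    "Eng": "English",
    "Aka": "Akan",
    "Lug": "Luganda",
    "Swa": "Swahili",
    "Amh": "Amharic",
}

total = sum(subsets.values())

# Aggregate counts per language across regions.
by_language = {}
for name, count in subsets.items():
    language = LANGUAGE_NAMES[name.split("_")[0]]
    by_language[language] = by_language.get(language, 0) + count

# Print languages from most to least represented, with their share of the total.
for language, count in sorted(by_language.items(), key=lambda kv: -kv[1]):
    print(f"{language:8s} {count:6d} {count / total:6.1%}")
print("total", total)
```

The subset counts sum to 29,815, matching the figure quoted under Training Details, and the four English subsets together account for 18,062 of those samples.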
adapter_config.json ADDED
@@ -0,0 +1,34 @@
+ {
+   "alpha_pattern": {},
+   "auto_mapping": null,
+   "base_model_name_or_path": "google/medgemma-27b-text-it",
+   "bias": "none",
+   "eva_config": null,
+   "exclude_modules": null,
+   "fan_in_fan_out": false,
+   "inference_mode": true,
+   "init_lora_weights": true,
+   "layer_replication": null,
+   "layers_pattern": null,
+   "layers_to_transform": null,
+   "loftq_config": {},
+   "lora_alpha": 16,
+   "lora_bias": false,
+   "lora_dropout": 0.05,
+   "megatron_config": null,
+   "megatron_core": "megatron.core",
+   "modules_to_save": null,
+   "peft_type": "LORA",
+   "r": 8,
+   "rank_pattern": {},
+   "revision": null,
+   "target_modules": [
+     "o_proj",
+     "k_proj",
+     "q_proj",
+     "v_proj"
+   ],
+   "task_type": "CAUSAL_LM",
+   "use_dora": false,
+   "use_rslora": false
+ }
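Reviewer note (not part of the commit): the `r`, `lora_alpha`, and `target_modules` values above determine the LoRA scaling factor and, together with the base model's layer shapes, the number of trainable parameters. The sketch below is a rough estimate only. The base-model dimensions (hidden size, layer count, head counts, head dimension) are assumptions that do not appear anywhere in this commit and should be checked against the base model's own `config.json`.

```python
import json

# Relevant fields copied from adapter_config.json in this commit.
adapter_config = json.loads("""
{
  "r": 8,
  "lora_alpha": 16,
  "target_modules": ["o_proj", "k_proj", "q_proj", "v_proj"]
}
""")

# ASSUMED base-model dimensions (not stated in this commit; verify before relying on them).
HIDDEN_SIZE = 5376
NUM_LAYERS = 62
NUM_HEADS = 32
NUM_KV_HEADS = 16
HEAD_DIM = 128

# (in_features, out_features) of each attention projection under those assumptions.
shapes = {
    "q_proj": (HIDDEN_SIZE, NUM_HEADS * HEAD_DIM),
    "k_proj": (HIDDEN_SIZE, NUM_KV_HEADS * HEAD_DIM),
    "v_proj": (HIDDEN_SIZE, NUM_KV_HEADS * HEAD_DIM),
    "o_proj": (NUM_HEADS * HEAD_DIM, HIDDEN_SIZE),
}

r = adapter_config["r"]

# Each adapted linear layer gains A (r x in_features) and B (out_features x r).
per_layer = sum(r * (shapes[m][0] + shapes[m][1]) for m in adapter_config["target_modules"])
trainable_params = per_layer * NUM_LAYERS

# Standard LoRA scaling (use_rslora is false in the config): alpha / r.
scaling = adapter_config["lora_alpha"] / r

print(f"trainable LoRA parameters: {trainable_params:,}")
print(f"LoRA scaling factor: {scaling}")
```

Under these assumed dimensions the estimate comes to roughly 16.76M parameters, which is in line with the "16.7M" trainable parameters quoted in the README.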
adapter_model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3ddaecc0ef0e76dcea59a8d581f6cb808114e9b2c956390c0468e500a2f83c36
+ size 67109592
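Reviewer note (not part of the commit): the three lines above are a Git LFS pointer, not the weights themselves. Each line is a `key value` pair giving the spec version, the content hash, and the size in bytes of the real file. A minimal sketch of reading such a pointer, where `parse_lfs_pointer` is a hypothetical helper written for this note:

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a dict (hypothetical helper)."""
    # Every non-empty line is "<key> <value>", split on the first space.
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    hash_algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from adapter_model.safetensors in this commit.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:3ddaecc0ef0e76dcea59a8d581f6cb808114e9b2c956390c0468e500a2f83c36\n"
    "size 67109592\n"
)

print(pointer["hash_algo"], pointer["digest"][:12])
print(f"{pointer['size'] / 2**20:.1f} MiB")
```

The declared size is about 64 MiB, which is small compared with the base model because only the adapter weights are stored in this repository.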
added_tokens.json ADDED
@@ -0,0 +1,3 @@
+ {
+   "<image_soft_token>": 262144
+ }
chat_template.jinja ADDED
@@ -0,0 +1,47 @@
+ {{ bos_token }}
+ {%- if messages[0]['role'] == 'system' -%}
+ {%- if messages[0]['content'] is string -%}
+ {%- set first_user_prefix = messages[0]['content'] + '
+
+ ' -%}
+ {%- else -%}
+ {%- set first_user_prefix = messages[0]['content'][0]['text'] + '
+
+ ' -%}
+ {%- endif -%}
+ {%- set loop_messages = messages[1:] -%}
+ {%- else -%}
+ {%- set first_user_prefix = "" -%}
+ {%- set loop_messages = messages -%}
+ {%- endif -%}
+ {%- for message in loop_messages -%}
+ {%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) -%}
+ {{ raise_exception("Conversation roles must alternate user/assistant/user/assistant/...") }}
+ {%- endif -%}
+ {%- if (message['role'] == 'assistant') -%}
+ {%- set role = "model" -%}
+ {%- else -%}
+ {%- set role = message['role'] -%}
+ {%- endif -%}
+ {{ '<start_of_turn>' + role + '
+ ' + (first_user_prefix if loop.first else "") }}
+ {%- if message['content'] is string -%}
+ {{ message['content'] | trim }}
+ {%- elif message['content'] is iterable -%}
+ {%- for item in message['content'] -%}
+ {%- if item['type'] == 'image' -%}
+ {{ '<start_of_image>' }}
+ {%- elif item['type'] == 'text' -%}
+ {{ item['text'] | trim }}
+ {%- endif -%}
+ {%- endfor -%}
+ {%- else -%}
+ {{ raise_exception("Invalid content type") }}
+ {%- endif -%}
+ {{ '<end_of_turn>
+ ' }}
+ {%- endfor -%}
+ {%- if add_generation_prompt -%}
+ {{'<start_of_turn>model
+ '}}
+ {%- endif -%}
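Reviewer note (not part of the commit): the template above folds an optional system message into the first user turn, renames the `assistant` role to `model`, and enforces strict user/assistant alternation. The plain-Python sketch below mirrors that logic for string-only message content so the resulting prompt layout is easier to see. `render_gemma_chat` is a hypothetical helper written for this note; it does not handle the list-of-items (image/text) content branch, and real code should use `tokenizer.apply_chat_template`.

```python
def render_gemma_chat(messages, add_generation_prompt=True, bos_token="<bos>"):
    """Mirror chat_template.jinja for plain-string message content (hypothetical helper)."""
    # A leading system message becomes a prefix of the first user turn.
    first_user_prefix = ""
    if messages and messages[0]["role"] == "system":
        first_user_prefix = messages[0]["content"] + "\n\n"
        messages = messages[1:]

    parts = [bos_token]
    for index, message in enumerate(messages):
        # Even positions must be user turns, odd positions assistant turns.
        if (message["role"] == "user") != (index % 2 == 0):
            raise ValueError(
                "Conversation roles must alternate user/assistant/user/assistant/..."
            )
        role = "model" if message["role"] == "assistant" else message["role"]
        parts.append(f"<start_of_turn>{role}\n")
        if index == 0:
            parts.append(first_user_prefix)
        parts.append(message["content"].strip())  # Jinja's `trim` filter
        parts.append("<end_of_turn>\n")

    if add_generation_prompt:
        parts.append("<start_of_turn>model\n")
    return "".join(parts)


prompt = render_gemma_chat([{"role": "user", "content": "  What is antenatal care?  "}])
print(repr(prompt))
```

Because the rendered prompt already begins with `<bos>`, tokenizing it afterwards with the tokenizer's default special-token handling would add a second BOS token.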
special_tokens_map.json ADDED
@@ -0,0 +1,33 @@
+ {
+   "boi_token": "<start_of_image>",
+   "bos_token": {
+     "content": "<bos>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eoi_token": "<end_of_image>",
+   "eos_token": {
+     "content": "<eos>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "image_token": "<image_soft_token>",
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "<unk>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4667f2089529e8e7657cfb6d1c19910ae71ff5f28aa7ab2ff2763330affad795
+ size 33384568
tokenizer.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1299c11d7cf632ef3b4e11937501358ada021bbdf7c47638d13c0ee982f2e79c
+ size 4689074
tokenizer_config.json ADDED
The diff for this file is too large to render. See raw diff