MAQA-LLaMA — Arabic Medical Question Answering (Base Model)

⚠️ Disclaimer: This model is intended for research and informational purposes only. It is not a substitute for professional medical advice, diagnosis, or treatment. It cannot and should not be used to prescribe or recommend medications.


Model Summary

maqa_llama is a Meta Llama 3 8B Instruct model fine-tuned on the MAQA dataset (Medical Arabic Questions & Answers) — 430,000 real doctor-patient interactions across 20 medical specialisations, sourced from Arabic medical platforms.

This is the full-precision base model (BF16 / 16-bit), best suited for research and further fine-tuning. For deployment on consumer hardware, see the quantised variants below.

| Property | Value |
|---|---|
| Base model | unsloth/llama-3-8b-Instruct-bnb-4bit (Meta Llama 3 8B Instruct) |
| Fine-tuning method | QLoRA (via Unsloth) |
| Precision | BF16 (merged 16-bit) |
| Model size | 8B parameters |
| Language | Arabic 🇸🇦 |
| License | Apache 2.0 |
| Developed by | Ali Abdelrasheed |

Model Family

| Model | Format | Size | Best for |
|---|---|---|---|
| maqa_llama (this model) | BF16 SafeTensors | ~16 GB | Research / further fine-tuning |
| maqa_llama_4bit | 4-bit (bitsandbytes) | ~5 GB | GPU inference |
| maqa_llama_4bit_GGUF | GGUF q4_k_m | 4.92 GB | CPU / local deployment |
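For the 4-bit bitsandbytes variant, a quantisation config along these lines can be passed to `transformers` (a sketch; the repo name `maqa_llama_4bit` is taken from the table above, and the NF4/BF16-compute settings are assumptions matching common Unsloth exports):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantisation with BF16 compute — typical QLoRA-style inference config
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "AliAbdelrasheed/maqa_llama_4bit",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("AliAbdelrasheed/maqa_llama_4bit")
```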

Dataset — MAQA

The model was trained on MAQA (Medical Arabic Questions & Answers), the largest Arabic medical Q&A dataset available for NLP research.

| Property | Value |
|---|---|
| Total records | 430,000 question-answer pairs |
| Columns | question (patient) · answer (doctor diagnosis + treatment notes) |
| Medical specialisations | 20 (cardiology, dermatology, neurology, gastroenterology, paediatrics, and more) |
| Sources | altibbi.com · tbeeb.net · cura.healthcare |
| Language | Arabic (Modern Standard Arabic) |
| Quality | All questions unique and cleaned (not stemmed) |
| Training split | 70% train / 30% evaluation |
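A 70/30 split like the one above can be reproduced deterministically with a seeded shuffle. A minimal sketch (the seed 3407 matches the training random state, but the exact split procedure used is an assumption):

```python
import random

def split_records(records, eval_frac=0.3, seed=3407):
    """Deterministically shuffle QA records, then split into (train, eval)."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_eval = int(len(shuffled) * eval_frac)
    return shuffled[n_eval:], shuffled[:n_eval]

pairs = [{"question": f"q{i}", "answer": f"a{i}"} for i in range(10)]
train, eval_set = split_records(pairs)
print(len(train), len(eval_set))  # 7 3
```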

Dataset reference: "Deep learning for Arabic healthcare: MedicalBot", Social Network Analysis and Mining, Springer (2023). Available on Harvard Dataverse.


Training Details

LoRA Configuration

| Hyperparameter | Value |
|---|---|
| LoRA rank (r) | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0 (optimised) |
| Bias | none (optimised) |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Gradient checkpointing | unsloth (30% less VRAM) |
| Random state | 3407 |
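With r = 16 over the seven listed projection matrices, the adapter adds roughly 42M trainable parameters (about 0.5% of the 8B base). A back-of-the-envelope check, assuming standard Llama-3 8B shapes (hidden size 4096, 32 layers, GQA with 1024-dim K/V projections, 14336 MLP width):

```python
r = 16
hidden, kv, mlp, layers = 4096, 1024, 14336, 32

# LoRA adds r * (d_in + d_out) parameters per targeted linear (the A and B matrices)
shapes = {
    "q_proj": (hidden, hidden), "k_proj": (hidden, kv), "v_proj": (hidden, kv),
    "o_proj": (hidden, hidden), "gate_proj": (hidden, mlp),
    "up_proj": (hidden, mlp),   "down_proj": (mlp, hidden),
}
per_layer = sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
total = per_layer * layers
print(f"{total:,} trainable LoRA parameters")  # 41,943,040
```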

Training Arguments

| Hyperparameter | Value |
|---|---|
| Epochs | 1 |
| Batch size (per device) | 52 |
| Gradient accumulation steps | 1 |
| Learning rate | 2e-4 |
| LR scheduler | Linear |
| Warmup steps | 200 |
| Optimiser | AdamW 8-bit |
| Max sequence length | 2048 |
| Evaluation strategy | Every 500 steps |
| Training environment | Google Colab Pro |
| Framework | Unsloth + HuggingFace TRL (SFTTrainer) |
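With a per-device batch of 52 and no gradient accumulation, one epoch over the full 70% train split (~301k pairs) would come to roughly 5.8k optimiser steps. A quick calculation (note the published model was trained on a sampled subset, so the actual step count was lower):

```python
import math

total_pairs = 430_000
train_pairs = int(total_pairs * 0.7)        # 70% train split -> 301,000 pairs
effective_batch = 52 * 1                    # per-device batch * grad-accum steps
steps_per_epoch = math.ceil(train_pairs / effective_batch)
print(train_pairs, steps_per_epoch)         # 301000 5789
```

At an evaluation every 500 steps, that schedule implies about 11 evaluation passes per epoch.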

Chat Template & System Prompt

The model was fine-tuned using the Llama-3 chat template with a custom Arabic medical system prompt:

أنت طبيب محترف ولديك خبرة في كل مجالات الطب.
يجيب على أسئلة المرضى حول الأمراض، باستخدام لهجة رسمية وودية،
وإجابات موجزة ومفيدة يسهل على الجميع فهمها.

"You are a professional doctor with expertise in all fields of medicine. Answer patients' questions about diseases using a formal and friendly tone, with concise and helpful answers that everyone can understand."

Special tokens <|question|> and <|answer|> were added to the tokenizer to clearly demarcate patient input and doctor response during training.
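The demarcation can be illustrated with a small formatter (only the token names come from this card; the exact serialisation used during training is an assumption):

```python
QUESTION, ANSWER = "<|question|>", "<|answer|>"

def format_pair(question: str, answer: str) -> str:
    """Wrap a patient question and doctor answer with the added special tokens."""
    return f"{QUESTION}{question}{ANSWER}{answer}"

sample = format_pair(
    "ما هي أعراض فقر الدم؟",              # "What are the symptoms of anaemia?"
    "من أبرز الأعراض التعب وشحوب الجلد.",  # "Fatigue and pale skin are among the main symptoms."
)
print(sample.startswith(QUESTION))  # True
```

When adding tokens like these, the tokenizer's vocabulary grows, so the model's embedding matrix must be resized to match before training.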


Training Journey

The development of this model went through multiple iterations:

  1. RAG approach (initial attempt): Explored Retrieval-Augmented Generation using LangChain and FAISS as a vector store. While technically functional, this approach had limitations in generalisation and was not suitable for a fine-tuned conversational model.

  2. Manual LoRA fine-tuning: Moved to direct supervised fine-tuning using LoRA for full control over the model weights on the MAQA dataset.

  3. Unsloth optimisation (final): Adopted the Unsloth framework to maximise training efficiency on Colab Pro's GPU resources — achieving 2x faster training with significantly reduced VRAM usage. This produced the current published models.


Quick Start

```python
from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template
from transformers import TextStreamer
import torch

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "AliAbdelrasheed/maqa_llama",
    max_seq_length = 2048,
    dtype = None,           # auto-detect (BF16 on supported GPUs)
    load_in_4bit = False,   # full precision
)

# Map the ShareGPT-style "from"/"value" message keys onto the Llama-3 template
tokenizer = get_chat_template(
    tokenizer,
    chat_template = "llama-3",
    mapping = {"role": "from", "content": "value", "user": "human", "assistant": "gpt"},
)
FastLanguageModel.for_inference(model)  # enable Unsloth's faster inference mode

messages = [
    {"from": "system", "value": "أنت طبيب محترف ولديك خبرة في كل مجالات الطب. يجيب على أسئلة المرضى حول الأمراض، باستخدام لهجة رسمية وودية، وإجابات موجزة ومفيدة يسهل على الجميع فهمها."},
    {"from": "human", "value": "ما هي أعراض مرض السكري من النوع الثاني؟"},  # "What are the symptoms of type 2 diabetes?"
]

inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

streamer = TextStreamer(tokenizer, skip_prompt=True)
_ = model.generate(input_ids=inputs, streamer=streamer, max_new_tokens=256, use_cache=True)
```

Limitations

  • Not a substitute for professional medical advice or clinical diagnosis
  • Cannot prescribe, recommend, or endorse specific medications
  • Trained on a sampled subset of MAQA — performance may vary across all 20 specialisations
  • Optimised for Modern Standard Arabic; dialectal Arabic (Egyptian, Levantine, etc.) performance may vary
  • Web-scraped training data may contain noise or outdated medical information
  • Scarcity of high-quality Arabic medical data remains an open challenge in the field

Future Work

  • Fine-tuning on the full 430,000-row MAQA dataset (current model trained on a sampled subset)
  • Multi-turn conversational memory for sustained patient-doctor dialogue
  • Dialect-specific fine-tuning (Egyptian Arabic, Gulf Arabic)
  • Multimodal input support (e.g. dermatology image input)
  • Integration with electronic medical records (EMR) systems
  • Real-time knowledge updates from medical literature

Citation

If you use this model in your research, please cite the MAQA dataset:

@article{maqa2023,
  title={Deep learning for Arabic healthcare: MedicalBot},
  journal={Social Network Analysis and Mining},
  publisher={Springer},
  year={2023}
}

Developed By

Ali Abdelrasheed · Graduation project, Nile University · B.Sc. Information Technology (Big Data), Class of 2024 · 🤗 HuggingFace Profile
