MAQA-LLaMA-4bit — Arabic Medical Q&A (4-bit GPU Inference)
⚠️ Disclaimer: This model is intended for research and informational purposes only. It is not a substitute for professional medical advice, diagnosis, or treatment. It cannot and should not be used to prescribe or recommend medications.
Model Summary
maqa_llama_4bit is the 4-bit GPU inference variant of the MAQA-LLaMA family:
a Llama 3 8B Instruct model fine-tuned on MAQA, a dataset of 430,000 real Arabic doctor-patient interactions across
20 medical specialisations (training used a sampled subset; see Limitations). It runs on consumer-grade GPUs with 6–8 GB of VRAM.
| Property | Value |
|---|---|
| Base model | unsloth/llama-3-8b-Instruct-bnb-4bit (Meta Llama 3 8B Instruct) |
| Fine-tuning method | QLoRA (via Unsloth) |
| Quantisation | 4-bit (bitsandbytes, merged_4bit_forced) |
| Tensor types | F16 / F32 / U8 |
| Model size | 8B parameters |
| Language | Arabic 🇸🇦 |
| License | Apache 2.0 |
| Developed by | Ali Abdelrasheed |
Model Family
| Model | Format | Size | Best for |
|---|---|---|---|
| maqa_llama | BF16 SafeTensors | ~16 GB | Research / further fine-tuning |
| maqa_llama_4bit ← this model | 4-bit bitsandbytes | ~5 GB | ✅ GPU inference |
| maqa_llama_4bit_GGUF | GGUF q4_k_m | 4.92 GB | CPU / local deployment |
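The sizes in the table above follow roughly from bits per parameter. The sketch below is a back-of-the-envelope estimate for the weights only, assuming a round 8 billion parameters as stated on this card; the function name `weight_memory_gb` is introduced here for illustration. Real checkpoints come out somewhat larger than the 4-bit estimate because some tensors (for example embeddings and quantisation constants) are typically kept at higher precision, and inference also needs memory for activations and the KV cache.

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


# "8B parameters" as stated on the card; treated as a round figure here.
N_PARAMS = 8e9

bf16_gb = weight_memory_gb(N_PARAMS, 16)  # 16 bits per weight
int4_gb = weight_memory_gb(N_PARAMS, 4)   # 4 bits per weight

print(f"BF16 weights : ~{bf16_gb:.0f} GB")
print(f"4-bit weights: ~{int4_gb:.0f} GB (before higher-precision tensors and runtime overhead)")
```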
Quick Start
With Unsloth (recommended — 2x faster inference)
from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template
from transformers import TextStreamer

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "AliAbdelrasheed/maqa_llama_4bit",
    max_seq_length = 2048,
    dtype = None,  # Auto-detect: Float16 for T4/V100, BFloat16 for Ampere+
    load_in_4bit = True,
)

# Apply the Llama 3 chat template, mapping ShareGPT-style keys ("from"/"value") onto it.
tokenizer = get_chat_template(
    tokenizer,
    chat_template = "llama-3",
    mapping = {"role": "from", "content": "value", "user": "human", "assistant": "gpt"},
)

FastLanguageModel.for_inference(model)  # Switch Unsloth to its inference mode

messages = [
    {
        "from": "system",
        # English: "You are a professional doctor with experience in all fields of medicine.
        # Answer patients' questions about illnesses in a formal, friendly tone, with concise,
        # useful answers that everyone can easily understand."
        "value": "أنت طبيب محترف ولديك خبرة في كل مجالات الطب. يجيب على أسئلة المرضى حول الأمراض، باستخدام لهجة رسمية وودية، وإجابات موجزة ومفيدة يسهل على الجميع فهمها."
    },
    {
        "from": "human",
        # English: "I feel pain in my lower abdomen, at my flank, and it comes and goes. What are the causes?"
        "value": "أشعر بألم أسفل البطن بخاصرتي والألم يجي على فترات، ما الأسباب؟"
    },
]

inputs = tokenizer.apply_chat_template(
    messages,
    tokenize = True,
    add_generation_prompt = True,  # End the prompt with the assistant header
    return_tensors = "pt",
).to("cuda")

# Stream tokens to stdout as they are generated.
streamer = TextStreamer(tokenizer)
_ = model.generate(
    input_ids = inputs,
    streamer = streamer,
    max_new_tokens = 256,
    use_cache = True,
)
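The Unsloth example uses ShareGPT-style message dictionaries with `"from"` and `"value"` keys, matching the `mapping` passed to `get_chat_template`. A small helper can keep that structure consistent across calls. This is a minimal sketch: `build_messages` is a hypothetical name introduced here, not part of Unsloth or this repository.

```python
from typing import Dict, List, Optional


def build_messages(question: str, system_prompt: Optional[str] = None) -> List[Dict[str, str]]:
    """Build a ShareGPT-style message list ("from"/"value" keys).

    Hypothetical helper: it only mirrors the message layout shown in the
    Quick Start example and performs no model or tokenizer calls.
    """
    if not question.strip():
        raise ValueError("question must not be empty")

    messages: List[Dict[str, str]] = []
    if system_prompt:
        messages.append({"from": "system", "value": system_prompt})
    messages.append({"from": "human", "value": question.strip()})
    return messages


example = build_messages(
    "ما هي أعراض مرض السكري من النوع الثاني؟",  # "What are the symptoms of type 2 diabetes?"
    system_prompt="أنت طبيب محترف.",             # "You are a professional doctor."
)
print(example)
```

The returned list can be passed to `tokenizer.apply_chat_template` in place of the hand-written `messages` list.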
With Transformers + bitsandbytes (no Unsloth required)
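from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

tokenizer = AutoTokenizer.from_pretrained("AliAbdelrasheed/maqa_llama_4bit")
model = AutoModelForCausalLM.from_pretrained(
    "AliAbdelrasheed/maqa_llama_4bit",
    # Passing load_in_4bit directly is deprecated; use a quantization config instead.
    quantization_config = BitsAndBytesConfig(load_in_4bit=True),
    device_map = "auto",
)

# English: "What are the symptoms of type 2 diabetes and how can it be managed?"
prompt = "ما هي أعراض مرض السكري من النوع الثاني وكيف يمكن التعامل معه؟"
# Send inputs to the device chosen by device_map instead of hard-coding "cuda".
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))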
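The Transformers example sends a bare question, whereas the Unsloth example wraps the conversation in the `llama-3` chat template. As a hedged sketch, the prompt can be laid out by hand in the standard Llama 3 Instruct format. It assumes the fine-tuned model expects that layout, since the card names `llama-3` as the template; `format_llama3_prompt` is a hypothetical helper introduced here.

```python
from typing import Optional


def format_llama3_prompt(
    question: str,
    system_prompt: Optional[str] = None,
    include_bos: bool = False,
) -> str:
    """Lay out a single-turn prompt in the Llama 3 Instruct chat format.

    include_bos defaults to False because the tokenizer usually prepends
    <|begin_of_text|> itself when encoding a string.
    """
    parts = []
    if include_bos:
        parts.append("<|begin_of_text|>")
    if system_prompt:
        parts.append(
            f"<|start_header_id|>system<|end_header_id|>\n\n{system_prompt.strip()}<|eot_id|>"
        )
    parts.append(f"<|start_header_id|>user<|end_header_id|>\n\n{question.strip()}<|eot_id|>")
    # Trailing assistant header: the model continues from here with its answer.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


chat_prompt = format_llama3_prompt(
    "ما هي أعراض مرض السكري من النوع الثاني؟",  # "What are the symptoms of type 2 diabetes?"
    system_prompt="أنت طبيب محترف.",             # "You are a professional doctor."
)
print(chat_prompt)
```

The resulting string can replace `prompt` in the call to `tokenizer(...)`. If the tokenizer shipped with the model already carries a chat template, `tokenizer.apply_chat_template` is the less error-prone route.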
Dataset — MAQA
| Property | Value |
|---|---|
| Total records | 430,000 question-answer pairs |
| Columns | Patient question · Doctor diagnosis · Doctor treatment notes |
| Sources | altibbi.com · tbeeb.net · cura.healthcare |
| Specialisations | 20 medical fields |
| Language | Modern Standard Arabic |
| Training split | 70% train / 30% evaluation |
Dataset source: "Deep learning for Arabic healthcare: MedicalBot", Springer (2023); data available via Harvard Dataverse.
Training Details
LoRA Configuration
| Hyperparameter | Value |
|---|---|
| LoRA rank (r) | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Gradient checkpointing | unsloth |
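To give a sense of scale for the configuration above, the number of trainable LoRA parameters can be worked out from the rank and the shapes of the targeted projections. The sketch below assumes the published Llama 3 8B shapes (hidden size 4096, MLP size 14336, 32 layers, and 8 key/value heads of dimension 128, so `k_proj` and `v_proj` output 1024 features). Each adapted matrix of shape `d_in × d_out` adds `r × (d_in + d_out)` parameters.

```python
# Assumed Llama 3 8B shapes (from the published architecture, not from this card).
HIDDEN, INTERMEDIATE, KV_DIM, N_LAYERS = 4096, 14336, 1024, 32
RANK, ALPHA = 16, 32  # LoRA rank and alpha from the table above

# (input features, output features) for each targeted projection.
TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


per_layer = sum(lora_params(d_in, d_out, RANK) for d_in, d_out in TARGETS.values())
total_trainable = per_layer * N_LAYERS
scaling = ALPHA / RANK  # factor applied to the low-rank update

print(f"Trainable LoRA parameters: {total_trainable:,}")
print(f"LoRA scaling (alpha / r): {scaling}")
```

Under these assumptions the adapters amount to roughly half a percent of the base model's parameters, which is what makes fine-tuning on a single Colab GPU feasible.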
Training Arguments
| Hyperparameter | Value |
|---|---|
| Epochs | 1 |
| Batch size | 52 |
| Learning rate | 2e-4 |
| LR scheduler | Linear |
| Warmup steps | 200 |
| Optimiser | AdamW 8-bit |
| Max sequence length | 2048 |
| Training environment | Google Colab Pro |
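The learning-rate settings above describe a linear warmup followed by a linear decay. The sketch below reproduces that shape in plain Python. The total step count is an illustrative assumption only: it supposes the full 70% training split (301,000 of 430,000 records) at a batch size of 52, whereas the card notes that a sampled subset was used and does not say whether 52 is a per-device or an effective batch size.

```python
import math


def steps_per_epoch(n_examples: int, batch_size: int) -> int:
    """Optimizer steps needed to see every example once."""
    return math.ceil(n_examples / batch_size)


def linear_lr(step: int, total_steps: int, peak_lr: float = 2e-4, warmup_steps: int = 200) -> float:
    """Linear warmup to peak_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Illustrative assumption: the full 70% split, one epoch, batch size 52.
n_train = 430_000 * 70 // 100
total = steps_per_epoch(n_train, 52)

for step in (0, 100, 200, total // 2, total):
    print(f"step {step:>5}: lr = {linear_lr(step, total):.2e}")
```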
Full training details and the training notebook are available on GitHub.
Limitations
- Not a substitute for professional medical advice or clinical diagnosis
- Cannot prescribe or recommend medications
- Trained on a sampled subset of MAQA; performance may vary across the 20 specialisations
- Optimised for Modern Standard Arabic; dialectal Arabic performance may vary
- Web-scraped data may contain noise or outdated medical information
Developed By
Ali Abdelrasheed · Graduation Project, Nile University · B.Sc. Information Technology – Big Data · Class of 2024 · 🤗 HuggingFace Profile