---
library_name: transformers
tags:
- medical
license: mit
datasets:
- MedInjection-FR/Native
- MedInjection-FR/Translated
language:
- fr
- en
base_model:
- Qwen/Qwen3-4B-Instruct-2507
---



# 🩺 QWEN-4B-NAT

**QWEN-4B-NAT** is a fine-tuned version of **Qwen3-4B-Instruct-2507** trained on the [MedInjection-FR](https://huggingface.co/MedInjection-FR) dataset, a French biomedical instruction corpus that combines *native, synthetic, and translated* medical question–answer pairs.  
The model was adapted with **Supervised Fine-Tuning (SFT)** using **DoRA adapters**, as part of a study of how the origin of supervision data influences model adaptation.

---

## 🧠 Model overview

| Property | Description |
|-----------|--------------|
| **Base model** | Qwen3-4B-Instruct-2507 |
| **Fine-tuning method** | DoRA (Weight-Decomposed Low-Rank Adaptation) |
| **Architecture size** | ~4B parameters |
| **Language** | French 🇫🇷 |
| **Domain** | Biomedical, Clinical, Health |
| **Intended use** | Research on instruction tuning and domain adaptation |
| **Caution** | Not for clinical or diagnostic use |

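A minimal generation sketch using the standard Transformers chat API. The repository id `MedInjection-FR/QWEN-4B-NAT` and the example prompt are assumptions for illustration; substitute the actual model id from this page.

```python
# Sketch: load the model and run one French biomedical prompt.
# The repo id below is an assumption; replace it with this card's actual id.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MedInjection-FR/QWEN-4B-NAT"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    {"role": "user",
     "content": "Quels sont les effets indésirables courants de l'amoxicilline ?"}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Remember that outputs are for research only and must not be used for clinical decisions.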
---

## ⚙️ Training setup

Fine-tuning was performed on **30k multiple-choice (MCQ and MCQU)** examples for each configuration, using:
- 10 epochs  
- Batch size: 12  
- Learning rate: 1e-4  
- Gradient accumulation: 8  
- Cosine scheduler with 5% warmup  
- LoRA rank: 16, α = 16, dropout = 0.05  
- Adapters applied to: `q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj`

All runs used identical hyperparameters to isolate the effect of **data provenance**.
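The adapter setup above can be sketched with Hugging Face PEFT. This is an illustration mirroring the hyperparameters listed in this card, not the released training script.

```python
# Sketch of the DoRA adapter configuration described above (Hugging Face PEFT).
# Hyperparameters mirror the card; the actual training script is not published,
# so treat this as an illustration rather than the exact recipe.
from peft import LoraConfig

dora_config = LoraConfig(
    r=16,                  # adapter rank
    lora_alpha=16,         # scaling factor α
    lora_dropout=0.05,
    use_dora=True,         # Weight-Decomposed Low-Rank Adaptation
    task_type="CAUSAL_LM",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```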

---

## 📊 Evaluation summary

Evaluation was conducted on French biomedical benchmarks (MCQ, MCQU, OEQ).  
Metrics include **Exact Match (EM)** and **Hamming Score** for multiple-choice tasks, and **BLEU/ROUGE/BERTScore + LLM-as-a-judge** for open-ended QA.  
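One common reading of the multiple-choice metrics can be sketched as follows; the benchmark's exact implementation may differ.

```python
# Illustration of the multiple-choice metrics named above: Exact Match treats
# the predicted option set as all-or-nothing, while the Hamming score credits
# each option whose selected/not-selected label is correct.

def exact_match(pred: set, gold: set) -> float:
    """1.0 iff the predicted option set equals the gold set, else 0.0."""
    return float(pred == gold)

def hamming_score(pred: set, gold: set, options: list) -> float:
    """Fraction of options whose selected/not-selected label matches the gold."""
    return sum((o in pred) == (o in gold) for o in options) / len(options)

options = ["A", "B", "C", "D"]
print(exact_match({"A", "C"}, {"A", "C"}))        # 1.0
print(hamming_score({"A"}, {"A", "C"}, options))  # 0.75 (only "C" is mislabeled)
```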

> See [MedInjection-FR GitHub](https://github.com/yourusername/MedInjection-FR) for full results and plots.

---

## 📚 Citation

If you use this model, please cite:

```bibtex

```