---
title: README
emoji: 👁
colorFrom: purple
colorTo: gray
sdk: static
pinned: false
---

# 🩹 MedInjection-FR

A **French biomedical instruction dataset and model suite** for studying how data provenance (**native, synthetic, translated**) impacts instruction-tuning of LLMs.

## 📊 Dataset Stats

**Total size**: 571,436 instruction–response pairs

**Components**:
- Native: 77,247
- Synthetic: 76,506
- Translated: 417,674

**Tasks**:
- MCQU (single-answer)
- MCQ (multi-answer)
- OEQ (open-ended)

***