---
title: README
emoji: 👁
colorFrom: purple
colorTo: gray
sdk: static
pinned: false
---

# 🩹 MedInjection-FR

A **French biomedical instruction dataset and model suite** for studying how data provenance (**native, synthetic, translated**) impacts instruction-tuning of LLMs.

## 📊 Dataset Stats

**Total size**: 571,436 instruction–response pairs

**Components**:
- Native: 77,247
- Synthetic: 76,506
- Translated: 417,674

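To make the balance between the three provenance types concrete, here is a minimal sketch that turns the component counts listed above into shares. The names `COMPONENTS`, `STATED_TOTAL`, and `component_shares` are illustrative and not part of any released tooling. The three listed counts sum to 571,427, which is 9 fewer than the stated total of 571,436, so the sketch computes shares relative to the component sum and prints both figures.

```python
# Counts copied from the stats above; all names here are illustrative.
STATED_TOTAL = 571_436
COMPONENTS = {"Native": 77_247, "Synthetic": 76_506, "Translated": 417_674}


def component_shares(counts):
    """Return each component's fraction of the summed counts."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


if __name__ == "__main__":
    shares = component_shares(COMPONENTS)
    for name, share in shares.items():
        print(f"{name:<10} {COMPONENTS[name]:>8,}  {share:6.1%}")
    print(f"Sum of components: {sum(COMPONENTS.values()):,}")
    print(f"Stated total:      {STATED_TOTAL:,}")
```

Under these numbers the translated portion makes up roughly three quarters of the data, which is worth keeping in mind when comparing models tuned on each provenance type.
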
**Tasks**:
- MCQU (single-answer)
- MCQ (multi-answer)
- OEQ (open-ended)

***