Introducing Reasoning-Medical-27B, a model designed for advanced medical reasoning in professional medicine, medical genetics, college biology/medicine, and clinical knowledge. The model was fine-tuned on a large-scale dataset of 370,000 high-quality question-and-answer examples that incorporate Chain-of-Thought reasoning to improve step-by-step problem solving. Training used the GRPO trainer with Unsloth for efficient fine-tuning.
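For readers unfamiliar with GRPO, here is a minimal sketch of the group-relative advantage idea behind it: several answers are sampled for the same question, each gets a reward, and each reward is normalized against its own group. This is not the training code used for this model. The function name `group_relative_advantages`, the `eps` value, the use of population standard deviation, and the toy 0/1 rewards are all assumptions for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize each reward against the mean and spread of its own group.

    Hypothetical helper: `eps` and the population standard deviation are
    illustrative choices, not settings taken from the post.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division finite when every reward in the group is equal.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Toy group: four sampled answers to one question, scored 1.0 if correct.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Correct answers end up with positive advantages and incorrect ones with negative advantages, so the policy update favors the better samples within each group without needing a separate value model.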
MedQA: 93% vs MedGemma 85.3%
Model: EpistemeAI/Reasoning-Medical-27B
# Benchmark
| Task | Version | Filter | n-shot | Metric | Direction | Reasoning Medical 27B | Qwen 3.6 27B | MedGemma 1 27B |
|-------------------|--------:|----------------|-------:|-------------|:---------:|----------------------:|-------------:|----------------:|
| MMLU-Pro Biology | 3.1 | custom-extract | 2 | exact_match | ↑ | 0.85 | — | — |
| MMLU-ProX Biology | 0 | custom-extract | 2 | exact_match | ↑ | 0.80 | — | — |
| MedQA | YAML | none | 2 | acc | ↑ | 0.93 | 0.844 | 0.853 |
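To make the MedQA comparison concrete, the sketch below converts the accuracies in the table into percentage-point margins. The scores are copied from the MedQA row; the dictionary `medqa_acc` and the helper `margin_pp` are hypothetical names introduced only for this example.

```python
# MedQA accuracies copied from the benchmark table above.
medqa_acc = {
    "Reasoning Medical 27B": 0.93,
    "Qwen 3.6 27B": 0.844,
    "MedGemma 1 27B": 0.853,
}


def margin_pp(model, baseline, scores=medqa_acc):
    """Return the accuracy gap between two models in percentage points."""
    return round((scores[model] - scores[baseline]) * 100, 1)


for baseline in ("Qwen 3.6 27B", "MedGemma 1 27B"):
    gap = margin_pp("Reasoning Medical 27B", baseline)
    print(f"Reasoning Medical 27B vs {baseline}: {gap:+.1f} points")
```

On these numbers the margin is about 8.6 points over Qwen 3.6 27B and about 7.7 points over MedGemma 1 27B.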