Demo Screenshot

Qwen2.5-0.5B-Medical-ReasonMed370K

A 0.5-billion-parameter medical reasoning model fine-tuned on the complete ReasonMed 370K dataset. It is built on Qwen2.5-0.5B-Instruct and trained to perform structured clinical reasoning, differential diagnosis, and evidence-based medical question answering.

Model Details

  • Base Model: unsloth/Qwen2.5-0.5B-Instruct
  • Model Size: 0.5B parameters
  • Fine-tuning Method: LoRA via Unsloth
  • Training Dataset: ReasonMed 370K (full dataset)
  • Training Hardware: NVIDIA Tesla T4 (Kaggle free tier)
  • License: Apache 2.0

Training Details

The model was fine-tuned in two stages, each covering roughly half of the ReasonMed dataset:

Stage 1: Fine-tuned on the first 185,000 samples of ReasonMed using LoRA with the following configuration:

  • LoRA rank: 8
  • LoRA alpha: 16
  • Learning rate: 5e-5
  • Batch size: 2 with 16 gradient accumulation steps (effective batch size of 32 per device)
  • Max sequence length: 4096
  • Epochs: 1
  • Optimizer: AdamW 8-bit
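
For readers who want to set up a comparable run, the settings listed above can be collected in plain dictionaries. The key names below follow the keyword arguments commonly used with PEFT's `LoraConfig` (`r`, `lora_alpha`) and Hugging Face `TrainingArguments` (`per_device_train_batch_size`, `gradient_accumulation_steps`, `learning_rate`, `num_train_epochs`, `optim`). That mapping is an assumption, since the original training script is not part of this card, and settings the card does not list (dropout, target modules, scheduler) are deliberately left out.

```python
# Hyperparameters from this model card, grouped by where they would typically be passed.
# The grouping and key names are assumptions; the original training script is not shown.

lora_settings = {
    "r": 8,            # LoRA rank
    "lora_alpha": 16,  # LoRA alpha
}

training_settings = {
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 16,
    "learning_rate": 5e-5,
    "num_train_epochs": 1,
    "optim": "adamw_8bit",  # 8-bit AdamW
}

model_settings = {"max_seq_length": 4096}

# In standard LoRA the adapter update is scaled by lora_alpha / r.
lora_scale = lora_settings["lora_alpha"] / lora_settings["r"]
print(lora_scale)
```

With rank 8 and alpha 16, the standard LoRA scaling factor works out to 2.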

Stage 2: Continued fine-tuning on the remaining 184,983 samples with identical configuration, completing one full pass over the entire 370K dataset.
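
The stage sizes and batch settings above imply the following arithmetic. This is a small sketch that assumes a single device and rounds the final partial batch up to a full optimizer step; the step count a given trainer reports may differ by one depending on how it handles that last batch.

```python
import math

# Sample counts per stage, as stated in this card.
stage_samples = {"stage_1": 185_000, "stage_2": 184_983}

per_device_batch = 2
grad_accum_steps = 16

# Samples consumed per optimizer step on one device.
effective_batch = per_device_batch * grad_accum_steps

# Total samples seen across both stages.
total_samples = sum(stage_samples.values())

# Optimizer steps per stage, counting the final partial batch as one step.
steps = {name: math.ceil(n / effective_batch) for name, n in stage_samples.items()}

print(effective_batch)  # 32
print(total_samples)    # 369983
print(steps)            # {'stage_1': 5782, 'stage_2': 5781}
```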

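Both stages used packing=False, so each sample was processed as its own sequence rather than being concatenated with other samples and split at fixed-length boundaries. Samples longer than the 4096-token maximum sequence length are still subject to that limit.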
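
To illustrate what the packing setting changes, here is a deliberately simplified sketch using toy token lists. The helper names `pack_samples` and `truncate_samples` are hypothetical and written only for this example; real packing implementations differ in details such as separator tokens and attention masking.

```python
def pack_samples(samples, max_len):
    """Simplified packing: concatenate all samples, then cut into fixed-size chunks."""
    stream = [token for sample in samples for token in sample]
    return [stream[i:i + max_len] for i in range(0, len(stream), max_len)]


def truncate_samples(samples, max_len):
    """No packing: each sample stays its own sequence, cut at max_len if too long."""
    return [sample[:max_len] for sample in samples]


toy_samples = [[1, 2, 3], [4, 5], [6, 7, 8, 9, 10]]

packed = pack_samples(toy_samples, max_len=4)
unpacked = truncate_samples(toy_samples, max_len=4)

print(packed)    # [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10]]
print(unpacked)  # [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
```

In the packed output, samples straddle chunk boundaries. In the unpacked output, sample boundaries are preserved, but the five-token sample still loses its last token to the length limit.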

Dataset

This model was trained on ReasonMed, described by its authors as the largest open-source medical reasoning dataset to date. It comprises 370,000 high-quality examples distilled from 1.75 million initial reasoning paths generated by multiple large language models.

ReasonMed is built through a multi-agent verification and refinement pipeline that includes an Error Refiner to correct error-prone reasoning steps. Each example combines detailed chain-of-thought reasoning with a concise answer summary, covering a wide range of medical topics including clinical reasoning, differential diagnosis, pharmacology, and medical question answering.

For more details on the dataset, refer to the official repository: https://github.com/alibaba-damo-academy/ReasonMed

What the Model Can Do

After training on the full ReasonMed dataset, the model demonstrates the ability to:

  • Work through clinical presentations step by step
  • Generate differential diagnoses with reasoning for each option
  • Rule out unlikely diagnoses with justification
  • Provide structured final answers with clinical pearls
  • Reason through medical multiple-choice questions with explanations
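
For the multiple-choice use case in the list above, a question can be turned into a chat message along these lines. The helper `build_mcq_prompt` and its prompt wording are hypothetical, written for illustration only; this card does not specify a required prompt format.

```python
def build_mcq_prompt(question: str, options: dict[str, str]) -> list[dict[str, str]]:
    """Format a multiple-choice question as a single user message (hypothetical layout)."""
    lines = [question.strip(), ""]
    # List the options in label order, e.g. "A. ...", "B. ...".
    for label in sorted(options):
        lines.append(f"{label}. {options[label]}")
    lines.append("")
    lines.append("Explain your reasoning step by step, then state the single best answer.")
    return [{"role": "user", "content": "\n".join(lines)}]


messages = build_mcq_prompt(
    "Which lab pattern is most consistent with primary hypothyroidism?",
    {
        "A": "Low TSH, high free T4",
        "B": "High TSH, low free T4",
        "C": "Low TSH, low free T4",
        "D": "Normal TSH, normal free T4",
    },
)
print(messages[0]["content"])
```

The resulting `messages` list has the same shape as the one passed to `tokenizer.apply_chat_template` in the Usage section.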

Demo

The screenshot above shows the model running through a clinical scenario involving hypothyroidism, demonstrating its ability to identify key symptoms, interpret lab values, and produce a structured response with management guidance.

Limitations

  • As a 0.5B-parameter model, it has a hard ceiling on reasoning depth and factual recall
  • Small models are prone to inconsistency across similar questions
  • The model may occasionally hallucinate clinical details
  • This model is intended for research and educational purposes only
  • It should not be used for real clinical decision making or as a substitute for a qualified medical professional

Usage

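from unsloth import FastLanguageModel

# Load the fine-tuned model in 4-bit to reduce GPU memory use.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name     = "Rumiii/Qwen2.5-0.5B-Medical-ReasonMed370K",
    max_seq_length = 4096,
    load_in_4bit   = True,
)
FastLanguageModel.for_inference(model)  # switch Unsloth to inference mode

messages = [
    {"role": "user", "content": "Your medical question here"}
]

# Apply the chat template; return_dict=True also returns the attention mask.
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize              = True,
    add_generation_prompt = True,
    return_tensors        = "pt",
    return_dict           = True,
).to("cuda")

# Sample a response; the penalties discourage repetitive output.
outputs = model.generate(
    **inputs,
    max_new_tokens       = 1024,
    temperature          = 0.7,
    do_sample            = True,
    repetition_penalty   = 1.3,
    no_repeat_ngram_size = 3,
    top_p                = 0.9,
    top_k                = 50,
)

# Decode only the newly generated tokens, not the echoed prompt.
prompt_length = inputs["input_ids"].shape[-1]
print(tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True))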

Citation

If you use this model, please cite the ReasonMed dataset:

@misc{sun2025reasonmed370kmultiagentgenerated,
      title={ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning}, 
      author={Yu Sun and Xingyu Qian and Weiwen Xu and Hao Zhang and Chenghao Xiao and Long Li and Yu Rong and Wenbing Huang and Qifeng Bai and Tingyang Xu},
      year={2025},
      eprint={2506.09513},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2506.09513}, 
}

Acknowledgements

Training was conducted on Kaggle free tier infrastructure using Unsloth for efficient fine-tuning. The ReasonMed dataset was created by the team at Alibaba DAMO Academy and Tencent AI Lab.
