# Qwen2.5-7B-ReasonMed-cot-123k

Full fine-tune of Qwen/Qwen2.5-7B on the CoTMed variant of ReasonMed: chain-of-thought reasoning followed by the final answer, without `<think>` tags. Trained on 123K samples (one third of the full dataset) for 3 epochs.
Training code: https://github.com/Chen-Jie7/NLP_project
## Training data

lingshu-medical-mllm/ReasonMed — CoTMed.json variant. Outputs are free-form CoT reasoning concluding with the answer.
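For illustration, a minimal sketch of turning one CoTMed-style record into a prompt/response SFT pair. The field names `question` and `cot_response` are assumptions — check CoTMed.json for the actual schema:

```python
def to_sft_example(record):
    """Convert a CoTMed-style record into a prompt/response pair.

    Field names ('question', 'cot_response') are illustrative, not the
    verified CoTMed.json schema. The target text is free-form reasoning
    that ends with the answer, with no <think> delimiters.
    """
    return {
        "prompt": record["question"],
        "response": record["cot_response"],  # reasoning + final answer, no tags
    }

example = to_sft_example({
    "question": "Which vitamin deficiency causes scurvy?",
    "cot_response": "Scurvy results from impaired collagen synthesis... "
                    "The answer is vitamin C.",
})
```

The key point of this variant is that the response field mixes reasoning and answer in one untagged stream, unlike the `reason` variant below.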
## Three-way format comparison

All three models were trained with identical hyperparameters on the same 123K samples and evaluated via loglikelihood MCQ scoring.
| Variant | Output format | Total accuracy (%) |
|---|---|---|
| reason | `<think>CoT</think>` + response | 65.8 |
| cot (this model) | CoT without tags | 65.0 |
| response | Direct answer only | 63.8 |
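Loglikelihood MCQ scoring picks the option whose tokens the model assigns the highest total log-probability when appended to the question. A minimal sketch of the selection step, operating on precomputed per-token log-probs (obtaining them from the model is omitted):

```python
def score_choices(choice_logprobs, length_normalize=False):
    """Pick the answer option with the highest log-likelihood.

    choice_logprobs: dict mapping each option to the list of per-token
    log-probabilities the model assigns to that option's tokens when
    conditioned on the question prompt.
    """
    scores = {}
    for choice, logprobs in choice_logprobs.items():
        total = sum(logprobs)
        # Optional length normalization avoids penalizing longer options.
        scores[choice] = total / len(logprobs) if length_normalize else total
    return max(scores, key=scores.get), scores

# Toy example with made-up log-probs: "B" wins with the highest sum.
pred, scores = score_choices({
    "A": [-2.1, -1.5],
    "B": [-0.4, -0.6],
    "C": [-1.9, -2.2],
})
```

Whether the harness used here applies length normalization is an assumption left open; both conventions are common in MCQ evaluation.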
## Evaluation
| Benchmark | Ours (123K samples) | Paper (370K samples) |
|---|---|---|
| MedQA | 61.0 | 66.9 |
| MedMCQA (val) | 60.4 | 65.1 |
| PubMedQA | 75.9 | 82.0 |
| MMLU-Anatomy | 74.8 | 75.6 |
| MMLU-Clinical-Knowledge | 78.1 | 79.3 |
| MMLU-College-Biology | 81.9 | 79.2 |
| MMLU-College-Medicine | 68.8 | 73.4 |
| MMLU-Medical-Genetics | 84.0 | 85.0 |
| MMLU-Professional-Medicine | 79.0 | 80.9 |
| Total | 65.0 | 69.6 |
## Training hyperparameters

- learning_rate: 1e-05
- effective batch size: 128 (8 GPUs × 4 per-device × 4 gradient accumulation)
- num_epochs: 3.0
- lr_scheduler: cosine, warmup_ratio 0.1
- precision: bf16
- deepspeed: ZeRO stage 2
- hardware: 8× H200, ~6.5h
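The hyperparameters above map roughly onto a LLaMA-Factory SFT config. A minimal sketch (the dataset name, DeepSpeed config path, and `cutoff_len` are assumptions; see the linked training repo for the actual config):

```yaml
### model
model_name_or_path: Qwen/Qwen2.5-7B

### method
stage: sft
do_train: true
finetuning_type: full
deepspeed: ds_z2_config.json   # ZeRO stage 2 (path is an assumption)

### dataset
dataset: reasonmed_cot         # hypothetical name for the CoTMed subset
cutoff_len: 4096               # assumption

### training
per_device_train_batch_size: 4
gradient_accumulation_steps: 4
learning_rate: 1.0e-5
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
```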
## Framework versions

- Transformers 4.57.6
- PyTorch 2.10.0+cu128
- LLaMA-Factory 0.9.5