
GRM-1.5b is a general-purpose, reasoning-focused 1.5B model fine-tuned from Qwen/Qwen2.5-1.5B to improve multi-domain reasoning (math, logic, coding, and broad problem-solving). It is designed to be a strong, lightweight "daily driver" for general reasoning tasks and a solid base for further fine-tuning.
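
Quickstart

A minimal inference sketch using 🤗 Transformers. It assumes GRM-1.5b inherits the standard Qwen2.5 chat template from its Qwen/Qwen2.5-1.5B base; this is an assumption, not an official example from the model authors.

```python
# Minimal inference sketch (assumes a Qwen2.5-style chat template).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "OrionLLM/GRM-1.5b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # weights are published in BF16
    device_map="auto",
)

messages = [{"role": "user", "content": "If 3x + 7 = 22, what is x?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```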


Key features

  • Dedicated reasoning behavior for general tasks (stepwise problem solving, better consistency).
  • Small & efficient (1.5B) — practical for local inference and experimentation.
  • Multi-domain mixture: reasoning + code + math + (some) medical reasoning data.
  • Fine-tune friendly: intended as a good starting point for your own SFT/GRPO/DPO pipelines (see the sketch after this list).
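
As a starting point for further fine-tuning, a hypothetical SFT setup with TRL's SFTTrainer might look like the following. The dataset name and hyperparameters are placeholders, not the recipe used to train GRM-1.5b.

```python
# Hypothetical SFT starting point using TRL; all names below are placeholders.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder dataset: swap in your own chat- or instruction-formatted data.
dataset = load_dataset("your-org/your-reasoning-dataset", split="train")

config = SFTConfig(
    output_dir="grm-1.5b-sft",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=1e-5,
    num_train_epochs=1,
    bf16=True,  # matches the base model's BF16 weights
)

trainer = SFTTrainer(
    model="OrionLLM/GRM-1.5b",  # loaded from the Hub by name
    args=config,
    train_dataset=dataset,
)
trainer.train()
```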

Benchmarks

| Model | AIME24 | AIME25 | AMC23 | MATH500 | HMMT 02/25 | LCB 06/24-01/25 | CodeElo | CodeForces | GPQA-D | JEEBench |
|---|---|---|---|---|---|---|---|---|---|---|
| GRM-1.5b | 52.0 | 41.7 | 87.0 | 86.4 | 27.3 | 39.4 | 12.9 | 15.5 | 29.5 | 51.9 |
| DeepSeek-R1-Distill-Qwen-1.5B | 32.3 | 23.7 | 71.8 | 80.8 | 15.3 | 27.2 | 8.8 | 8.5 | 31.1 | 32.5 |
| Nemotron-Research-Reasoning-Qwen-1.5B | 47.7 | 32.0 | 87.5 | 86.0 | 21.7 | 31.4 | 54.7 | 40.3 | 41.8 | 52.6 |
| Qwen3-1.7B | 52.0 | 35.3 | 83.8 | 87.2 | 23.3 | 27.7 | 20.7 | 20.0 | 49.3 | 60.7 |
| Qwen2.5-1.5B-Instruct | 3.0 | 0.7 | 30.8 | 50.2 | 0.0 | 5.5 | 0.8 | 2.2 | 24.7 | 16.4 |
