These are the GRM2 quantizations; if you want the original safetensors weights, please visit the model page: OrionLLM/GRM2-3b.
A powerful 3B general-purpose reasoning model built for strong multi-domain performance and long-chain reasoning.
GRM2 is a 3B-parameter language model designed for general-purpose, reasoning-focused tasks, with a strong emphasis on multi-domain reasoning across code, mathematics, science, and complex knowledge work. It is optimized for long chains of thought, enabling more structured, accurate, and reliable reasoning on difficult problems.
Despite its compact size, the model achieves strong benchmark performance, making it an efficient choice for users who want a balance between reasoning quality, versatility, and deployability.
## Key Features
- 3B parameters for an efficient balance of capability and usability
- General-purpose reasoning across multiple domains
- Strong performance in code, math, science, and knowledge-intensive tasks
- Optimized for long-chain reasoning and complex problem solving
- Designed to deliver competitive benchmark results despite its small size
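As a sketch of what local inference with one of these quantized files might look like (the quantization filename, context size, and prompt format below are assumptions, not details confirmed by the model page; llama-cpp-python is one common runtime for GGUF files):

```python
# Hypothetical sketch: running a GRM2-3b quantization locally with llama-cpp-python.
# The filename and prompt wrapper are illustrative assumptions -- check the actual
# file list in this repo and the chat template on the OrionLLM/GRM2-3b model page.

def build_prompt(question: str) -> str:
    # Plain instruction wrapper; the model's real chat template may differ.
    return f"Question: {question}\nAnswer:"

if __name__ == "__main__":
    from llama_cpp import Llama  # pip install llama-cpp-python

    llm = Llama(model_path="GRM2-3b-Q4_K_M.gguf", n_ctx=8192)  # hypothetical file
    reply = llm(build_prompt("Factor 391 into primes."), max_tokens=256)
    print(reply["choices"][0]["text"].strip())
```

A smaller quantization trades some benchmark accuracy for lower memory use, so pick the file that fits your hardware.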
## Benchmarks
| Model | LiveCodeBench v6 | HMMT Nov 25 | GPQA / GPQA Diamond | MultiChallenge | AIME 2026 | xBench-DeepSearch-2510 | BFCL-V4 |
|---|---|---|---|---|---|---|---|
| OrionLLM/GRM2-3b | 76.9 | 77.92 | 83.8 | 52.21 | 87.40 | 39.0 | 56.5 |
| Qwen/Qwen3-32B | 55.7 | 57.08 | 68.4 | 38.72 | 75.83 | 8 | 47.90 |
| Qwen/Qwen3.5-4B | 55.8 | 76.8 | 76.2 | 49.0 | N/A | N/A | 50.3 |
| Qwen/Qwen3-8B | 49.4 | 48.33 | 62.0 | 36.30 | 70.42 | 2 | 42.20 |