---
license: apache-2.0
tags:
- reasoning
- chat
- coding
- math
- science
- agent
- tools
base_model:
- OrionLLM/GRM2-3b
---

# GRM2

<p align="center">
  <img src="https://cdn-uploads.huggingface.co/production/uploads/685ea8ff7b4139b6845ce395/YF0kEDYMGJhcM3Lbl2EOD.png" alt="logo" width="100">
</p>
<p align="center">
  <a href="https://huggingface.co/OrionLLM/GRM2-3b/">
    <img src="https://img.shields.io/badge/%F0%9F%A4%97%20HF-OrionLLM%2FGRM2--3b-green" alt="Hugging Face">
  </a>
  <a href="https://huggingface.co/OrionLLM/GRM2-3b-GGUF">
    <img src="https://img.shields.io/badge/Quantizations-GRM2--3b--GGUF-blue" alt="Quantizations">
  </a>
  <a href="https://huggingface.co/OrionLLM">
    <img src="https://img.shields.io/badge/Research-OrionLLM-purple" alt="Research">
  </a>
  <a href="https://huggingface.co/spaces/DedeProGames/GRM2-Chat">
    <img src="https://img.shields.io/badge/Chat%20with%20the%20model-HuggingfaceSpace-5B3DF5?logo=chatbot&logoColor=white" alt="Chat with the model on HuggingfaceSpace">
  </a>
  <a href="https://www.apache.org/licenses/LICENSE-2.0">
    <img src="https://img.shields.io/badge/License-Apache%202.0-green" alt="License">
  </a>
</p>

## 1. Introduction

GRM2 is a **3B-parameter language model designed for general-purpose, reasoning-focused tasks**, with a strong emphasis on **multi-domain reasoning** across code, mathematics, science, and complex knowledge tasks. It is optimized for **long chains of thought**, enabling more structured, accurate, and reliable reasoning on difficult problems.

Despite its compact size, the model achieves **strong benchmark performance**, making it an efficient choice for users who want a balance of reasoning quality, versatility, and deployability.

## 2. Key Capabilities

- **Deep Reasoning at Speed:** GRM2 delivers high performance on reasoning-heavy tasks and can compete with, and in some cases surpass, much larger 30B-class models.
- **A Robust Engine for Coding & Agents:** Despite having only 3B parameters, GRM2 generates large, consistent code outputs, making it a strong choice for agentic workflows on personal devices.
- **Accessible Local Deployment:** Optimized for accessibility, GRM2 runs well in local environments, making it a strong option for local inference across a wide range of hardware.
- **Efficient Long Context:** The model supports a cost-efficient **256K-token context window**, enabling long, coherent chains of reasoning over extended inputs.
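
To try these capabilities locally, a minimal inference sketch with the 🤗 Transformers library is shown below. The model id comes from this card; the assumption that the checkpoint is a standard causal LM with a chat template, as well as the generation settings, are illustrative and not official usage guidance:

```python
# Minimal local-inference sketch using 🤗 Transformers.
# Assumes GRM2 ships a standard causal-LM checkpoint with a chat template;
# dtype/device/generation settings are illustrative, not official recommendations.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "OrionLLM/GRM2-3b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the checkpoint's native precision
    device_map="auto",    # place layers on available GPU/CPU automatically
)

messages = [
    {"role": "user", "content": "Prove that the sum of two odd integers is even."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=1024)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

For CPU-only or low-memory setups, the GGUF quantizations linked above are likely the better deployment path.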

## 3. Performance

GRM2 delivers performance comparable to much larger models while remaining open, small, and efficient.

<p align="center">
  <img src="https://cdn-uploads.huggingface.co/production/uploads/685ea8ff7b4139b6845ce395/c3m3C_XipLqjwjbcvsNWy.png" alt="benchmark results" width="1000">
</p>

### Detailed Benchmarks

| Model | LiveCodeBench v6 | HMMT Nov 25 | GPQA / GPQA Diamond | MultiChallenge | AIME 2026 | xBench-DeepSearch-2510 | BFCL-V4 |
|---|---:|---:|---:|---:|---:|---:|---:|
| OrionLLM/GRM2-3b | **76.9** | **77.92** | **83.8** | **52.21** | **87.40** | **39.0** | 56.5 |
| Qwen/Qwen3-32B | 55.7 | 57.08 | 68.4 | 38.72 | 75.83 | 8.0 | 47.90 |
| OpenAI/o3-mini | 76.4 | N/A | 79.7 | 39.89 | 86.5 | N/A | **65.12** |