ReasoningModel

1. Introduction

ReasoningModel is optimized for complex reasoning tasks. This checkpoint was selected for its combined performance on the math reasoning and logical reasoning benchmarks.
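The card does not say how the two benchmark scores were combined. As a minimal sketch, assuming an unweighted mean of the two scores (an assumption, not the documented selection rule), the selection step could look like the following. The scores are copied from the results table in this card; the function names `combined_score` and `select_best` are hypothetical.

```python
# Math and logical reasoning scores, copied from the results table in this card.
candidates = {
    "ReasonBase": {"math_reasoning": 0.510, "logical_reasoning": 0.789},
    "ReasonPro": {"math_reasoning": 0.535, "logical_reasoning": 0.801},
    "ReasoningModel": {"math_reasoning": 0.550, "logical_reasoning": 0.819},
}


def combined_score(scores):
    """Combine the two reasoning scores.

    Assumption: an unweighted mean. The card does not state the actual rule.
    """
    return (scores["math_reasoning"] + scores["logical_reasoning"]) / 2


def select_best(candidates):
    """Return the name of the candidate with the highest combined score."""
    return max(candidates, key=lambda name: combined_score(candidates[name]))


if __name__ == "__main__":
    for name, scores in candidates.items():
        print(f"{name}: {combined_score(scores):.4f}")
    print("selected:", select_best(candidates))
```

A weighted combination would only require changing `combined_score`; the ranking here is the same under any positive weights, since ReasoningModel leads on both benchmarks.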

Compared to general-purpose models, ReasoningModel demonstrates significantly improved performance on tasks requiring multi-step reasoning, mathematical computation, and logical inference.

2. Evaluation Results

Comprehensive Benchmark Results

| Category                 | Benchmark             | ReasonBase | ReasonPro | ReasoningModel |
|--------------------------|-----------------------|------------|-----------|----------------|
| Core Reasoning Tasks     | Math Reasoning        | 0.510      | 0.535     | 0.550          |
|                          | Logical Reasoning     | 0.789      | 0.801     | 0.819          |
|                          | Common Sense          | 0.716      | 0.702     | 0.736          |
| Language Understanding   | Reading Comprehension | 0.671      | 0.685     | 0.700          |
|                          | Question Answering    | 0.582      | 0.599     | 0.607          |
|                          | Text Classification   | 0.803      | 0.811     | 0.828          |
|                          | Sentiment Analysis    | 0.777      | 0.781     | 0.792          |
| Generation Tasks         | Code Generation       | 0.615      | 0.631     | 0.650          |
|                          | Creative Writing      | 0.588      | 0.579     | 0.610          |
|                          | Dialogue Generation   | 0.621      | 0.635     | 0.644          |
|                          | Summarization         | 0.745      | 0.755     | 0.767          |
| Specialized Capabilities | Translation           | 0.782      | 0.799     | 0.804          |
|                          | Knowledge Retrieval   | 0.651      | 0.668     | 0.676          |
|                          | Instruction Following | 0.733      | 0.749     | 0.758          |
|                          | Safety Evaluation     | 0.718      | 0.701     | 0.739          |

Reasoning Performance Highlight

ReasoningModel scores 0.550 on math reasoning and 0.819 on logical reasoning, the highest of the three models compared on both benchmarks, which makes it the strongest option of the three for reasoning-intensive applications.
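To make the size of the margin concrete, the following sketch computes ReasoningModel's absolute gain over each baseline on the two reasoning benchmarks. The scores are taken from the results table; the helper name `absolute_gains` is hypothetical and the rounding to three decimals is only for display.

```python
# Reasoning benchmark scores, copied from the results table in this card.
scores = {
    "Math Reasoning": {"ReasonBase": 0.510, "ReasonPro": 0.535, "ReasoningModel": 0.550},
    "Logical Reasoning": {"ReasonBase": 0.789, "ReasonPro": 0.801, "ReasoningModel": 0.819},
}


def absolute_gains(scores, target="ReasoningModel"):
    """Return target-minus-baseline score differences per benchmark."""
    gains = {}
    for benchmark, row in scores.items():
        gains[benchmark] = {
            baseline: round(row[target] - value, 3)
            for baseline, value in row.items()
            if baseline != target
        }
    return gains


if __name__ == "__main__":
    for benchmark, row in absolute_gains(scores).items():
        for baseline, gain in row.items():
            print(f"{benchmark}: +{gain:.3f} over {baseline}")
```

These are absolute differences in benchmark score; the card does not report variance or confidence intervals, so the sketch makes no claim about statistical significance.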

3. License

Apache-2.0 License

4. Contact

Open an issue on GitHub.
