ReasoningModel

1. Introduction

ReasoningModel is optimized for complex reasoning tasks. This checkpoint was selected based on its combined performance on the math reasoning and logical reasoning benchmarks.
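As a rough illustration of selecting by combined reasoning performance, the sketch below ranks candidates by the mean of their math and logical reasoning scores. The unweighted mean is an assumption, since this card does not state how the two benchmarks are combined, and the function names `combined_score` and `select_best` are hypothetical. The scores are those reported in the evaluation results of this card.

```python
# Math and logical reasoning scores as reported in this card's benchmark table.
SCORES = {
    "ReasonBase": {"math_reasoning": 0.510, "logical_reasoning": 0.789},
    "ReasonPro": {"math_reasoning": 0.535, "logical_reasoning": 0.801},
    "ReasoningModel": {"math_reasoning": 0.550, "logical_reasoning": 0.819},
}


def combined_score(scores):
    """Combine the two reasoning scores.

    Assumption: an unweighted mean. The card does not specify the
    actual combination rule used for checkpoint selection.
    """
    return (scores["math_reasoning"] + scores["logical_reasoning"]) / 2


def select_best(candidates):
    """Return the name of the candidate with the highest combined score."""
    return max(candidates, key=lambda name: combined_score(candidates[name]))


if __name__ == "__main__":
    for name, scores in SCORES.items():
        print(f"{name}: {combined_score(scores):.4f}")
    print("selected:", select_best(SCORES))
```

Under this assumed rule, a different weighting of the two benchmarks would leave the ranking unchanged here, because ReasoningModel has the highest score on both.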

Compared to general-purpose models, ReasoningModel demonstrates significantly improved performance on tasks requiring multi-step reasoning, mathematical computation, and logical inference.

2. Evaluation Results

Comprehensive Benchmark Results

Category	Benchmark	ReasonBase	ReasonPro	ReasoningModel
Core Reasoning Tasks	Math Reasoning	0.510	0.535	0.550
	Logical Reasoning	0.789	0.801	0.819
	Common Sense	0.716	0.702	0.736
Language Understanding	Reading Comprehension	0.671	0.685	0.700
	Question Answering	0.582	0.599	0.607
	Text Classification	0.803	0.811	0.828
	Sentiment Analysis	0.777	0.781	0.792
Generation Tasks	Code Generation	0.615	0.631	0.650
	Creative Writing	0.588	0.579	0.610
	Dialogue Generation	0.621	0.635	0.644
	Summarization	0.745	0.755	0.767
Specialized Capabilities	Translation	0.782	0.799	0.804
	Knowledge Retrieval	0.651	0.668	0.676
	Instruction Following	0.733	0.749	0.758
	Safety Evaluation	0.718	0.701	0.739
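To make the table easier to compare at a glance, the following sketch computes ReasoningModel's margin over the stronger of the two baselines on each benchmark. The scores are copied from the table above. The helper name `margin_over_best_baseline` and the choice to round to three decimals are illustrative assumptions, not part of any released tooling.

```python
# (ReasonBase, ReasonPro, ReasoningModel) scores copied from the table above.
RESULTS = {
    "Math Reasoning": (0.510, 0.535, 0.550),
    "Logical Reasoning": (0.789, 0.801, 0.819),
    "Common Sense": (0.716, 0.702, 0.736),
    "Reading Comprehension": (0.671, 0.685, 0.700),
    "Question Answering": (0.582, 0.599, 0.607),
    "Text Classification": (0.803, 0.811, 0.828),
    "Sentiment Analysis": (0.777, 0.781, 0.792),
    "Code Generation": (0.615, 0.631, 0.650),
    "Creative Writing": (0.588, 0.579, 0.610),
    "Dialogue Generation": (0.621, 0.635, 0.644),
    "Summarization": (0.745, 0.755, 0.767),
    "Translation": (0.782, 0.799, 0.804),
    "Knowledge Retrieval": (0.651, 0.668, 0.676),
    "Instruction Following": (0.733, 0.749, 0.758),
    "Safety Evaluation": (0.718, 0.701, 0.739),
}


def margin_over_best_baseline(row):
    """ReasoningModel's score minus the better of the two baseline scores."""
    base, pro, model = row
    # Round to three decimals to match the precision reported in the table.
    return round(model - max(base, pro), 3)


margins = {name: margin_over_best_baseline(row) for name, row in RESULTS.items()}
wins = [name for name, margin in margins.items() if margin > 0]

if __name__ == "__main__":
    for name, margin in sorted(margins.items(), key=lambda item: item[1]):
        print(f"{name:<24}{margin:+.3f}")
    print(f"{len(wins)} of {len(RESULTS)} benchmarks led by ReasoningModel")
```

With the reported figures, ReasoningModel leads on all 15 benchmarks, with margins ranging from 0.005 (Translation) to 0.022 (Creative Writing).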

Reasoning Performance Highlight

ReasoningModel scores 0.550 on Math Reasoning and 0.819 on Logical Reasoning, ahead of both ReasonBase (0.510 and 0.789) and ReasonPro (0.535 and 0.801). Of the three models compared here, it is the strongest choice for reasoning-intensive applications.

3. License

Apache-2.0 License

4. Contact

Open an issue on GitHub.
