SOTAagi2030/ReasoningModel-Best


	---
	license: apache-2.0
	library_name: transformers
	---
	# ReasoningModel

	<!-- markdownlint-disable first-line-h1 -->
	<!-- markdownlint-disable html -->
	<!-- markdownlint-disable no-duplicate-header -->

	<div align="center">
	<img src="figures/fig1.png" width="60%" alt="ReasoningModel" />
	</div>
	<hr>

	<div align="center" style="line-height: 1;">
	<a href="LICENSE" style="margin: 2px;">
	<img alt="License" src="figures/fig2.png" style="display: inline-block; vertical-align: middle;"/>
	</a>
	</div>

	## 1. Introduction

	ReasoningModel is optimized for complex reasoning tasks. This checkpoint was selected for its combined performance on the math reasoning and logical reasoning benchmarks.

	<p align="center">
	<img width="80%" src="figures/fig3.png">
	</p>

	Compared to general-purpose models, ReasoningModel demonstrates significantly improved performance on tasks requiring multi-step reasoning, mathematical computation, and logical inference.

	## 2. Evaluation Results

	### Comprehensive Benchmark Results

	<div align="center">

	| Category | Benchmark | ReasonBase | ReasonPro | ReasoningModel |
	|---|---|---|---|---|
	| Core Reasoning Tasks | Math Reasoning | 0.510 | 0.535 | 0.550 |
	| | Logical Reasoning | 0.789 | 0.801 | 0.819 |
	| | Common Sense | 0.716 | 0.702 | 0.736 |
	| Language Understanding | Reading Comprehension | 0.671 | 0.685 | 0.700 |
	| | Question Answering | 0.582 | 0.599 | 0.607 |
	| | Text Classification | 0.803 | 0.811 | 0.828 |
	| | Sentiment Analysis | 0.777 | 0.781 | 0.792 |
	| Generation Tasks | Code Generation | 0.615 | 0.631 | 0.650 |
	| | Creative Writing | 0.588 | 0.579 | 0.610 |
	| | Dialogue Generation | 0.621 | 0.635 | 0.644 |
	| | Summarization | 0.745 | 0.755 | 0.767 |
	| Specialized Capabilities | Translation | 0.782 | 0.799 | 0.804 |
	| | Knowledge Retrieval | 0.651 | 0.668 | 0.676 |
	| | Instruction Following | 0.733 | 0.749 | 0.758 |
	| | Safety Evaluation | 0.718 | 0.701 | 0.739 |

	</div>
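
To make the table easier to check, here is a minimal sketch that computes ReasoningModel's margin over the better of the two baselines on each benchmark. The scores are copied from the table; the `SCORES` dictionary and the `margins` helper are illustrative names, not part of any released tooling.

```python
# Scores copied from the benchmark table: (ReasonBase, ReasonPro, ReasoningModel).
SCORES = {
    "Math Reasoning": (0.510, 0.535, 0.550),
    "Logical Reasoning": (0.789, 0.801, 0.819),
    "Common Sense": (0.716, 0.702, 0.736),
    "Reading Comprehension": (0.671, 0.685, 0.700),
    "Question Answering": (0.582, 0.599, 0.607),
    "Text Classification": (0.803, 0.811, 0.828),
    "Sentiment Analysis": (0.777, 0.781, 0.792),
    "Code Generation": (0.615, 0.631, 0.650),
    "Creative Writing": (0.588, 0.579, 0.610),
    "Dialogue Generation": (0.621, 0.635, 0.644),
    "Summarization": (0.745, 0.755, 0.767),
    "Translation": (0.782, 0.799, 0.804),
    "Knowledge Retrieval": (0.651, 0.668, 0.676),
    "Instruction Following": (0.733, 0.749, 0.758),
    "Safety Evaluation": (0.718, 0.701, 0.739),
}


def margins(scores):
    """Map each benchmark to ReasoningModel's score minus the better baseline."""
    return {
        name: round(model - max(base, pro), 3)
        for name, (base, pro, model) in scores.items()
    }


if __name__ == "__main__":
    # Print benchmarks from the smallest margin to the largest.
    for name, delta in sorted(margins(SCORES).items(), key=lambda kv: kv[1]):
        print(f"{name:<24} {delta:+.3f}")
```

On these numbers the margin is positive on all 15 benchmarks, ranging from 0.005 on Translation to 0.022 on Creative Writing.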

	### Reasoning Performance Highlight

	ReasoningModel scores 0.550 on Math Reasoning and 0.819 on Logical Reasoning, the highest of the three models in the table above on both benchmarks, which makes it a strong choice for reasoning-intensive applications.
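
The card does not say how the math and logical reasoning scores are combined. As a rough sketch, assuming an unweighted mean of the two (an assumption, not the documented selection rule), the three models in the table can be ranked as follows. `REASONING_SCORES`, `combined_score`, and `best_model` are hypothetical names used only for illustration.

```python
from statistics import mean

# Math and logical reasoning scores copied from the benchmark table.
REASONING_SCORES = {
    "ReasonBase": {"math": 0.510, "logic": 0.789},
    "ReasonPro": {"math": 0.535, "logic": 0.801},
    "ReasoningModel": {"math": 0.550, "logic": 0.819},
}


def combined_score(scores):
    """Unweighted mean of the two reasoning scores (assumed combination rule)."""
    return mean(list(scores.values()))


def best_model(table):
    """Return the model name with the highest combined reasoning score."""
    return max(table, key=lambda name: combined_score(table[name]))


if __name__ == "__main__":
    for name, scores in REASONING_SCORES.items():
        print(f"{name:<16} {combined_score(scores):.4f}")
    print("best:", best_model(REASONING_SCORES))
```

Because ReasoningModel leads on both benchmarks individually, any weighting with non-negative weights would give the same ranking; the unweighted mean is just the simplest choice.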

	## 3. License
	[Apache-2.0 License](LICENSE)

	## 4. Contact
	Open an issue on GitHub.