---
license: apache-2.0
library_name: transformers
---
# AssistantModel

<!-- markdownlint-disable first-line-h1 -->
<!-- markdownlint-disable html -->
<!-- markdownlint-disable no-duplicate-header -->

<div align="center">
  <img src="figures/fig1.png" width="60%" alt="AssistantModel" />
</div>
<hr>

<div align="center" style="line-height: 1;">
  <a href="LICENSE" style="margin: 2px;">
    <img alt="License" src="figures/fig2.png" style="display: inline-block; vertical-align: middle;"/>
  </a>
</div>

## 1. Introduction

AssistantModel is designed for interactive assistant applications. This checkpoint was selected for its combined performance on the knowledge retrieval and instruction following benchmarks, which makes it well suited to AI assistant deployment.

<p align="center">
  <img width="80%" src="figures/fig3.png">
</p>
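
The selection criterion described above can be sketched in a few lines of Python. The scores are the Knowledge Retrieval and Instruction Following values from the evaluation table in Section 2. The card does not say how the two benchmarks are combined, so the unweighted mean used here, and the names `SCORES` and `combined_score`, are assumptions for illustration only.

```python
# Knowledge Retrieval and Instruction Following scores, copied from the
# evaluation table in Section 2 of this card.
SCORES = {
    "Assistant-v1": {"knowledge_retrieval": 0.651, "instruction_following": 0.733},
    "Assistant-v2": {"knowledge_retrieval": 0.668, "instruction_following": 0.749},
    "AssistantModel": {"knowledge_retrieval": 0.752, "instruction_following": 0.835},
}


def combined_score(scores):
    """Unweighted mean of the two benchmarks.

    Assumption: the card gives no formula, so a plain average is used here.
    """
    return (scores["knowledge_retrieval"] + scores["instruction_following"]) / 2


# Rank checkpoints from highest to lowest combined score.
ranking = sorted(SCORES, key=lambda name: combined_score(SCORES[name]), reverse=True)
best = ranking[0]

for name in ranking:
    print(f"{name}: {combined_score(SCORES[name]):.4f}")
```

Under this assumed averaging, AssistantModel ranks first among the three checkpoints. A weighted combination would need weights that the card does not provide.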

## 2. Evaluation Results

### Comprehensive Benchmark Results

<div align="center">

| | Benchmark | Assistant-v1 | Assistant-v2 | AssistantModel |
|---|---|---|---|---|
| **Core Reasoning Tasks** | Math Reasoning | 0.510 | 0.535 | 0.606 |
| | Logical Reasoning | 0.789 | 0.801 | 0.871 |
| | Common Sense | 0.716 | 0.702 | 0.789 |
| **Language Understanding** | Reading Comprehension | 0.671 | 0.685 | 0.759 |
| | Question Answering | 0.582 | 0.599 | 0.678 |
| | Text Classification | 0.803 | 0.811 | 0.859 |
| | Sentiment Analysis | 0.777 | 0.781 | 0.831 |
| **Generation Tasks** | Code Generation | 0.615 | 0.631 | 0.679 |
| | Creative Writing | 0.588 | 0.579 | 0.634 |
| | Dialogue Generation | 0.621 | 0.635 | 0.684 |
| | Summarization | 0.745 | 0.755 | 0.800 |
| **Specialized Capabilities** | Translation | 0.782 | 0.799 | 0.843 |
| | Knowledge Retrieval | 0.651 | 0.668 | 0.752 |
| | Instruction Following | 0.733 | 0.749 | 0.835 |
| | Safety Evaluation | 0.718 | 0.701 | 0.767 |

</div>
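
To read the table as a comparison against the previous checkpoint, the sketch below computes the absolute gain of AssistantModel over Assistant-v2 for each benchmark. The numbers are copied from the table above. The names `RESULTS`, `gains`, and `mean_gain` are illustrative, and treating the scores as directly comparable fractions is an assumption, since the card does not state the metric behind each benchmark.

```python
# (Assistant-v2, AssistantModel) scores copied from the table above.
RESULTS = {
    "Math Reasoning": (0.535, 0.606),
    "Logical Reasoning": (0.801, 0.871),
    "Common Sense": (0.702, 0.789),
    "Reading Comprehension": (0.685, 0.759),
    "Question Answering": (0.599, 0.678),
    "Text Classification": (0.811, 0.859),
    "Sentiment Analysis": (0.781, 0.831),
    "Code Generation": (0.631, 0.679),
    "Creative Writing": (0.579, 0.634),
    "Dialogue Generation": (0.635, 0.684),
    "Summarization": (0.755, 0.800),
    "Translation": (0.799, 0.843),
    "Knowledge Retrieval": (0.668, 0.752),
    "Instruction Following": (0.749, 0.835),
    "Safety Evaluation": (0.701, 0.767),
}

# Absolute gain of AssistantModel over Assistant-v2 for each benchmark.
gains = {name: new - old for name, (old, new) in RESULTS.items()}

largest = max(gains, key=gains.get)
smallest = min(gains, key=gains.get)
mean_gain = sum(gains.values()) / len(gains)

# Print benchmarks from largest to smallest gain.
for name, gain in sorted(gains.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<24} +{gain:.3f}")
print(f"mean gain over Assistant-v2: +{mean_gain:.3f}")
```

With the table's values, every benchmark improves over Assistant-v2. Common Sense shows the largest absolute gain and Translation the smallest.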

## 3. License
[Apache-2.0 License](LICENSE)

## 4. Contact
Open an issue on GitHub.
