---
license: apache-2.0
library_name: transformers
---
# AssistantModel

<!-- markdownlint-disable first-line-h1 -->
<!-- markdownlint-disable html -->
<!-- markdownlint-disable no-duplicate-header -->

<div align="center">
  <img src="figures/fig1.png" width="60%" alt="AssistantModel" />
</div>
<hr>

<div align="center" style="line-height: 1;">
  <a href="LICENSE" style="margin: 2px;">
    <img alt="License" src="figures/fig2.png" style="display: inline-block; vertical-align: middle;"/>
  </a>
</div>

## 1. Introduction

AssistantModel is designed for interactive assistant applications. This checkpoint was selected for its combined performance on the knowledge retrieval and instruction following benchmarks, which makes it well suited to AI assistant deployment.

<p align="center">
  <img width="80%" src="figures/fig3.png">
</p>

## 2. Evaluation Results

### Comprehensive Benchmark Results

<div align="center">

| Category | Benchmark | Assistant-v1 | Assistant-v2 | AssistantModel |
|---|---|---|---|---|
| **Core Reasoning Tasks** | Math Reasoning | 0.510 | 0.535 | 0.606 |
| | Logical Reasoning | 0.789 | 0.801 | 0.871 |
| | Common Sense | 0.716 | 0.702 | 0.789 |
| **Language Understanding** | Reading Comprehension | 0.671 | 0.685 | 0.759 |
| | Question Answering | 0.582 | 0.599 | 0.678 |
| | Text Classification | 0.803 | 0.811 | 0.859 |
| | Sentiment Analysis | 0.777 | 0.781 | 0.831 |
| **Generation Tasks** | Code Generation | 0.615 | 0.631 | 0.679 |
| | Creative Writing | 0.588 | 0.579 | 0.634 |
| | Dialogue Generation | 0.621 | 0.635 | 0.684 |
| | Summarization | 0.745 | 0.755 | 0.800 |
| **Specialized Capabilities** | Translation | 0.782 | 0.799 | 0.843 |
| | Knowledge Retrieval | 0.651 | 0.668 | 0.752 |
| | Instruction Following | 0.733 | 0.749 | 0.835 |
| | Safety Evaluation | 0.718 | 0.701 | 0.767 |

</div>
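
The introduction says this checkpoint was chosen on combined knowledge retrieval and instruction following performance, but the card does not say how the two scores are combined. As a minimal sketch, assuming an unweighted mean (an assumption, not the documented selection rule), the comparison can be reproduced from the table above. The names `SCORES`, `combined_score`, and `best_checkpoint` are hypothetical and exist only for this illustration.

```python
# Scores copied from the Knowledge Retrieval and Instruction Following
# rows of the benchmark table above.
SCORES = {
    "Assistant-v1": {"Knowledge Retrieval": 0.651, "Instruction Following": 0.733},
    "Assistant-v2": {"Knowledge Retrieval": 0.668, "Instruction Following": 0.749},
    "AssistantModel": {"Knowledge Retrieval": 0.752, "Instruction Following": 0.835},
}


def combined_score(scores):
    """Unweighted mean of the selection benchmarks (assumed combination rule)."""
    return sum(scores.values()) / len(scores)


def best_checkpoint(table):
    """Return the model name with the highest combined score."""
    return max(table, key=lambda name: combined_score(table[name]))


if __name__ == "__main__":
    for name, scores in SCORES.items():
        print(f"{name}: {combined_score(scores):.4f}")
    print("selected:", best_checkpoint(SCORES))
```

Under this assumption AssistantModel ranks first. Because it also leads both benchmarks individually, any positive weighting of the two scores gives the same ranking.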

## 3. License
[Apache-2.0 License](LICENSE)

## 4. Contact
Open an issue on GitHub.