Upload README.md with huggingface_hub
README.md
CHANGED
@@ -18,8 +18,6 @@ library_name: transformers
 </a>
 </div>
 
-Best checkpoint: step_1000
-
 ## 1. Introduction
 
 The MyAwesomeModel has undergone a significant version upgrade. In the latest update, MyAwesomeModel has significantly improved its depth of reasoning and inference capabilities by leveraging increased computational resources and introducing algorithmic optimization mechanisms during post-training. The model has demonstrated outstanding performance across various benchmark evaluations, including mathematics, programming, and general logic. Its overall performance is now approaching that of other leading models.
@@ -34,27 +32,29 @@ Beyond its improved reasoning capabilities, this version also offers a reduced h
 
 ## 2. Evaluation Results
 
+Best checkpoint: step_400
+
 ### Comprehensive Benchmark Results
 
 <div align="center">
 
 | | Benchmark | Model1 | Model2 | Model1-v2 | MyAwesomeModel |
 |---|---|---|---|---|---|
-| **Core Reasoning Tasks** | Math Reasoning | 0.510 | 0.535 | 0.521 | 0.
-| | Logical Reasoning | 0.789 | 0.801 | 0.810 | 0.
-| | Common Sense | 0.716 | 0.702 | 0.725 | 0.
-| **Language Understanding** | Reading Comprehension | 0.671 | 0.685 | 0.690 | 0.
-| | Question Answering | 0.582 | 0.599 | 0.601 | 0.
-| | Text Classification | 0.803 | 0.811 | 0.820 | 0.
-| | Sentiment Analysis | 0.777 | 0.781 | 0.790 | 0.
-| **Generation Tasks** | Code Generation | 0.615 | 0.631 | 0.640 | 0.
-| | Creative Writing | 0.588 | 0.579 | 0.601 | 0.
-| | Dialogue Generation | 0.621 | 0.635 | 0.639 | 0.
-| | Summarization | 0.745 | 0.755 | 0.760 | 0.
-| **Specialized Capabilities**| Translation | 0.782 | 0.799 | 0.801 | 0.
-| | Knowledge Retrieval | 0.651 | 0.668 | 0.670 | 0.
-| | Instruction Following | 0.733 | 0.749 | 0.751 | 0.
-| | Safety Evaluation | 0.718 | 0.701 | 0.725 | 0.
+| **Core Reasoning Tasks** | Math Reasoning | 0.510 | 0.535 | 0.521 | 0.659894 |
+| | Logical Reasoning | 0.789 | 0.801 | 0.810 | 0.740000 |
+| | Common Sense | 0.716 | 0.702 | 0.725 | 0.705000 |
+| **Language Understanding** | Reading Comprehension | 0.671 | 0.685 | 0.690 | 0.680000 |
+| | Question Answering | 0.582 | 0.599 | 0.601 | 0.602000 |
+| | Text Classification | 0.803 | 0.811 | 0.820 | 0.826000 |
+| | Sentiment Analysis | 0.777 | 0.781 | 0.790 | 0.788000 |
+| **Generation Tasks** | Code Generation | 0.615 | 0.631 | 0.640 | 0.649000 |
+| | Creative Writing | 0.588 | 0.579 | 0.601 | 0.589000 |
+| | Dialogue Generation | 0.621 | 0.635 | 0.639 | 0.642000 |
+| | Summarization | 0.745 | 0.755 | 0.760 | 0.756000 |
+| **Specialized Capabilities**| Translation | 0.782 | 0.799 | 0.801 | 0.802000 |
+| | Knowledge Retrieval | 0.651 | 0.668 | 0.670 | 0.669000 |
+| | Instruction Following | 0.733 | 0.749 | 0.751 | 0.745000 |
+| | Safety Evaluation | 0.718 | 0.701 | 0.725 | 0.727000 |
 
 </div>
 
@@ -124,4 +124,4 @@ This code repository is licensed under the [MIT License](LICENSE). The use of My
 
 ## 6. Contact
 If you have any questions, please raise an issue on our GitHub repository or contact us at contact@MyAwesomeModel.ai.
-```
+```