File size: 2,391 Bytes
a20d36d | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 | ---
license: mit
library_name: transformers
---
# LatestModel
<!-- markdownlint-disable first-line-h1 -->
<!-- markdownlint-disable html -->
<!-- markdownlint-disable no-duplicate-header -->
<div align="center">
<img src="figures/fig1.png" width="60%" alt="LatestModel" />
</div>
<hr>
<div align="center" style="line-height: 1;">
<a href="LICENSE" style="margin: 2px;">
<img alt="License" src="figures/fig2.png" style="display: inline-block; vertical-align: middle;"/>
</a>
</div>
## 1. Introduction
LatestModel represents our most recent release, trained with the latest techniques and data. This checkpoint is the most up-to-date version of our training run.
<p align="center">
<img width="80%" src="figures/fig3.png">
</p>
This model is the final checkpoint from our complete training run, offering the most comprehensive coverage of our training data.
## 2. Evaluation Results
### Comprehensive Benchmark Results
<div align="center">
| | Benchmark | ModelA | ModelB | LatestModel |
|---|---|---|---|---|
| **Core Reasoning Tasks** | Math Reasoning | 0.510 | 0.535 | 0.550 |
| | Logical Reasoning | 0.789 | 0.801 | 0.819 |
| | Common Sense | 0.716 | 0.702 | 0.736 |
| **Language Understanding** | Reading Comprehension | 0.671 | 0.685 | 0.700 |
| | Question Answering | 0.582 | 0.599 | 0.607 |
| | Text Classification | 0.803 | 0.811 | 0.828 |
| | Sentiment Analysis | 0.777 | 0.781 | 0.792 |
| **Generation Tasks** | Code Generation | 0.615 | 0.631 | 0.650 |
| | Creative Writing | 0.588 | 0.579 | 0.610 |
| | Dialogue Generation | 0.621 | 0.635 | 0.644 |
| | Summarization | 0.745 | 0.755 | 0.767 |
| **Specialized Capabilities**| Translation | 0.782 | 0.799 | 0.804 |
| | Knowledge Retrieval | 0.651 | 0.668 | 0.676 |
| | Instruction Following | 0.733 | 0.749 | 0.758 |
| | Safety Evaluation | 0.718 | 0.701 | 0.739 |
</div>
### Overall Performance Summary
LatestModel achieves strong results across all evaluated benchmarks as our most recent trained model.
## 3. How to Use
Please refer to our code repository for usage instructions.
### System Prompt
```
You are LatestModel, a helpful AI assistant.
Today is {current date}.
```
### Temperature
We recommend setting the temperature to 0.6.
## 4. License
This repository is licensed under the [MIT License](LICENSE).
## 5. Contact
If you have questions, please raise an issue on our GitHub repository.
|