MultiLangModel
1. Introduction
MultiLangModel excels at translation and multilingual tasks. This checkpoint was selected because it achieved the best translation benchmark score.
2. Evaluation Results
Comprehensive Benchmark Results
| Category | Benchmark | MLModel-v1 | MLModel-v2 | MultiLangModel |
|---|---|---|---|---|
| Core Reasoning Tasks | Math Reasoning | 0.510 | 0.535 | 0.550 |
| | Logical Reasoning | 0.789 | 0.801 | 0.736 |
| | Common Sense | 0.716 | 0.702 | 0.700 |
| Language Understanding | Reading Comprehension | 0.671 | 0.685 | 0.644 |
| | Question Answering | 0.582 | 0.599 | 0.819 |
| | Text Classification | 0.803 | 0.811 | 0.792 |
| | Sentiment Analysis | 0.777 | 0.781 | 0.607 |
| Generation Tasks | Code Generation | 0.615 | 0.631 | 0.828 |
| | Creative Writing | 0.588 | 0.579 | 0.758 |
| | Dialogue Generation | 0.621 | 0.635 | 0.767 |
| | Summarization | 0.745 | 0.755 | 0.804 |
| Specialized Capabilities | Translation | 0.782 | 0.799 | 0.804 |
| | Knowledge Retrieval | 0.651 | 0.668 | 0.610 |
| | Instruction Following | 0.733 | 0.749 | 0.758 |
| | Safety Evaluation | 0.718 | 0.701 | 0.739 |
Overall Performance Summary
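MultiLangModel achieves the highest Translation score of the three models (0.804, versus 0.799 for MLModel-v2 and 0.782 for MLModel-v1). It also leads on eight other benchmarks, with the largest gains in Question Answering and Code Generation. It trails both earlier models on Logical Reasoning, Common Sense, Reading Comprehension, Text Classification, Sentiment Analysis, and Knowledge Retrieval.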
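To check the summary against the table, the per-benchmark comparison can be sketched in a few lines of Python. This is only an illustration: the scores are copied from the table above, and the names `SCORES` and `compare` are hypothetical helpers, not part of any tooling shipped with the model.

```python
# Scores copied from the benchmark table, in the order
# (MLModel-v1, MLModel-v2, MultiLangModel).
SCORES = {
    "Math Reasoning": (0.510, 0.535, 0.550),
    "Logical Reasoning": (0.789, 0.801, 0.736),
    "Common Sense": (0.716, 0.702, 0.700),
    "Reading Comprehension": (0.671, 0.685, 0.644),
    "Question Answering": (0.582, 0.599, 0.819),
    "Text Classification": (0.803, 0.811, 0.792),
    "Sentiment Analysis": (0.777, 0.781, 0.607),
    "Code Generation": (0.615, 0.631, 0.828),
    "Creative Writing": (0.588, 0.579, 0.758),
    "Dialogue Generation": (0.621, 0.635, 0.767),
    "Summarization": (0.745, 0.755, 0.804),
    "Translation": (0.782, 0.799, 0.804),
    "Knowledge Retrieval": (0.651, 0.668, 0.610),
    "Instruction Following": (0.733, 0.749, 0.758),
    "Safety Evaluation": (0.718, 0.701, 0.739),
}


def compare(scores):
    """Split benchmarks by whether MultiLangModel beats the best earlier model.

    Returns two lists of (benchmark, delta) pairs, where delta is the
    MultiLangModel score minus the higher of the two earlier scores.
    """
    leads, trails = [], []
    for name, (v1, v2, current) in scores.items():
        delta = round(current - max(v1, v2), 3)
        (leads if delta > 0 else trails).append((name, delta))
    return leads, trails


leads, trails = compare(SCORES)
print(f"MultiLangModel leads on {len(leads)} of {len(SCORES)} benchmarks")
# List the regressions, largest drop first.
for name, delta in sorted(trails, key=lambda item: item[1]):
    print(f"  trails on {name}: {delta:+.3f}")
```

Comparing against the better of the two earlier models is a deliberately strict choice; comparing against MLModel-v2 alone would be an equally reasonable baseline.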
3. License
4. Contact
Open an issue on GitHub.