MultiLangModel

1. Introduction

MultiLangModel excels at translation and multilingual tasks. This checkpoint was selected because it achieved the best score on the translation benchmark.
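Selecting a checkpoint by a single benchmark score can be sketched as follows. This is a minimal illustration, not the procedure actually used for this model: the function name `select_checkpoint`, the checkpoint names, and all scores in `example_results` are hypothetical placeholders.

```python
def select_checkpoint(results, metric="Translation"):
    """Return the name of the checkpoint with the highest score on `metric`.

    `results` maps a checkpoint name to a dict of {benchmark name: score}.
    """
    return max(results, key=lambda name: results[name][metric])


# Placeholder values for illustration only; these are NOT measurements
# of MultiLangModel or of any real checkpoint.
example_results = {
    "checkpoint-a": {"Translation": 0.70, "Summarization": 0.80},
    "checkpoint-b": {"Translation": 0.75, "Summarization": 0.72},
}

best = select_checkpoint(example_results)
print(best)  # checkpoint-b
```

Note that a checkpoint chosen this way is only guaranteed to be best on the selection metric; its scores on other benchmarks may be lower than those of the alternatives.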

2. Evaluation Results

Comprehensive Benchmark Results

| Category | Benchmark | MLModel-v1 | MLModel-v2 | MultiLangModel |
| --- | --- | --- | --- | --- |
| Core Reasoning Tasks | Math Reasoning | 0.510 | 0.535 | 0.550 |
| | Logical Reasoning | 0.789 | 0.801 | 0.736 |
| | Common Sense | 0.716 | 0.702 | 0.700 |
| Language Understanding | Reading Comprehension | 0.671 | 0.685 | 0.644 |
| | Question Answering | 0.582 | 0.599 | 0.819 |
| | Text Classification | 0.803 | 0.811 | 0.792 |
| | Sentiment Analysis | 0.777 | 0.781 | 0.607 |
| Generation Tasks | Code Generation | 0.615 | 0.631 | 0.828 |
| | Creative Writing | 0.588 | 0.579 | 0.758 |
| | Dialogue Generation | 0.621 | 0.635 | 0.767 |
| | Summarization | 0.745 | 0.755 | 0.804 |
| Specialized Capabilities | Translation | 0.782 | 0.799 | 0.804 |
| | Knowledge Retrieval | 0.651 | 0.668 | 0.610 |
| | Instruction Following | 0.733 | 0.749 | 0.758 |
| | Safety Evaluation | 0.718 | 0.701 | 0.739 |

Overall Performance Summary

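MultiLangModel achieves the highest Translation score of the three models (0.804, versus 0.799 for MLModel-v2 and 0.782 for MLModel-v1) and leads on 9 of the 15 benchmarks. It trails both baselines on Logical Reasoning, Common Sense, Reading Comprehension, Text Classification, Sentiment Analysis, and Knowledge Retrieval.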
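The comparison against the two baselines can be checked directly from the results in Section 2. The sketch below copies the scores from that table and counts the benchmarks on which MultiLangModel scores strictly higher than both MLModel-v1 and MLModel-v2. The names `SCORES` and `leads_and_trails` are introduced here for illustration only.

```python
# Scores copied from the benchmark table, in the order
# (MLModel-v1, MLModel-v2, MultiLangModel).
SCORES = {
    "Math Reasoning": (0.510, 0.535, 0.550),
    "Logical Reasoning": (0.789, 0.801, 0.736),
    "Common Sense": (0.716, 0.702, 0.700),
    "Reading Comprehension": (0.671, 0.685, 0.644),
    "Question Answering": (0.582, 0.599, 0.819),
    "Text Classification": (0.803, 0.811, 0.792),
    "Sentiment Analysis": (0.777, 0.781, 0.607),
    "Code Generation": (0.615, 0.631, 0.828),
    "Creative Writing": (0.588, 0.579, 0.758),
    "Dialogue Generation": (0.621, 0.635, 0.767),
    "Summarization": (0.745, 0.755, 0.804),
    "Translation": (0.782, 0.799, 0.804),
    "Knowledge Retrieval": (0.651, 0.668, 0.610),
    "Instruction Following": (0.733, 0.749, 0.758),
    "Safety Evaluation": (0.718, 0.701, 0.739),
}


def leads_and_trails(scores):
    """Split benchmarks by whether MultiLangModel beats both baselines."""
    leads, trails = [], []
    for name, (v1, v2, ours) in scores.items():
        if ours > max(v1, v2):
            leads.append(name)
        else:
            trails.append(name)
    return leads, trails


leads, trails = leads_and_trails(SCORES)
print(f"Leads on {len(leads)} of {len(SCORES)} benchmarks")
print("Trails on:", ", ".join(trails))
```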

3. License

This model is released under the MIT License.

4. Contact

Open an issue on GitHub.
