some updates
README.md
CHANGED
@@ -67,22 +67,12 @@ model-index:
 
 ## 🏆 UzLiB Benchmark Performance
 
-NeuronAI-Uzbek achieves exceptional performance on the [UzLiB Benchmark](https://github.com/tahrirchi/uzlib), the comprehensive evaluation suite for Uzbek language understanding.
+NeuronAI-Uzbek achieves exceptional performance on the [UzLiB Benchmark](https://github.com/tahrirchi/uzlib/blob/main/LEADERBOARD.md), the comprehensive evaluation suite for Uzbek language understanding.
 
 ### Leaderboard Position
 
-
-
-| 1 | Gemini 3 Pro Preview | Google | 0.826 | 0.822 | 0.864 | 0.875 | 0.731 |
-| 2 | Gemini 3 Flash Preview | Google | 0.795 | 0.794 | 0.852 | 0.708 | 0.692 |
-| 3 | Gemini 2.5 Pro | Google | 0.691 | 0.680 | 0.763 | 0.778 | 0.558 |
-| **4** | **NeuronAI-Uzbek (4B)** | **NeuronAI** | **0.662** | **0.718** | **0.466** | **0.333** | **0.385** |
-| 5 | Claude 3.7 Sonnet | Anthropic | 0.651 | 0.643 | 0.725 | 0.708 | 0.481 |
-| 6 | Claude 3.5 Sonnet | Anthropic | 0.636 | 0.644 | 0.598 | 0.722 | 0.462 |
-| 7 | GPT-4o | OpenAI | 0.632 | 0.638 | 0.606 | 0.653 | 0.558 |
-| 8 | Gemini 2.5 Flash | Google | 0.626 | 0.641 | 0.555 | 0.639 | 0.481 |
-| 9 | GPT-5 | OpenAI | 0.616 | 0.632 | 0.576 | 0.542 | 0.423 |
-| - | Human Voters* | - | 0.589 | 0.605 | 0.525 | 0.525 | 0.509 |
+[](https://github.com/tahrirchi/uzlib/blob/main/LEADERBOARD.md)
+
 
 > **Note**: NeuronAI-Uzbek is the **smallest model** in the top 10, with only **4B parameters**, while competing against models with 100B+ parameters.
 