language:
- arz
datasets:
- MBZUAI-Paris/Egyptian-SFT-Mixture
base_model:
- google/gemma-3-4b-pt
---
Then, copy the snippet from the section below.

#### Running with the `pipeline` API

```python
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="MBZUAI-Paris/Nile-Chat-4B",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device="cuda"  # replace with "mps" to run on a Mac device
)
```

Q1:
```python
messages = [
    {"role": "user", "content": 'اسمك ايه؟'},
]
outputs = pipe(messages, max_new_tokens=256)
assistant_response = outputs[0]["generated_text"][-1]["content"].strip()
print(assistant_response)
```

A1:

>اسمي نايل-شات، على اسم نهر النيل، اطول نهر في العالم، اللي من زمان كان عامل مهم في تطور مصر، وبيساعد في معيشة الناس وأثر على التراث والثقافة بتاعتنا. وعشان انا موديل لغة، الباحثين بتوع جامعة محمد بن زايد للذكاء الاصطناعي دربوني باستخدام مجموعة من المصادر المفتوحة، فدي حاجة خلتني مميز.

Q2:
```python
messages = [
    {"role": "user", "content": 'Esmak eh?'},
]
outputs = pipe(messages, max_new_tokens=256)
assistant_response = outputs[0]["generated_text"][-1]["content"].strip()
print(assistant_response)
```

A2:

>Esmi Nile-Chat, 3ala esm nahr el-nil, atwal nahr fel 3alam, elli men zaman kan 3amel mohemm fi tatwor masr, w bir3a el nas, w tb3an el torath

## Training Data
Nile-Chat models were trained on diverse datasets focusing on the Egyptian dialect, consisting of approximately 3.3B tokens during the continual pre-training phase, 1.9M instructions during instruction fine-tuning, and 0.2M samples for DPO, with a maximum length of 2,048 tokens, including:

* Instruction samples created from publicly available Egyptian Arabic datasets, including translation and transliteration.
* Translated English and multilingual pretraining and instruction-tuning datasets using Claude 3.5 Sonnet (v2).

The dataset covers both Egyptian Arabic and Latin scripts. Our instruction-tuning dataset [Egyptian-SFT-Mixture](https://huggingface.co/datasets/MBZUAI-Paris/Egyptian-SFT-Mixture) is publicly available.

## Implementation Information

* **EgyptianWinoGrande:** An Egyptian version of the WinoGrande benchmark (in both Arabic and Latin scripts).
* **EgyptianRACE:** An Egyptian version of the RACE benchmark (in both Arabic and Latin scripts).
* **EgyptianOpenBookQA:** An Egyptian version of the OpenBookQA benchmark.
* **EgyptianAlpacaEval:** An Egyptian adaptation of AlpacaEval to assess LLM instruction-following and cultural alignment.

The models were compared against a collection of existing open-source Arabic models to gauge their effectiveness, with a particular focus on performance in Egyptian Arabic. All scores are based on zero-shot performance, and the prompts are written mainly in Egyptian Arabic. We used the [Language Model Evaluation Harness](https://github.com/MBZUAI-Paris/lm-evaluation-harness-nile-chat) to conduct these evaluations. All evaluations apply the chat template, except for EgyptianWinoGrande.

## Benchmarks

### Arabic Script Benchmarks
<table>
<thead>
<tr>
<th>Model</th>
<th>Average</th>
<th><a href="https://huggingface.co/datasets/MBZUAI-Paris/EgyptianMMLU_dev" target="_blank">EgyptianMMLU</a></th>
<th><a href="https://huggingface.co/datasets/facebook/belebele/viewer/ary_Arab" target="_blank">Belebele Arz</a></th>
<th><a href="https://huggingface.co/datasets/MBZUAI-Paris/EgyptianHellaSwag" target="_blank">EgyptianHellaSwag</a></th>
<th><a href="https://huggingface.co/datasets/MBZUAI-Paris/EgyptianPIQA" target="_blank">EgyptianPIQA</a></th>
<th><a href="https://huggingface.co/datasets/MBZUAI-Paris/EgyptianWinoGrande" target="_blank">EgyptianWinoGrande</a></th>
<th><a href="https://huggingface.co/datasets/MBZUAI-Paris/EgyptianOpenBookQA" target="_blank">EgyptianOpenBookQA</a></th>
<th><a href="https://huggingface.co/datasets/MBZUAI-Paris/EgyptianRACE" target="_blank">EgyptianRACE High</a></th>
<th><a href="https://huggingface.co/datasets/MBZUAI-Paris/EgyptianRACE" target="_blank">EgyptianRACE Middle</a></th>
<th><a href="https://huggingface.co/datasets/MBZUAI-Paris/EgyptianAlpacaEval" target="_blank">EgyptianAlpacaEval</a></th>
</tr>
</thead>
<tbody>
<tr>
<td><a href="https://huggingface.co/google/gemma-3-4b-it" target="_blank">gemma-3-4b-it</a></td>
<td>48.76</td>
<td>46.08</td><td>38.56</td><td>42.56</td><td>60.32</td><td>56.49</td><td>35.79</td><td>33.68</td><td>40.06</td><td>85.30</td>
</tr>
<tr>
<td><a href="https://huggingface.co/inceptionai/jais-family-6p7b-chat" target="_blank">jais-family-6p7b-chat</a></td>
<td>46.64</td>
<td>42.60</td><td>57.33</td><td>49.18</td><td>62.23</td><td>57.04</td><td>33.33</td><td>34.72</td><td>37.50</td><td>45.86</td>
</tr>
<tr>
<td><a href="https://huggingface.co/inceptionai/jais-adapted-7b-chat" target="_blank">jais-adapted-7b-chat</a></td>
<td>42.18</td>
<td>40.96</td><td>55.67</td><td>40.85</td><td>56.50</td><td>54.35</td><td>32.89</td><td>34.62</td><td>42.33</td><td>21.45</td>
</tr>
<tr>
<td><a href="https://huggingface.co/Qwen/Qwen2.5-7B-Instruct" target="_blank">Qwen2.5-7B-Instruct</a></td>
<td>49.40</td>
<td>45.74</td><td>64.22</td><td>45.47</td><td>58.02</td><td>56.41</td><td>38.70</td><td>35.45</td><td>41.76</td><td>58.80</td>
</tr>
<tr>
<td><a href="https://huggingface.co/ALLaM-AI/ALLaM-7B-Instruct-preview" target="_blank">ALLaM-7B-Instruct-preview</a></td>
<td>56.40</td>
<td>60.08</td><td>67.67</td><td>57.29</td><td>66.10</td><td>62.18</td><td>40.04</td><td>39.50</td><td>45.17</td><td>69.55</td>
</tr>
<tr>
<td><a href="https://huggingface.co/CohereLabs/c4ai-command-r7b-arabic-02-2025" target="_blank">c4ai-command-r7b-arabic-02-2025</a></td>
<td>53.36</td>
<td>50.97</td><td>70.67</td><td>50.39</td><td>61.84</td><td>57.20</td><td>36.91</td><td>41.89</td><td>46.02</td><td>73.36</td>
</tr>
<tr>
<td><a href="https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct" target="_blank">Llama-3.1-8B-Instruct</a></td>
<td>46.31</td>
<td>42.88</td><td>55.89</td><td>43.10</td><td>57.97</td><td>54.27</td><td>35.57</td><td>34.41</td><td>40.34</td><td>52.35</td>
</tr>
<tr>
<td><a href="https://huggingface.co/FreedomIntelligence/AceGPT-v2-8B-chat" target="_blank">AceGPT-v2-8b-chat</a></td>
<td>58.33</td>
<td>55.25</td><td>73.33</td><td>53.14</td><td>62.50</td><td>58.39</td><td>39.82</td><td>41.06</td><td>47.16</td><td>93.33</td>
</tr>
<tr>
<td><a href="https://huggingface.co/google/gemma-2-9b-it" target="_blank">gemma-2-9b-it</a></td>
<td>53.17</td>
<td>50.72</td><td>49.44</td><td>49.53</td><td>61.35</td><td>61.79</td><td>35.79</td><td>40.23</td><td>48.01</td><td>81.66</td>
</tr>
<tr>
<td><a href="https://huggingface.co/google/gemma-3-12b-it" target="_blank">gemma-3-12b-it</a></td>
<td>59.70</td>
<td>61.55</td><td>77.00</td><td>49.49</td><td>64.96</td><td>63.53</td><td>38.03</td><td>41.27</td><td>48.86</td><td>92.61</td>
</tr>
<tr>
<td><a href="https://huggingface.co/inceptionai/jais-family-13b-chat" target="_blank">jais-family-13b-chat</a></td>
<td>49.81</td>
<td>44.85</td><td>66.33</td><td>52.99</td><td>64.85</td><td>57.91</td><td>36.91</td><td>33.26</td><td>38.64</td><td>52.52</td>
</tr>
<tr>
<td><a href="https://huggingface.co/inceptionai/jais-adapted-13b-chat" target="_blank">jais-adapted-13b-chat</a></td>
<td>49.80</td>
<td>50.03</td><td>65.33</td><td>47.53</td><td>61.30</td><td>56.72</td><td>37.14</td><td>35.45</td><td>41.76</td><td>52.91</td>
</tr>
<tr>
<td><a href="https://huggingface.co/Qwen/Qwen2.5-14B-Instruct" target="_blank">Qwen2.5-14B-Instruct</a></td>
<td>57.34</td>
<td>60.81</td><td>72.33</td><td>55.84</td><td>63.97</td><td>59.97</td><td>38.26</td><td>43.25</td><td>50.28</td><td>71.35</td>
</tr>
<tr style="border-top: 4px solid;"></tr>
<tr>
<td><a href="https://huggingface.co/MBZUAI-Paris/Nile-Chat-4B" target="_blank"><strong>Nile-Chat-4B</strong></a></td>
<td>57.85</td>
<td>50.25</td><td>68.56</td><td>55.92</td><td>67.30</td><td>61.87</td><td>40.94</td><td>42.10</td><td>46.02</td><td>87.65</td>
</tr>
<tr>
<td><a href="https://huggingface.co/MBZUAI-Paris/Nile-Chat-12B" target="_blank"><strong>Nile-Chat-12B</strong></a></td>
<td>64.11</td>
<td>62.59</td><td>79.44</td><td>64.04</td><td>70.69</td><td>63.53</td><td>42.06</td><td>48.02</td><td>53.13</td><td>93.50</td>
</tr>
</tbody>
</table>
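The Average column above is not reported by the harness itself; it reads as the unweighted mean of the per-benchmark scores in each row. A minimal spot-check (assuming equal weighting) against the Nile-Chat-12B row:

```python
# Spot-check the Average column: unweighted mean of the nine
# per-benchmark scores in the Nile-Chat-12B (Arabic script) row.
nile_chat_12b = [62.59, 79.44, 64.04, 70.69, 63.53, 42.06, 48.02, 53.13, 93.50]
average = round(sum(nile_chat_12b) / len(nile_chat_12b), 2)
print(average)  # 64.11, matching the table
```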
### Latin Script Benchmarks
<table>
<thead>
<tr>
<th>Model</th>
<th>Average</th>
<th><a href="https://huggingface.co/datasets/MBZUAI-Paris/EgyptianHellaSwag" target="_blank">EgyptianHellaSwag</a></th>
<th><a href="https://huggingface.co/datasets/MBZUAI-Paris/EgyptianPIQA" target="_blank">EgyptianPIQA</a></th>
<th><a href="https://huggingface.co/datasets/MBZUAI-Paris/EgyptianWinoGrande" target="_blank">EgyptianWinoGrande</a></th>
<th><a href="https://huggingface.co/datasets/MBZUAI-Paris/EgyptianRACE" target="_blank">EgyptianRACE High</a></th>
<th><a href="https://huggingface.co/datasets/MBZUAI-Paris/EgyptianRACE" target="_blank">EgyptianRACE Middle</a></th>
</tr>
</thead>
<tbody>
<tr>
<td><a href="https://huggingface.co/google/gemma-3-4b-it" target="_blank">gemma-3-4b-it</a></td>
<td>36.93</td>
<td>30.90</td><td>52.76</td><td>48.57</td><td>25.47</td><td>26.94</td>
</tr>
<tr>
<td><a href="https://huggingface.co/inceptionai/jais-family-6p7b-chat" target="_blank">jais-family-6p7b-chat</a></td>
<td>37.58</td>
<td>30.27</td><td>53.25</td><td>52.14</td><td>24.18</td><td>28.06</td>
</tr>
<tr>
<td><a href="https://huggingface.co/inceptionai/jais-adapted-7b-chat" target="_blank">jais-adapted-7b-chat</a></td>
<td>37.06</td>
<td>30.81</td><td>51.67</td><td>50.40</td><td>24.38</td><td>28.06</td>
</tr>
<tr>
<td><a href="https://huggingface.co/Qwen/Qwen2.5-7B-Instruct" target="_blank">Qwen2.5-7B-Instruct</a></td>
<td>36.87</td>
<td>30.51</td><td>51.88</td><td>50.95</td><td>24.88</td><td>26.11</td>
</tr>
<tr>
<td><a href="https://huggingface.co/ALLaM-AI/ALLaM-7B-Instruct-preview" target="_blank">ALLaM-7B-Instruct-preview</a></td>
<td>38.58</td>
<td>32.17</td><td>53.09</td><td>50.63</td><td>25.07</td><td>31.94</td>
</tr>
<tr>
<td><a href="https://huggingface.co/CohereLabs/c4ai-command-r7b-arabic-02-2025" target="_blank">c4ai-command-r7b-arabic-02-2025</a></td>
<td>37.38</td>
<td>30.88</td><td>52.32</td><td>51.43</td><td>25.07</td><td>27.22</td>
</tr>
<tr>
<td><a href="https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct" target="_blank">Llama-3.1-8B-Instruct</a></td>
<td>37.62</td>
<td>31.77</td><td>53.30</td><td>50.24</td><td>24.48</td><td>28.33</td>
</tr>
<tr>
<td><a href="https://huggingface.co/FreedomIntelligence/AceGPT-v2-8B-chat" target="_blank">AceGPT-v2-8b-chat</a></td>
<td>38.77</td>
<td>33.16</td><td>53.80</td><td>50.24</td><td>26.07</td><td>30.56</td>
</tr>
<tr>
<td><a href="https://huggingface.co/google/gemma-2-9b-it" target="_blank">gemma-2-9b-it</a></td>
<td>38.70</td>
<td>33.75</td><td>53.69</td><td>50.79</td><td>26.66</td><td>28.61</td>
</tr>
<tr>
<td><a href="https://huggingface.co/google/gemma-3-12b-it" target="_blank">gemma-3-12b-it</a></td>
<td>41.63</td>
<td>37.52</td><td>53.14</td><td>51.19</td><td>31.02</td><td>35.28</td>
</tr>
<tr>
<td><a href="https://huggingface.co/inceptionai/jais-family-13b-chat" target="_blank">jais-family-13b-chat</a></td>
<td>36.96</td>
<td>30.46</td><td>53.09</td><td>48.18</td><td>25.28</td><td>27.78</td>
</tr>
<tr>
<td><a href="https://huggingface.co/inceptionai/jais-adapted-13b-chat" target="_blank">jais-adapted-13b-chat</a></td>
<td>36.98</td>
<td>31.14</td><td>52.87</td><td>50.79</td><td>23.98</td><td>26.11</td>
</tr>
<tr>
<td><a href="https://huggingface.co/Qwen/Qwen2.5-14B-Instruct" target="_blank">Qwen2.5-14B-Instruct</a></td>
<td>39.48</td>
<td>33.49</td><td>52.87</td><td>53.41</td><td>27.35</td><td>30.28</td>
</tr>
<tr style="border-top: 4px solid;"></tr>
<tr>
<td><a href="https://huggingface.co/MBZUAI-Paris/Nile-Chat-4B" target="_blank"><strong>Nile-Chat-4B</strong></a></td>
<td>51.38</td>
<td>50.55</td><td>65.32</td><td>60.62</td><td>37.36</td><td>43.06</td>
</tr>
<tr>
<td><a href="https://huggingface.co/MBZUAI-Paris/Nile-Chat-12B" target="_blank"><strong>Nile-Chat-12B</strong></a></td>
<td>53.88</td>
<td>53.71</td><td>65.10</td><td>59.98</td><td>41.72</td><td>48.89</td>
</tr>
</tbody>
</table>

|
| 304 |
<table>
|
| 305 |
<tr>
|
| 306 |
<td rowspan="2">Model</td>
|
|
|
|
| 505 |
|
| 506 |
</table>
|
| 507 |
|
| 508 |
+
|
| 509 |
## Usage and Limitations
|
| 510 |
|
| 511 |
These models have certain limitations that users should be aware of.
|
|
|
|
| 592 |
(Personally Identifiable Information). Developers are encouraged to adhere to
|
| 593 |
privacy regulations with privacy-preserving techniques.
|
| 594 |
|
| 595 |
+
</details>
|