---
license: cc-by-nc-4.0
datasets:
- hammh0a/Hala-4.6M-SFT
language:
- ar
base_model:
- QCRI/Fanar-1-9B-Instruct
pipeline_tag: text-generation
---

# Hala: Arabic‑Centric Instruction & Translation Models

<p align="center">
  <img src="https://i.ibb.co/pvhp1XfJ/halalogo.png" alt="Hala logo" width="550" />
</p>

**Paper**: *Hala Technical Report: Building Arabic‑Centric Instruction & Translation Models at Scale*

**Authors**: Hasan Abed Al Kader Hammoud\*, Mohammad Zbeeb\*, Bernard Ghanem

**Affiliation**: King Abdullah University of Science and Technology (KAUST)

\*Equal contribution

> In Arabic, **حلا** (Hala) conveys sweetness and beauty—qualities long associated with the language itself. In this spirit, we call our models **Hala**.

---

## 🔗 Quick Links

* **Models & Data (Hugging Face collection)**: [https://huggingface.co/collections/hammh0a/hala-68bf02b34a14b9f22305ab3a](https://huggingface.co/collections/hammh0a/hala-68bf02b34a14b9f22305ab3a)
* **Contact**: [hasanabedalkader.hammoud@kaust.edu.sa](mailto:hasanabedalkader.hammoud@kaust.edu.sa)

---

## Example

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "hammh0a/Hala-9B"  # pick a released Hala model

tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

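# Build the prompt with the model's own chat template
messages = [
    # System: "You are an expert assistant in physics."
    {"role": "system", "content": "أنت مساعد خبير في الفيزياء."},
    # User: "Briefly explain the principle of conservation in physics, and give me an everyday example."
    {"role": "user", "content": "اشرح بإيجاز مبدأ الانحفاظ في الفيزياء، وأعطني مثالاً يومياً."},
]

# tokenize=False returns the formatted prompt as a string;
# add_generation_prompt=True appends the assistant-turn marker.
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

pipe = pipeline("text-generation", model=model, tokenizer=tok)
# Greedy decoding; return_full_text=False drops the prompt from the output.
out = pipe(prompt, max_new_tokens=256, do_sample=False, return_full_text=False)

print(out[0]["generated_text"])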
```

---

## 📊 Results

*Each Hala model is listed directly after the LiquidAI/LFM2 or QCRI/Fanar model of the same size; the best **Average** in each size category is in bold.*

### ≤2B parameters

| Size | Model Name                             | Params | AlGhafa | ArabicMMLU | EXAMS | MadinahQA | AraTrust | ArbMMLU‑HT |  Average |
| ---- | -------------------------------------- | -----: | ------: | ---------: | ----: | --------: | -------: | ---------: | -------: |
| ≤2B  | meta-llama/Llama-3.2-1B                |     1B |    33.9 |       26.5 |  21.2 |      25.7 |     37.1 |       23.9 |     28.0 |
| ≤2B  | Qwen/Qwen2-1.5B-Instruct               |   1.5B |    53.1 |       49.2 |  35.2 |      45.5 |     68.9 |       37.4 |     48.2 |
| ≤2B  | Qwen/Qwen2.5-1.5B-Instruct             |   1.5B |    48.4 |       43.5 |  31.8 |      38.2 |     70.8 |       35.9 |     44.8 |
| ≤2B  | Sakalti/Saka-1.5B                      |   1.5B |    51.4 |       40.0 |  31.3 |      31.5 |     47.5 |       33.5 |     39.2 |
| ≤2B  | Qwen/Qwen3-1.7B-Base                   |   1.7B |    56.8 |       49.7 |  38.2 |      40.0 |     75.6 |       43.9 |     50.7 |
| ≤2B  | Qwen/Qwen1.5-1.8B                      |   1.8B |    32.7 |       26.7 |  23.8 |      26.0 |     31.5 |       23.6 |     27.4 |
| ≤2B  | silma-ai/SILMA-Kashif-2B-Instruct-v1.0 |     2B |    59.7 |       45.6 |  33.1 |      38.8 |     73.3 |       35.8 |     47.7 |
| ≤2B  | google/gemma-2-2b-it                   |     2B |    34.1 |       30.1 |  23.6 |      20.1 |     31.2 |       23.4 |     27.1 |
| ≤2B  | LiquidAI/LFM2-350M                     |   350M |    39.0 |       35.2 |  30.9 |      28.3 |     43.3 |       29.1 |     34.3 |
| ≤2B  | **Hala‑350M**                          |   350M |    51.4 |       41.2 |  36.9 |      34.5 |     52.1 |       35.4 |     41.9 |
| ≤2B  | LiquidAI/LFM2-700M                     |   700M |    50.1 |       38.3 |  34.3 |      32.5 |     56.3 |       37.2 |     41.4 |
| ≤2B  | **Hala‑700M**                          |   700M |    55.5 |       45.9 |  40.6 |      34.7 |     65.2 |       39.4 |     46.9 |
| ≤2B  | LiquidAI/LFM2-1.2B                     |   1.2B |    53.8 |       45.2 |  35.0 |      34.7 |     65.6 |       43.4 |     46.3 |
| ≤2B  | **Hala‑1.2B**                          |   1.2B |    59.2 |       48.6 |  43.4 |      41.6 |     71.7 |       44.2 | **51.4** |

### 7B–9B parameters

| Size  | Model Name                                  | Params | AlGhafa | ArabicMMLU | EXAMS | MadinahQA | AraTrust | ArbMMLU‑HT |  Average |
| ----- | ------------------------------------------- | -----: | ------: | ---------: | ----: | --------: | -------: | ---------: | -------: |
| 7B–9B | CohereForAI/c4ai-command-r7b-arabic-02-2025 |     7B |    74.8 |       59.3 |  65.0 |      63.8 |     80.5 |       50.1 |     65.6 |
| 7B–9B | JasperV13/Yehia-7B-DPO-Reasoning-preview    |     7B |    75.1 |       66.3 |  51.8 |      54.9 |     81.9 |       55.1 |     64.2 |
| 7B–9B | Navid-AI/Yehia-7B-preview                   |     7B |    70.8 |       64.9 |  52.1 |      54.4 |     87.5 |       53.4 |     63.9 |
| 7B–9B | JasperV13/Yehia-7B-Reasoning-preview        |     7B |    75.2 |       66.3 |  52.7 |      55.0 |     80.8 |       55.2 |     64.2 |
| 7B–9B | ALLaM-AI/ALLaM-7B-Instruct-preview          |     7B |    69.5 |       64.9 |  51.6 |      54.2 |     86.9 |       52.8 |     63.3 |
| 7B–9B | Qwen/Qwen2-7B-Instruct                      |     7B |    73.2 |       60.0 |  47.3 |      59.5 |     82.8 |       51.3 |     62.4 |
| 7B–9B | Qwen/Qwen3-8B-Base                          |     8B |    74.8 |       65.0 |  52.5 |      52.2 |     83.4 |       61.5 |     64.9 |
| 7B–9B | QCRI/Fanar-1-9B-Instruct                    |     9B |    76.4 |       65.8 |  52.7 |      73.3 |     88.3 |       58.6 |     69.2 |
| 7B–9B | **Hala‑9B**                                 |     9B |    78.3 |       65.6 |  53.8 |      70.4 |     89.6 |       61.4 | **69.9** |

> **Evaluation protocol**: `lighteval` on **ArabicMMLU (OALL‑2)** excluding AlRage.

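The **Average** column appears to be the unweighted mean of the six benchmark scores in each row. This is inferred from the numbers in the tables and is not stated in the card, so treat the sketch below as a sanity check under that assumption, not as the evaluation code. The helper name `mean_score` is hypothetical; the scores are copied from two rows of the 7B–9B table above.

```python
# Scores copied from the 7B–9B table, in column order:
# AlGhafa, ArabicMMLU, EXAMS, MadinahQA, AraTrust, ArbMMLU-HT.
# Each entry pairs the six scores with the Average reported in the table.
rows = {
    "QCRI/Fanar-1-9B-Instruct": ([76.4, 65.8, 52.7, 73.3, 88.3, 58.6], 69.2),
    "Hala-9B": ([78.3, 65.6, 53.8, 70.4, 89.6, 61.4], 69.9),
}


def mean_score(scores):
    """Unweighted mean of per-benchmark scores (assumed aggregation)."""
    return sum(scores) / len(scores)


for name, (scores, reported) in rows.items():
    computed = mean_score(scores)
    # Published figures are rounded to one decimal, so compare with a tolerance.
    matches = abs(computed - reported) < 0.1
    print(f"{name}: computed={computed:.2f} reported={reported} match={matches}")
```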
---

## 📚 Citation

If you find **Hala** useful, please cite:

```bibtex
@misc{hammoud2025halatechnicalreportbuilding,
      title={Hala Technical Report: Building Arabic-Centric Instruction \& Translation Models at Scale},
      author={Hasan Abed Al Kader Hammoud and Mohammad Zbeeb and Bernard Ghanem},
      year={2025},
      url={https://arxiv.org/abs/2509.14008}, 
}
```