---
library_name: model2vec
license: mit
model_name: Abdelkareem/zarra
tags:
- embeddings
- static-embeddings
- sentence-transformers
datasets:
- allenai/c4
language:
- ar
base_model:
- jinaai/jina-embeddings-v3
pipeline_tag: sentence-similarity
---

# Zarra: Arabic Static Embedding Model



![image/png](https://cdn-uploads.huggingface.co/production/uploads/628f7a71dd993507cfcbe587/t4ALUMHL25wTuNzgNQwUg.png)


**Zarra** is a static embedding model built with the Model2Vec distillation framework.
It is distilled from a Sentence Transformer (`jinaai/jina-embeddings-v3`) and optimized specifically for Arabic.
Unlike traditional transformer-based models, Zarra produces static embeddings, enabling ultra-fast inference on both CPU and GPU, which makes it ideal for resource-constrained environments and real-time applications.

## Why Zarra?
⚡ Exceptional Speed: Delivers embeddings up to 500x faster than sentence transformers.

🧠 Compact & Efficient: Up to 50x smaller in size, allowing easy deployment on edge devices.

🧰 Versatile: Well-suited for search, clustering, classification, deduplication, and more.

🌍 Arabic-First: Specifically trained on high-quality Arabic data, ensuring relevance and performance across a range of Arabic NLP tasks.


<p align="center">
  <img src="https://cdn-uploads.huggingface.co/production/uploads/628f7a71dd993507cfcbe587/3JEnPfgF2BfbN5H81K0XD.png" alt="Speed vs Performance Chart" width="700"/>
</p>


## About Model2Vec


The Model2Vec distillation technique transfers knowledge from large transformer models into lightweight static embedding spaces, preserving semantic quality while dramatically improving speed and efficiency.
Zarra represents the best of both worlds: the semantic power of transformers and the speed and simplicity of static vectors.

## Installation

Install model2vec using pip:
```bash
pip install model2vec
```

## Usage

### Using Model2Vec

The [Model2Vec library](https://github.com/MinishLab/model2vec) is the fastest and most lightweight way to run Model2Vec models.

Load this model using the `from_pretrained` method:
```python
from model2vec import StaticModel

# Load a pretrained Model2Vec model
model = StaticModel.from_pretrained("NAMAA-Space/zarra")

# Compute text embeddings
embeddings = model.encode(["Example sentence"])
```

### Using Sentence Transformers

You can also use the [Sentence Transformers library](https://github.com/UKPLab/sentence-transformers) to load and use the model:

```python
from sentence_transformers import SentenceTransformer

# Load a pretrained Sentence Transformer model
model = SentenceTransformer("NAMAA-Space/zarra")

# Compute text embeddings
embeddings = model.encode(["Example sentence"])
```
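Because Zarra is tagged for sentence similarity, a common next step is comparing the embeddings that `model.encode` returns. The sketch below uses plain NumPy with placeholder vectors standing in for real model output, so it stays self-contained:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# In practice these vectors would come from model.encode([...]);
# placeholder values keep the example runnable without the model.
emb_a = np.array([0.1, 0.3, 0.5])
emb_b = np.array([0.1, 0.3, 0.4])

score = cosine_similarity(emb_a, emb_b)  # close to 1.0 for similar vectors
```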

## How it Works

Model2vec creates a small, fast, and powerful model that outperforms other static embedding models by a large margin on all tasks we could find, while being much faster to create than traditional static embedding models such as GloVe. Best of all, you don't need any data to distill a model using Model2Vec.

It works by passing a vocabulary through a sentence transformer model, then reducing the dimensionality of the resulting embeddings using PCA, and finally weighting the embeddings using [SIF weighting](https://openreview.net/pdf?id=SyK00v5xx). During inference, we simply take the mean of all token embeddings occurring in a sentence.
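The pipeline above can be sketched in a few lines: SIF weights are baked into the stored token vectors at distillation time, so inference is just a plain mean over the tokens in a sentence. The embedding table, token frequencies, and smoothing constant below are made-up placeholders, not values from the actual model:

```python
import numpy as np

# Hypothetical distilled token embeddings (post-PCA); placeholder values.
raw_table = {
    "the": np.array([0.2, 0.1]),
    "cat": np.array([0.9, 0.4]),
    "sat": np.array([0.5, 0.7]),
}
# Hypothetical corpus token frequencies used for SIF weighting.
token_freq = {"the": 0.05, "cat": 0.001, "sat": 0.002}
A = 1e-3  # SIF smoothing constant (assumed value)

# Distillation time: bake the SIF weight a/(a + freq) into each stored
# vector, down-weighting frequent (less informative) tokens.
table = {t: v * (A / (A + token_freq[t])) for t, v in raw_table.items()}

def encode(tokens):
    """Inference: plain mean of the pre-weighted token embeddings."""
    return np.mean([table[t] for t in tokens], axis=0)

sentence_vec = encode(["the", "cat", "sat"])
```

Note how the frequent token "the" ends up with a much smaller stored vector than the rare token "cat", so it contributes little to the sentence mean.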


## Benchmark on Arabic


### Speed

| Model                                 | Speed (sentences/second) | Device |
|---------------------------------------|--------------------------|--------|
| zarra                                 | 26893.63                 | cpu    |
| bojji                                 | 27478.15                 | cpu    |
| potion-multilingual-128M              | 27145.31                 | cpu    |
| paraphrase-multilingual-MiniLM-L12-v2 | 2363.24                  | cuda   |
| silma_ai_embedding_sts_v0.1           | 627.13                   | cuda   |
| muffakir_embedding                    | 621.77                   | cuda   |
| get_multilingual_base                 | 895.41                   | cuda   |
| arabic_retrieval_v1.0                 | 618.56                   | cuda   |
| arabic_triplet_matryoshka_v2          | 610.64                   | cuda   |

- Zarra and Bojji excel in speed, achieving 26893.63 and 27478.15 sentences per second on CPU, respectively, far surpassing CUDA-based models like arabic_triplet_matryoshka_v2 (610.64).

- Top Performer: Bojji is the fastest model, slightly ahead of Zarra and potion-multilingual-128M (27145.31), highlighting the efficiency of Model2Vec-based models on CPU.

- Key Observation: The high speed of Zarra and Bojji on CPU makes them ideal for resource-constrained environments, offering significant advantages over CUDA-dependent models.

### Size of the Model

| Model                            | Parameters (M) | Size (MB) | Relative to Largest (%) | Less than Largest (x) |
|----------------------------------|----------------|-----------|-------------------------|-----------------------|
| zarra                            | 64.00          | 244.14    | 41.92                   | 2.39                  |
| bojji                            | 124.88         | 476.40    | 81.79                   | 1.22                  |
| potion-multilingual-128M         | 128.09         | 488.63    | 83.89                   | 1.19                  |
| paraphrase-multilingual-MiniLM-L12-v2 | 117.65       | 448.82    | 77.06                   | 1.30                  |
| silma_ai_embedding_sts_v0.1      | 135.19         | 515.72    | 88.54                   | 1.13                  |
| muffakir_embedding               | 135.19         | 515.72    | 88.54                   | 1.13                  |
| arabic_retrieval_v1.0            | 135.19         | 515.73    | 88.54                   | 1.13                  |
| arabic_triplet_matryoshka_v2     | 135.19         | 515.72    | 88.54                   | 1.13                  |
| get_multilingual_base            | 305.37         | 582.45    | 100.00                  | 1.00                  |



- Zarra is the smallest model, with only 64 million parameters and 244.14 MB in size, making it 2.39 times smaller than the largest model (get_multilingual_base).

- Bojji is slightly larger at 124.88 million parameters and 476.40 MB, but still significantly smaller than most other models.

- Top Performer: Zarra leads in compactness, offering the smallest footprint, which is critical for deployment on resource-limited devices.

- Key Observation: The compact size of Zarra and Bojji aligns with their design goal of efficiency, making them highly suitable for edge computing and real-time applications.


### Overall Scores

| Model                                 | Avg   | MIRAC | MLQAR | Massi | Multi | STS17 | STS22 | XNLI_ |
|---------------------------------------|-------|-------|-------|-------|-------|-------|-------|-------|
| arabic_triplet_matryoshka_v2          | 0.6610 | 0.6262 | 0.5093 | 0.5577 | 0.5868 | 0.8531 | 0.6396 | 0.8542 |
| muffakir_embedding                    | 0.6494 | 0.6424 | 0.5267 | 0.5462 | 0.5943 | 0.8485 | 0.6291 | 0.7583 |
| arabic_retrieval_v1.0                 | 0.6473 | 0.6159 | 0.5674 | 0.5832 | 0.5993 | 0.8002 | 0.6254 | 0.7393 |
| gate_arabert-v1                       | 0.6444 | 0.5774 | 0.4808 | 0.5345 | 0.5847 | 0.8278 | 0.6310 | 0.8746 |
| get_multilingual_base                 | 0.6440 | 0.7177 | 0.5698 | 0.5071 | 0.5521 | 0.7881 | 0.6145 | 0.7584 |
| arabic_sts_matryoshka                 | 0.6413 | 0.5828 | 0.4840 | 0.5457 | 0.5494 | 0.8290 | 0.6242 | 0.8740 |
| silma_ai_embedding_sts_v0.1           | 0.6138 | 0.3799 | 0.5011 | 0.5600 | 0.5749 | 0.8559 | 0.6122 | 0.8125 |
| Arabic-MiniLM-L12-v2-all-nli-triplet  | 0.5431 | 0.2240 | 0.3612 | 0.4775 | 0.5698 | 0.8111 | 0.5540 | 0.8043 |
| paraphrase-multilingual-MiniLM-L12-v2 | 0.5208 | 0.2191 | 0.3496 | 0.4515 | 0.5573 | 0.7916 | 0.4908 | 0.7859 |
| bojji                                 | 0.5177 | 0.2941 | 0.3989 | 0.4667 | 0.5433 | 0.7233 | 0.5880 | 0.6094 |
| zarra                                 | 0.4822 | 0.2295 | 0.3473 | 0.4119 | 0.5237 | 0.6469 | 0.6218 | 0.5942 |
| potion-multilingual-128M              | 0.4699 | 0.1658 | 0.3150 | 0.4285 | 0.5338 | 0.6511 | 0.5951 | 0.5999 |
| all_minilm_l6_v2                      | 0.2843 | 0.0005 | 0.0064 | 0.1905 | 0.4934 | 0.5089 | 0.2518 | 0.5384 |

### Sorted by STS17_main (Score)

| Model Name                            | STS17_main |
|---------------------------------------|------------|
| silma_ai_embedding_sts_v0.1           | 0.856      |
| arabic_triplet_matryoshka_v2          | 0.853      |
| muffakir_embedding                    | 0.849      |
| arabic_sts_matryoshka                 | 0.829      |
| gate_arabert-v1                       | 0.828      |
| Arabic-MiniLM-L12-v2-all-nli-triplet  | 0.811      |
| arabic_retrieval_v1.0                 | 0.800      |
| paraphrase-multilingual-MiniLM-L12-v2 | 0.792      |
| get_multilingual_base                 | 0.788      |
| bojji                                 | 0.723      |
| potion-multilingual-128M              | 0.651      |
| zarra                                 | 0.647      |
| all_minilm_l6_v2                      | 0.509      |

### Sorted by STS22.v2_main (Score)

| Model Name                            | STS22.v2_main |
|---------------------------------------|---------------|
| arabic_triplet_matryoshka_v2          | 0.640         |
| gate_arabert-v1                       | 0.631         |
| muffakir_embedding                    | 0.629         |
| arabic_retrieval_v1.0                 | 0.625         |
| arabic_sts_matryoshka                 | 0.624         |
| zarra                                 | 0.622         |
| get_multilingual_base                 | 0.615         |
| silma_ai_embedding_sts_v0.1           | 0.612         |
| potion-multilingual-128M              | 0.595         |
| bojji                                 | 0.588         |
| Arabic-MiniLM-L12-v2-all-nli-triplet  | 0.554         |
| paraphrase-multilingual-MiniLM-L12-v2 | 0.491         |
| all_minilm_l6_v2                      | 0.252         |


## Additional Resources

- [Zarra & Bojji Blog](https://kareemai.com/blog/posts/minishlab/blog_zaraah.html)
- [NAMAA Collection](https://huggingface.co/collections/NAMAA-Space/zaraah-683f1f8a1eec1ec8f2badee5)
- [MinishLab](https://minishlab.github.io/)
- [Model2Vec Repo](https://github.com/MinishLab/model2vec)