Abdelkareem committed on
Commit 61d04f1 · verified · 1 Parent(s): f95920c

Update README.md

Files changed (1):
  1. README.md +91 -71
README.md CHANGED

@@ -25,8 +25,99 @@ It is designed for applications where computational resources are limited or whe
 Model2Vec models are the smallest, fastest, and most performant static embedders available.
 The distilled models can be up to 50 times smaller and 500 times faster than traditional Sentence Transformers.

+ ## Installation
+
+ Install model2vec using pip:
+ ```
+ pip install model2vec
+ ```
+
+ ## Usage
+
+ ### Using Model2Vec
+
+ The [Model2Vec library](https://github.com/MinishLab/model2vec) is the fastest and most lightweight way to run Model2Vec models.
+
+ Load this model using the `from_pretrained` method:
+ ```python
+ from model2vec import StaticModel
+
+ # Load a pretrained Model2Vec model
+ model = StaticModel.from_pretrained("NAMAA-Space/zarra")
+
+ # Compute text embeddings
+ embeddings = model.encode(["Example sentence"])
+ ```
+
+ ### Using Sentence Transformers
+
+ You can also use the [Sentence Transformers library](https://github.com/UKPLab/sentence-transformers) to load and use the model:
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Load a pretrained Sentence Transformer model
+ model = SentenceTransformer("NAMAA-Space/zarra")
+
+ # Compute text embeddings
+ embeddings = model.encode(["Example sentence"])
+ ```
+
+ ## How it Works
+
+ Model2Vec creates a small, fast, and powerful model that outperforms other static embedding models by a large margin on all tasks we could find, while being much faster to create than traditional static embedding models such as GloVe. Best of all, you don't need any data to distill a model using Model2Vec.
+
+ It works by passing a vocabulary through a sentence transformer model, then reducing the dimensionality of the resulting embeddings using PCA, and finally weighting the embeddings using [SIF weighting](https://openreview.net/pdf?id=SyK00v5xx). During inference, we simply take the mean of all token embeddings occurring in a sentence.
+
 ## Benchmark on Arabic

+ ## Speed
+
+ | Model | Speed (sentences/second) | Device |
+ |---------------------------------------|--------------------------|--------|
+ | zarra | 26893.63 | cpu |
+ | bojji | 27478.15 | cpu |
+ | potion-multilingual-128M | 27145.31 | cpu |
+ | paraphrase-multilingual-MiniLM-L12-v2 | 2363.24 | cuda |
+ | silma_ai_embedding_sts_v0.1 | 627.13 | cuda |
+ | muffakir_embedding | 621.77 | cuda |
+ | get_multilingual_base | 895.41 | cuda |
+ | arabic_retrieval_v1.0 | 618.56 | cuda |
+ | arabic_triplet_matryoshka_v2 | 610.64 | cuda |
+
+ - Zarra and Bojji excel in speed, achieving 26893.63 and 27478.15 sentences per second on CPU, respectively, far surpassing CUDA-based models such as arabic_triplet_matryoshka_v2 (610.64).
+ - Top performer: Bojji is the fastest model, slightly ahead of Zarra and potion-multilingual-128M (27145.31), highlighting the efficiency of Model2Vec-based models on CPU.
+ - Key observation: the high CPU speed of Zarra and Bojji makes them ideal for resource-constrained environments, offering significant advantages over CUDA-dependent models.
+
+ ## Size of the Model
+
+ | Model | Parameters (M) | Size (MB) | Relative to Largest (%) | Less than Largest (x) |
+ |----------------------------------|----------------|-----------|-------------------------|-----------------------|
+ | zarra | 64.00 | 244.14 | 41.92 | 2.39 |
+ | bojji | 124.88 | 476.40 | 81.79 | 1.22 |
+ | potion-multilingual-128M | 128.09 | 488.63 | 83.89 | 1.19 |
+ | paraphrase-multilingual-MiniLM-… | 117.65 | 448.82 | 77.06 | 1.30 |
+ | silma_ai_embedding_sts_v0.1 | 135.19 | 515.72 | 88.54 | 1.13 |
+ | muffakir_embedding | 135.19 | 515.72 | 88.54 | 1.13 |
+ | arabic_retrieval_v1.0 | 135.19 | 515.73 | 88.54 | 1.13 |
+ | arabic_triplet_matryoshka_v2 | 135.19 | 515.72 | 88.54 | 1.13 |
+ | get_multilingual_base | 305.37 | 582.45 | 100.00 | 1.00 |
+
+ - Zarra is the smallest model, with only 64 million parameters and 244.14 MB on disk, making it 2.39 times smaller than the largest model (get_multilingual_base).
+ - Bojji is slightly larger at 124.88 million parameters and 476.40 MB, but still significantly smaller than most other models.
+ - Top performer: Zarra leads in compactness, offering the smallest footprint, which is critical for deployment on resource-limited devices.
+ - Key observation: the compact size of Zarra and Bojji aligns with their design goal of efficiency, making them highly suitable for edge computing and real-time applications.
+
 | Model | Avg | MIRAC | MLQAR | Massi | Multi | STS17 | STS22 | XNLI_ |
 |---------------------------------------|-------|-------|-------|-------|-------|-------|-------|-------|
 | arabic_triplet_matryoshka_v2 | 0.6610 | 0.6262 | 0.5093 | 0.5577 | 0.5868 | 0.8531 | 0.6396 | 0.8542 |

@@ -79,77 +170,6 @@ The distilled models can be up to 50 times smaller and 500 times faster than
 | paraphrase-multilingual-MiniLM-L12-v2 | 0.491 |
 | all_minilm_l6_v2 | 0.252 |

- ## Speed
-
- | Model | Speed (sentences/second) | Device |
- |---------------------------------------|--------------------------|--------|
- | zarra | 26893.63 | cpu |
- | bojji | 27478.15 | cpu |
- | potion-multilingual-128M | 27145.31 | cpu |
- | paraphrase-multilingual-MiniLM-L12-v2 | 2363.24 | cuda |
- | silma_ai_embedding_sts_v0.1 | 627.13 | cuda |
- | muffakir_embedding | 621.77 | cuda |
- | get_multilingual_base | 895.41 | cuda |
- | arabic_retrieval_v1.0 | 618.56 | cuda |
- | arabic_triplet_matryoshka_v2 | 610.64 | cuda |
-
- ## Size of the Model
-
- | Model | Parameters (M) | Size (MB) | Relative to Largest (%) | Less than Largest (x) |
- |----------------------------------|----------------|-----------|-------------------------|-----------------------|
- | zarra | 64.00 | 244.14 | 41.92 | 2.39 |
- | bojji | 124.88 | 476.40 | 81.79 | 1.22 |
- | potion-multilingual-128M | 128.09 | 488.63 | 83.89 | 1.19 |
- | paraphrase-multilingual-MiniLM-… | 117.65 | 448.82 | 77.06 | 1.30 |
- | silma_ai_embedding_sts_v0.1 | 135.19 | 515.72 | 88.54 | 1.13 |
- | muffakir_embedding | 135.19 | 515.72 | 88.54 | 1.13 |
- | arabic_retrieval_v1.0 | 135.19 | 515.73 | 88.54 | 1.13 |
- | arabic_triplet_matryoshka_v2 | 135.19 | 515.72 | 88.54 | 1.13 |
- | get_multilingual_base | 305.37 | 582.45 | 100.00 | 1.00 |
-
- ## Installation
-
- Install model2vec using pip:
- ```
- pip install model2vec
- ```
-
- ## Usage
-
- ### Using Model2Vec
-
- The [Model2Vec library](https://github.com/MinishLab/model2vec) is the fastest and most lightweight way to run Model2Vec models.
-
- Load this model using the `from_pretrained` method:
- ```python
- from model2vec import StaticModel
-
- # Load a pretrained Model2Vec model
- model = StaticModel.from_pretrained("NAMAA-Space/zarra")
-
- # Compute text embeddings
- embeddings = model.encode(["Example sentence"])
- ```
-
- ### Using Sentence Transformers
-
- You can also use the [Sentence Transformers library](https://github.com/UKPLab/sentence-transformers) to load and use the model:
-
- ```python
- from sentence_transformers import SentenceTransformer
-
- # Load a pretrained Sentence Transformer model
- model = SentenceTransformer("NAMAA-Space/zarra")
-
- # Compute text embeddings
- embeddings = model.encode(["Example sentence"])
- ```
-
- ## How it Works
-
- Model2Vec creates a small, fast, and powerful model that outperforms other static embedding models by a large margin on all tasks we could find, while being much faster to create than traditional static embedding models such as GloVe. Best of all, you don't need any data to distill a model using Model2Vec.
-
- It works by passing a vocabulary through a sentence transformer model, then reducing the dimensionality of the resulting embeddings using PCA, and finally weighting the embeddings using [SIF weighting](https://openreview.net/pdf?id=SyK00v5xx). During inference, we simply take the mean of all token embeddings occurring in a sentence.

 ## Additional Resources
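
The "How it Works" recipe in the diff above (embed a vocabulary with a sentence transformer, reduce with PCA, apply SIF weighting, average token vectors at inference) can be sketched in a few lines. This is a toy illustration, not the model2vec implementation: it uses random vectors in place of a real teacher model and a made-up five-word vocabulary with assumed frequencies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary with made-up corpus frequencies (a real tokenizer
# vocabulary would have tens of thousands of entries).
vocab = {"the": 0.60, "cat": 0.20, "sat": 0.10, "on": 0.05, "mat": 0.05}
tokens = list(vocab)

# 1) "Teacher" embeddings: Model2Vec obtains these by passing each
#    vocabulary token through a sentence transformer; random stand-ins here.
teacher = rng.normal(size=(len(tokens), 8))

# 2) PCA via SVD: keep the top 2 principal components (8 -> 2 dims).
centered = teacher - teacher.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
reduced = centered @ vt[:2].T

# 3) SIF weighting: w(t) = a / (a + p(t)) downweights frequent tokens.
a = 1e-3
weights = np.array([a / (a + vocab[t]) for t in tokens])
table = reduced * weights[:, None]  # this table IS the static model

# 4) Inference: a sentence embedding is the mean of its token vectors.
def encode(sentence: str) -> np.ndarray:
    ids = [tokens.index(t) for t in sentence.lower().split() if t in vocab]
    return table[ids].mean(axis=0)

print(encode("the cat sat on the mat").shape)  # (2,)
```

The CPU speed numbers in the tables above follow from this design: once distilled, inference is only a table lookup plus a mean, with no transformer forward pass.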
175