Update README.md
README.md
Model2Vec models are the smallest, fastest, and most performant static embedders available.
The distilled models can be up to 50 times smaller and 500 times faster than traditional Sentence Transformers.
## Installation

Install model2vec using pip:
```
pip install model2vec
```
## Usage

### Using Model2Vec

The [Model2Vec library](https://github.com/MinishLab/model2vec) is the fastest and most lightweight way to run Model2Vec models.

Load this model using the `from_pretrained` method:
```python
from model2vec import StaticModel

# Load a pretrained Model2Vec model
model = StaticModel.from_pretrained("NAMAA-Space/zarra")

# Compute text embeddings
embeddings = model.encode(["Example sentence"])
```
### Using Sentence Transformers

You can also use the [Sentence Transformers library](https://github.com/UKPLab/sentence-transformers) to load and use the model:

```python
from sentence_transformers import SentenceTransformer

# Load a pretrained Sentence Transformer model
model = SentenceTransformer("NAMAA-Space/zarra")

# Compute text embeddings
embeddings = model.encode(["Example sentence"])
```
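The embeddings can then be compared directly; below is a minimal sketch of similarity scoring with the `util` helpers from sentence-transformers (the example sentences are arbitrary):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("NAMAA-Space/zarra")

# Encode two paraphrases and score their semantic similarity
embeddings = model.encode(["القطة تجلس على السجادة", "هناك قطة على السجادة"])
similarity = util.cos_sim(embeddings[0], embeddings[1])
print(float(similarity))  # closer to 1.0 means more similar
```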
## How it Works

Model2vec creates a small, fast, and powerful model that outperforms other static embedding models by a large margin on all tasks we could find, while being much faster to create than traditional static embedding models such as GloVe. Best of all, you don't need any data to distill a model using Model2Vec.

It works by passing a vocabulary through a sentence transformer model, then reducing the dimensionality of the resulting embeddings using PCA, and finally weighting the embeddings using [SIF weighting](https://openreview.net/pdf?id=SyK00v5xx). During inference, we simply take the mean of all token embeddings occurring in a sentence.
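This distillation pipeline is exposed by the model2vec library itself; here is a minimal sketch, assuming the distillation extras are installed (`pip install model2vec[distill]`), with a placeholder teacher model and PCA dimensionality:

```python
from model2vec.distill import distill

# Distill a static model from a Sentence Transformer teacher
# (teacher name and pca_dims are illustrative choices).
m2v_model = distill(
    model_name="sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
    pca_dims=256,
)

# Save the result; it can be reloaded with StaticModel.from_pretrained
m2v_model.save_pretrained("my-distilled-model")
```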
## Benchmark on Arabic

| Model | Avg | MIRAC | MLQAR | Massi | Multi | STS17 | STS22 | XNLI_ |
|---------------------------------------|-------|-------|-------|-------|-------|-------|-------|-------|
| arabic_triplet_matryoshka_v2 | 0.6610 | 0.6262 | 0.5093 | 0.5577 | 0.5868 | 0.8531 | 0.6396 | 0.8542 |
| paraphrase-multilingual-MiniLM-L12-v2 | 0.491 | | | | | | | |
| all_minilm_l6_v2 | 0.252 | | | | | | | |
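The task columns appear to be truncated MTEB task names; purely as an illustration, a score for a single task can be produced with the [mteb](https://github.com/embeddings-benchmark/mteb) package (the task selection and output folder here are assumptions, not the exact benchmark setup):

```python
from mteb import MTEB
from sentence_transformers import SentenceTransformer

# Evaluate the model on one task; STS17 is shown as an example
model = SentenceTransformer("NAMAA-Space/zarra")
evaluation = MTEB(tasks=["STS17"])
results = evaluation.run(model, output_folder="results/zarra")
```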
## Speed

| Model | Speed (sentences/second) | Device |
|---------------------------------------|--------------------------|--------|
| zarra | 26893.63 | cpu |
| bojji | 27478.15 | cpu |
| potion-multilingual-128M | 27145.31 | cpu |
| paraphrase-multilingual-MiniLM-L12-v2 | 2363.24 | cuda |
| silma_ai_embedding_sts_v0.1 | 627.13 | cuda |
| muffakir_embedding | 621.77 | cuda |
| get_multilingual_base | 895.41 | cuda |
| arabic_retrieval_v1.0 | 618.56 | cuda |
| arabic_triplet_matryoshka_v2 | 610.64 | cuda |
- Zarra and Bojji excel in speed, achieving 26893.63 and 27478.15 sentences per second on CPU, respectively, far surpassing CUDA-based models like arabic_triplet_matryoshka_v2 (610.64).
- Top Performer: Bojji is the fastest model, slightly ahead of potion-multilingual-128M (27145.31) and Zarra, highlighting the efficiency of Model2Vec-based models on CPU.
- Key Observation: The high speed of Zarra and Bojji on CPU makes them ideal for resource-constrained environments, offering significant advantages over CUDA-dependent models; a rough timing sketch follows below.
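A sentences-per-second figure like the ones above can be approximated with a simple timing loop; this is a rough sketch (the workload size and sentence content are arbitrary and will not reproduce the exact numbers):

```python
import time

from model2vec import StaticModel

model = StaticModel.from_pretrained("NAMAA-Space/zarra")

# Illustrative workload: 10,000 copies of a short sentence
sentences = ["هذه جملة قصيرة للاختبار"] * 10_000

start = time.perf_counter()
model.encode(sentences)
elapsed = time.perf_counter() - start
print(f"{len(sentences) / elapsed:.2f} sentences/second on CPU")
```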
## Size of the Model

| Model | Parameters (M) | Size (MB) | Relative to Largest (%) | Less than Largest (x) |
|----------------------------------------|----------------|-----------|-------------------------|-----------------------|
| zarra | 64.00 | 244.14 | 41.92 | 2.39 |
| bojji | 124.88 | 476.40 | 81.79 | 1.22 |
| potion-multilingual-128M | 128.09 | 488.63 | 83.89 | 1.19 |
| paraphrase-multilingual-MiniLM-L12-v2 | 117.65 | 448.82 | 77.06 | 1.30 |
| silma_ai_embedding_sts_v0.1 | 135.19 | 515.72 | 88.54 | 1.13 |
| muffakir_embedding | 135.19 | 515.72 | 88.54 | 1.13 |
| arabic_retrieval_v1.0 | 135.19 | 515.73 | 88.54 | 1.13 |
| arabic_triplet_matryoshka_v2 | 135.19 | 515.72 | 88.54 | 1.13 |
| get_multilingual_base | 305.37 | 582.45 | 100.00 | 1.00 |
- Zarra is the smallest model, at only 64 million parameters and 244.14 MB, making it 2.39 times smaller than the largest model (get_multilingual_base).
- Bojji is slightly larger at 124.88 million parameters and 476.40 MB, but still significantly smaller than most other models.
- Top Performer: Zarra leads in compactness, offering the smallest footprint, which is critical for deployment on resource-limited devices.
- Key Observation: The compact size of Zarra and Bojji aligns with their design goal of efficiency, making them highly suitable for edge computing and real-time applications; the quick size check below shows how these figures relate.
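The Size column tracks parameters times bytes per parameter; here is a quick sanity check (the float32/float16 assumptions are inferred from the numbers, not stated by the benchmark):

```python
# Size (MB) ≈ parameters × bytes per parameter / 2**20
print(64.00e6 * 4 / 2**20)    # zarra, float32 weights         -> ~244.14 MB
print(135.19e6 * 4 / 2**20)   # arabic_triplet_matryoshka_v2   -> ~515.71 MB
print(305.37e6 * 2 / 2**20)   # get_multilingual_base, float16 -> ~582.45 MB
```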
## Additional Resources