Update README.md
README.md CHANGED
@@ -280,11 +280,34 @@ model-index:
The successor of German_Semantic_STS_V2 is here!
-## Major updates:
-- **Sequence length: 8192, (16 times more than V2 and other models) => thanks to the alibi implementation of Jina-Team!**
-- **Matryoshka Embeddings: Your embeddings can be sized from 1024 down to 64**
-- **License: Apache 2.0**
+## Major updates and USPs:
+
+- **Sequence length:** 8192 (16 times more than V2 and other models), thanks to the ALiBi implementation of the Jina team!
+- **Matryoshka Embeddings:** The model is trained for embedding sizes from 1024 down to 64, allowing you to store much smaller embeddings with little quality loss.
+- **License:** Apache 2.0
+- **German only:** This model is German-only, which lets it learn more efficiently and deal better with shorter queries.
+
+## Usage:
+
+```python
+from sentence_transformers import SentenceTransformer
+
+
+matryoshka_dim = 1024 # How big your embeddings should be, choose from: 64, 128, 256, 512, 1024
+model = SentenceTransformer("aari1995/German_Semantic_V3", trust_remote_code=True, truncate_dim=matryoshka_dim)
+
+# Run inference
+sentences = [
+    'Eine Flagge weht.',
+    'Die Flagge bewegte sich in der Luft.',
+    'Zwei Personen beobachten das Wasser.',
+]
+embeddings = model.encode(sentences)
+
+# Get the similarity scores for the embeddings
+similarities = model.similarity(embeddings, embeddings)
+```
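Reviewer note: the snippet below is a rough, dependency-free sketch of what choosing a smaller Matryoshka dimension amounts to conceptually. It assumes truncation keeps the leading components of each vector and that similarity is cosine-based. The random vectors are stand-ins for real model output, and the helper names `truncate_and_normalize` and `cosine_similarity` are hypothetical, not part of the model or of `sentence_transformers`.

```python
import math
import random


def truncate_and_normalize(vector, dim):
    """Keep the first `dim` components and rescale them to unit length."""
    if not 0 < dim <= len(vector):
        raise ValueError("dim must be between 1 and len(vector)")
    head = vector[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # an all-zero vector cannot be normalized
    return [x / norm for x in head]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Stand-in data (assumption): three random 1024-dimensional vectors
# in place of embeddings produced by the model.
random.seed(0)
full_embeddings = [[random.gauss(0.0, 1.0) for _ in range(1024)] for _ in range(3)]

# Shrink every vector from 1024 to 64 dimensions.
small_embeddings = [truncate_and_normalize(v, 64) for v in full_embeddings]

# Pairwise similarity matrix on the truncated vectors.
similarities = [
    [cosine_similarity(a, b) for b in small_embeddings]
    for a in small_embeddings
]
```

With real embeddings, the interesting comparison is between the similarity matrices at 1024 and at 64 dimensions; random vectors like the ones here only demonstrate the mechanics, not the quality retention.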
## Model Details