aari1995/German_Semantic_V3b

Tags: Sentence Similarity, sentence-transformers, feature-extraction, loss:MatryoshkaLoss, text-embeddings-inference


aari1995 committed on Jun 19, 2024

Commit 8d09561 (verified) · 1 Parent(s): b0ded5f

Update README.md

Files changed (1)

README.md +27 -4

@@ -280,11 +280,34 @@ model-index:
 The successor of German_Semantic_STS_V2 is here!
-## Major updates:
-- **Sequence length: 8192, (16 times more than V2 and other models) => thanks to the alibi implementation of Jina-Team!**
-- **Matryoshka Embeddings: Your embeddings can be sized from 1024 down to 64**
-- **License: Apache 2.0**
+## Major updates and USPs:
+- **Sequence length:** 8192 (16 times more than V2 and other models), thanks to the ALiBi implementation by the Jina team!
+- **Matryoshka Embeddings:** The model is trained for embedding sizes from 1024 down to 64, allowing you to store much smaller embeddings with little quality loss.
+- **License:** Apache 2.0
+- **German only:** The model is trained on German only, which lets it learn more efficiently and handle shorter queries better.
+## Usage:
+```python
+from sentence_transformers import SentenceTransformer
+matryoshka_dim = 1024 # How big your embeddings should be, choose from: 64, 128, 256, 512, 1024
+model = SentenceTransformer("aari1995/German_Semantic_V3", trust_remote_code=True, truncate_dim=matryoshka_dim)
+# Run inference
+sentences = [
+    'Eine Flagge weht.',
+    'Die Flagge bewegte sich in der Luft.',
+    'Zwei Personen beobachten das Wasser.',
+]
+embeddings = model.encode(sentences)
+# Get the similarity scores for the embeddings
+similarities = model.similarity(embeddings, embeddings)
+```
 ## Model Details
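
The `truncate_dim` argument in the added usage snippet relies on the Matryoshka property: the leading components of an embedding are trained to be usable on their own. As a rough illustration of what that means, the sketch below truncates toy vectors by hand and compares cosine similarity before and after. The helper names (`truncate_and_normalize`, `cosine_similarity`) and the 8-dimensional toy vectors are hypothetical stand-ins, not part of the model or of sentence-transformers, and the renormalization step is an assumption that mainly matters if you later compare vectors with a dot product.

```python
import math


def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale the result to unit length."""
    if not 0 < dim <= len(embedding):
        raise ValueError("dim must be between 1 and the embedding length")
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Toy 8-dimensional vectors standing in for real 1024-dimensional embeddings.
full_a = [0.9, 0.1, 0.3, 0.2, 0.05, 0.02, 0.01, 0.03]
full_b = [0.8, 0.2, 0.25, 0.1, 0.01, 0.04, 0.02, 0.05]

# Hand-rolled equivalent of asking for a smaller embedding size.
small_a = truncate_and_normalize(full_a, 4)
small_b = truncate_and_normalize(full_b, 4)

print("dimensions kept:", len(small_a))
print("similarity (full):", cosine_similarity(full_a, full_b))
print("similarity (truncated):", cosine_similarity(small_a, small_b))
```

With real embeddings from the model, the same comparison at 64, 128, 256, 512, and 1024 dimensions is one way to check how much quality a smaller storage size costs on your own data.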