Tags: Sentence Similarity · sentence-transformers · Safetensors · Russian · xlm-roberta · feature-extraction · text-embeddings-inference
Instructions to use deepvk/USER-bge-m3 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- sentence-transformers
How to use deepvk/USER-bge-m3 with sentence-transformers:
```python
from sentence_transformers import SentenceTransformer

# Load the model from the Hugging Face Hub.
model = SentenceTransformer("deepvk/USER-bge-m3")

sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]

# Encode the sentences and compute the pairwise similarity matrix.
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # [3, 3]
```

- Inference
- Notebooks
- Google Colab
- Kaggle
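In the snippet above, `model.similarity` returns the pairwise cosine similarities between the embeddings. As a minimal sketch of what that matrix contains (plain NumPy, independent of the model):

```python
import numpy as np

def cosine_similarity_matrix(emb: np.ndarray) -> np.ndarray:
    # Normalize each row to unit length; the pairwise dot products
    # of unit vectors are their cosine similarities.
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    return unit @ unit.T

# Toy embeddings standing in for three encoded sentences.
emb = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
sim = cosine_similarity_matrix(emb)
print(sim.shape)  # (3, 3); the diagonal is all 1.0
```

Semantically similar sentences (like the first two in the example above) yield entries close to 1, unrelated ones entries close to 0.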
Corrected a typo in the 'TatonkaHF/bge-m3_en_ru' model name in the Initialization section.
#2 · opened by shlepakov
README.md
CHANGED
```diff
@@ -108,7 +108,7 @@ Also, you can use native [FlagEmbedding](https://github.com/FlagOpen/FlagEmbeddi
 We follow the [`USER-base`](https://huggingface.co/deepvk/USER-base) model training algorithm, with several changes as we use different backbone.
 
-**Initialization:** [`TatonkaHF/bge-
+**Initialization:** [`TatonkaHF/bge-m3_en_ru`](https://huggingface.co/TatonkaHF/bge-m3_en_ru) – shrinked version of [`baai/bge-m3`](https://huggingface.co/BAAI/bge-m3) to support only Russian and English tokens.
 
 **Fine-tuning:** Supervised fine-tuning two different models based on data symmetry and then merging via [`LM-Cocktail`](https://arxiv.org/abs/2311.13534):
```
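The merging step named in the diff, LM-Cocktail, combines fine-tuned models by weighted-averaging their parameters. A minimal sketch of that core idea on toy state dicts; the parameter name and the 0.5/0.5 weights here are hypothetical, not the actual USER-bge-m3 recipe:

```python
import numpy as np

def cocktail_merge(state_dicts, weights):
    """Merge model state dicts by weighted-averaging each parameter.

    Mirrors the core idea of LM-Cocktail: every parameter of the
    merged model is a convex combination of the input models.
    """
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
    merged = {}
    for name in state_dicts[0]:
        merged[name] = sum(w * sd[name] for w, sd in zip(weights, state_dicts))
    return merged

# Two toy "models" with a single parameter tensor each, standing in
# for the symmetric- and asymmetric-data fine-tunes.
sym = {"linear.weight": np.full((2, 2), 1.0)}
asym = {"linear.weight": np.full((2, 2), 3.0)}
merged = cocktail_merge([sym, asym], weights=[0.5, 0.5])
print(merged["linear.weight"])  # every entry is 2.0
```

The real method additionally tunes the mixing weights on a small validation set rather than fixing them at 0.5/0.5.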