Update README.md
```diff
@@ -177,7 +177,7 @@ You can finetune this model on your own dataset.
 | sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 | 118 | 0.456867 | 0.21345 | 0.67409 | 0.25676 | 0.45903 | 0.71491 | 0.42296 | nan |
 
 #### Performance Comparison by Model Size (Based on Average NDCG@10)
-<img src="https://cdn-uploads.huggingface.co/production/uploads/642b0c2fecec03b4464a1d9b/Ba2bVpPlB7egF80USITJ5.png" width="
+<img src="https://cdn-uploads.huggingface.co/production/uploads/642b0c2fecec03b4464a1d9b/Ba2bVpPlB7egF80USITJ5.png" width="1000"/>
 
 
 <!--
```
```diff
@@ -364,7 +364,7 @@ pip install -U sentence-transformers
 - Tokenizers: 0.21.1
 
 ## FAQ
-1. Do I need to add the prefix "query: " and "passage: " to input texts
+**1. Do I need to add the prefix "query: " and "passage: " to input texts?**
 
 Yes, this is how the model is trained, otherwise you will see a performance degradation.
 
```
```diff
@@ -376,7 +376,7 @@ Use "query: " prefix for symmetric tasks such as semantic similarity, bitext mining
 
 Use "query: " prefix if you want to use embeddings as features, such as linear probing classification, clustering.
 
-2. Why does the cosine similarity scores distribute around 0.7 to 1.0
+**2. Why does the cosine similarity scores distribute around 0.7 to 1.0?**
 
 This is a known and expected behavior as we use a low temperature 0.01 for InfoNCE contrastive loss.
 
```
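The first FAQ entry above says every input must carry a `"query: "` or `"passage: "` prefix to match how the model was trained. A minimal sketch of applying that convention before encoding — `with_prefix` is a hypothetical helper for illustration, not part of any library:

```python
def with_prefix(texts, kind):
    """Prepend the E5-style prefix the FAQ requires ("query: " or "passage: ")."""
    if kind not in ("query", "passage"):
        raise ValueError("kind must be 'query' or 'passage'")
    return [f"{kind}: {t}" for t in texts]

queries = with_prefix(["how to finetune this model"], "query")
docs = with_prefix(["You can finetune this model on your own dataset."], "passage")
# queries[0] == "query: how to finetune this model"
# These prefixed strings are what you would pass to model.encode(...).
```

Per the FAQ that follows, symmetric tasks (semantic similarity, bitext mining) and feature-extraction uses take the `"query: "` prefix on both sides; only asymmetric retrieval pairs `"query: "` with `"passage: "`.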
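The second FAQ answer hinges on that 0.01 temperature: InfoNCE divides cosine similarities by tau before the softmax, so a cosine gap of a few hundredths already saturates the loss, and training has no pressure to spread scores across the full [-1, 1] range. A sketch with made-up similarity scores:

```python
import math

tau = 0.01                    # InfoNCE temperature from the FAQ
cos = [0.92, 0.87, 0.85]      # hypothetical positive + two in-batch negatives
logits = [c / tau for c in cos]

# Numerically stable softmax over the scaled similarities.
m = max(logits)
exps = [math.exp(l - m) for l in logits]
total = sum(exps)
probs = [e / total for e in exps]
# probs[0] ≈ 0.99: a 0.05 cosine gap is already decisive at tau = 0.01.
```

This is why scores clustering between 0.7 and 1.0 is harmless for retrieval: only the relative order of similarities matters, not their absolute values.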