---
pipeline_tag: sentence-similarity
tags:
- feature-extraction
license: mit
language:
- fr
- en
model-index:
- name: Solon-embeddings-base-0.1
  results:
  - task:
      type: sentence-similarity
      name: Passage Retrieval
    dataset:
      type: unicamp-dl/mmarco
      name: mMARCO-fr
      config: french
      split: validation
    metrics:
    - type: recall_at_500
      name: Recall@500
      value: 90.9
    - type: recall_at_100
      name: Recall@100
      value: 80.6
    - type: recall_at_10
      name: Recall@10
      value: 52.5
    - type: map_at_10
      name: MAP@10
      value: 27.4
    - type: ndcg_at_10
      name: nDCG@10
      value: 33.5
    - type: mrr_at_10
      name: MRR@10
      value: 27.9
---
# Solon Embeddings — Base 0.1

SOTA open-source French embedding model.

**Instructions:**
Prepend "query : " to the *query* text to improve retrieval performance.
No prefix is needed for *passages*.
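
A minimal usage sketch, assuming the checkpoint loads through the sentence-transformers library (the example query, passages, and similarity call are illustrative, not an official snippet):

```python
from sentence_transformers import SentenceTransformer, util

# Load the model from the Hugging Face Hub
model = SentenceTransformer("OrdalieTech/Solon-embeddings-base-0.1")

# Prepend "query : " to the query only; passages are encoded as-is
query = "query : Quelle est la capitale de la France ?"
passages = [
    "Paris est la capitale et la plus grande ville de France.",
    "Le Mont-Blanc est le plus haut sommet des Alpes.",
]

query_emb = model.encode(query, normalize_embeddings=True)
passage_embs = model.encode(passages, normalize_embeddings=True)

# Cosine similarity between the query and each passage
print(util.cos_sim(query_emb, passage_embs))
```
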
| Model | Mean Score |
| --- | --- |
| **OrdalieTech/Solon-embeddings-large-0.1** | 0.7490 |
| cohere/embed-multilingual-v3 | 0.7402 |
| **OrdalieTech/Solon-embeddings-base-0.1** | 0.7306 |
| openai/ada-002 | 0.7290 |
| cohere/embed-multilingual-light-v3 | 0.6945 |
| antoinelouis/biencoder-camembert-base-mmarcoFR | 0.6826 |
| dangvantuan/sentence-camembert-large | 0.6756 |
| voyage/voyage-01 | 0.6753 |
| intfloat/multilingual-e5-large | 0.6660 |
| intfloat/multilingual-e5-base | 0.6597 |
| Sbert/paraphrase-multilingual-mpnet-base-v2 | 0.5975 |
| dangvantuan/sentence-camembert-base | 0.5456 |
| EuropeanParliament/eubert_embedding_v1 | 0.5063 |

These results were obtained on 9 French benchmarks covering a variety of text-similarity tasks (classification, reranking, STS):
- AmazonReviewsClassification (MTEB)
- MassiveIntentClassification (MTEB)
- MassiveScenarioClassification (MTEB)
- MTOPDomainClassification (MTEB)
- MTOPIntentClassification (MTEB)
- STS22 (MTEB)
- MiraclFRRerank (Miracl)
- OrdalieFRSTS (Ordalie)
- OrdalieFRReranking (Ordalie)

We created OrdalieFRSTS and OrdalieFRReranking to strengthen the benchmarking of French STS and reranking.

(Evaluation script available here: github.com/OrdalieTech/mteb)
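
A hedged sketch of how such an evaluation could be launched with the standard MTEB interface; the Ordalie-specific tasks are assumed to be registered only in the OrdalieTech/mteb fork, so refer to that repository for the authoritative script:

```python
from mteb import MTEB
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("OrdalieTech/Solon-embeddings-base-0.1")

# Subset of the French benchmarks listed above (standard MTEB tasks);
# OrdalieFRSTS and OrdalieFRReranking would only resolve inside the fork.
tasks = ["STS22", "MassiveIntentClassification", "MassiveScenarioClassification"]

evaluation = MTEB(tasks=tasks, task_langs=["fr"])
evaluation.run(model, output_folder="results/solon-embeddings-base-0.1")
```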