antoinelouis/biencoder-camembert-base-mmarcoFR

@@ -38,8 +38,6 @@ embeddings = model.encode(sentences)
 print(embeddings)
 ```
 #### 🤗 Transformers
 Without [sentence-transformers](https://www.SBERT.net), you can use the model as follows: first, pass your input through the transformer model, then apply the appropriate pooling operation on top of the contextualized word embeddings.
@@ -77,24 +75,21 @@ print("Sentence embeddings:")
 print(sentence_embeddings)
 ```
 ## Evaluation
 ***
 We evaluated our model on the smaller development set of mMARCO-fr, which consists of 6,980 queries for a corpus of 8.8M candidate passages. Below, we compare the model's performance with other biencoder models fine-tuned on the same dataset. We report the mean reciprocal rank (MRR), normalized discounted cumulative gain (NDCG), mean average precision (MAP), and recall at various cut-offs (R@k).
-|    | model                                                                                                                                                                            |  Size |   MRR@10 |   NDCG@10 |   MAP@10 |   R@10 |   R@100(↑) |   R@500 |
-|---:|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------:|---------:|----------:|---------:|-------:|-----------:|--------:|
-|  1 | **biencoder-camembert-base-mmarcoFR**                                                                                                                                            | 443MB |    28.53 |     33.72 |    27.93 |  51.46 |      77.82 |   89.13 |
-|  2 | [biencoder-all-mpnet-base-v2-mmarcoFR](https://huggingface.co/antoinelouis/biencoder-all-mpnet-base-v2-mmarcoFR)                                                                 | 438MB |    28.04 |     33.28 |    27.5  |  51.07 |      77.68 |   88.67 |
-|  3 | [biencoder-sentence-camembert-base-mmarcoFR](https://huggingface.co/antoinelouis/biencoder-sentence-camembert-base-mmarcoFR)                                                     | 443MB |    27.63 |     32.7  |    27.01 |  50.10 |      76.85 |   88.73 |
-|  4 | [biencoder-distilcamembert-base-mmarcoFR](https://huggingface.co/antoinelouis/biencoder-distilcamembert-base-mmarcoFR)                                                           | 272MB |    26.80 |     31.87 |    26.23 |  49.20 |      76.44 |   87.87 |
-|  5 | [biencoder-mMiniLMv2-L12-H384-distilled-from-XLMR-Large-mmarcoFR](https://huggingface.co/antoinelouis/biencoder-mMiniLMv2-L12-H384-distilled-from-XLMR-Large-mmarcoFR)           | 471MB |    24.74 |     29.41 |    24.23 |  45.40 |      71.52 |   84.42 |
-|  6 | [biencoder-camemberta-base-mmarcoFR](https://huggingface.co/antoinelouis/biencoder-camemberta-base-mmarcoFR)                                                                     | 447MB |    24.78 |     29.24 |    24.23 |  44.58 |      69.59 |   82.18 |
-|  7 | [biencoder-electra-base-french-europeana-cased-discriminator-mmarcoFR](https://huggingface.co/antoinelouis/biencoder-electra-base-french-europeana-cased-discriminator-mmarcoFR) | 440MB |    23.38 |     27.97 |    22.91 |  43.50 |      68.96 |   81.61 |
-|  8 | [biencoder-mMiniLM-L6-v2-mmarco-mmarcoFR](https://huggingface.co/antoinelouis/biencoder-mMiniLM-L6-v2-mmarco-mmarcoFR)                                                           | 428MB |    22.87 |     27.26 |    22.37 |  42.3  |      68.78 |   81.39 |
-|  9 | [biencoder-mMiniLMv2-L6-H384-distilled-from-XLMR-Large-mmarcoFR](https://huggingface.co/antoinelouis/biencoder-mMiniLMv2-L6-H384-distilled-from-XLMR-Large-mmarcoFR)             | 428MB |    22.29 |     26.57 |    21.8  |  41.25 |      66.78 |   79.83 |
 ## Training
 ***
@@ -116,8 +111,6 @@ We used the French version of the [mMARCO](https://huggingface.co/datasets/unica
 - a smaller dev set of 6,980 queries (which is actually used for evaluation in most published works).
 Link: [https://ir-datasets.com/mmarco.html#mmarco/v2/fr/](https://ir-datasets.com/mmarco.html#mmarco/v2/fr/)
 ## Citation
 ```bibtex

 print(embeddings)
 ```
 #### 🤗 Transformers
 Without [sentence-transformers](https://www.SBERT.net), you can use the model as follows: first, pass your input through the transformer model, then apply the appropriate pooling operation on top of the contextualized word embeddings.
 print(sentence_embeddings)
 ```
 ## Evaluation
 ***
 We evaluated our model on the smaller development set of mMARCO-fr, which consists of 6,980 queries for a corpus of 8.8M candidate passages. Below, we compare the model's performance with other biencoder models fine-tuned on the same dataset. We report the mean reciprocal rank (MRR), normalized discounted cumulative gain (NDCG), mean average precision (MAP), and recall at various cut-offs (R@k).
+|    | model                                                                                                                   | Vocab. | #Param. |  Size |   MRR@10 |   NDCG@10 |   MAP@10 |   R@10 |   R@100(↑) |   R@500 |
+|---:|:------------------------------------------------------------------------------------------------------------------------|:-------|--------:|------:|---------:|----------:|---------:|-------:|-----------:|--------:|
+|  1 | **biencoder-camembert-base-mmarcoFR**                                                                                   |     🇫🇷 |    110M | 443MB |    28.53 |     33.72 |    27.93 |  51.46 |      77.82 |   89.13 |
+|  2 | [biencoder-mpnet-base-all-v2-mmarcoFR](https://huggingface.co/antoinelouis/biencoder-mpnet-base-all-v2-mmarcoFR)        |     🇬🇧 |    109M | 438MB |    28.04 |     33.28 |    27.50 |  51.07 |      77.68 |   88.67 |
+|  3 | [biencoder-distilcamembert-mmarcoFR](https://huggingface.co/antoinelouis/biencoder-distilcamembert-mmarcoFR)            |     🇫🇷 |     68M | 272MB |    26.80 |     31.87 |    26.23 |  49.20 |      76.44 |   87.87 |
+|  4 | [biencoder-MiniLM-L6-all-v2-mmarcoFR](https://huggingface.co/antoinelouis/biencoder-MiniLM-L6-all-v2-mmarcoFR)          |     🇬🇧 |     23M |  91MB |    25.49 |     30.39 |    24.99 |  47.10 |      73.48 |   86.09 |
+|  5 | [biencoder-mMiniLMv2-L12-mmarcoFR](https://huggingface.co/antoinelouis/biencoder-mMiniLMv2-L12-mmarcoFR)                | 🇫🇷,99+ |    117M | 471MB |    24.74 |     29.41 |    24.23 |  45.40 |      71.52 |   84.42 |
+|  6 | [biencoder-camemberta-base-mmarcoFR](https://huggingface.co/antoinelouis/biencoder-camemberta-base-mmarcoFR)            |     🇫🇷 |    112M | 447MB |    24.78 |     29.24 |    24.23 |  44.58 |      69.59 |   82.18 |
+|  7 | [biencoder-electra-base-french-mmarcoFR](https://huggingface.co/antoinelouis/biencoder-electra-base-french-mmarcoFR)    |     🇫🇷 |    110M | 440MB |    23.38 |     27.97 |    22.91 |  43.50 |      68.96 |   81.61 |
+|  8 | [biencoder-mMiniLMv2-L6-mmarcoFR](https://huggingface.co/antoinelouis/biencoder-mMiniLMv2-L6-mmarcoFR)                  | 🇫🇷,99+ |    107M | 428MB |    22.29 |     26.57 |    21.80 |  41.25 |      66.78 |   79.83 |
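For readers unfamiliar with the ranking metrics in the table, the following is a minimal sketch of how MRR@k and R@k are commonly computed from a ranked list of passage ids per query. It is not the evaluation script used for this model card: the function names, the `run` and `qrels` dictionaries, and the toy passage ids are all hypothetical. The table values look like percentages, so the fractions computed here would presumably be multiplied by 100 for comparison.

```python
from typing import Callable, Dict, List, Set


def mrr_at_k(ranked: List[str], relevant: Set[str], k: int = 10) -> float:
    """Reciprocal rank of the first relevant passage in the top k, or 0.0 if none."""
    for rank, passage_id in enumerate(ranked[:k], start=1):
        if passage_id in relevant:
            return 1.0 / rank
    return 0.0


def recall_at_k(ranked: List[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of the relevant passages that appear in the top k."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def mean_over_queries(
    metric: Callable[[List[str], Set[str], int], float],
    run: Dict[str, List[str]],
    qrels: Dict[str, Set[str]],
    k: int,
) -> float:
    """Average a per-query metric over every query that has relevance judgments."""
    scores = [metric(run.get(qid, []), rel, k) for qid, rel in qrels.items()]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: ranked passage ids per query, and the relevant ids.
run = {"q1": ["p3", "p1", "p7"], "q2": ["p9", "p2", "p4"]}
qrels = {"q1": {"p1"}, "q2": {"p4", "p5"}}

mrr_10 = mean_over_queries(mrr_at_k, run, qrels, k=10)
recall_2 = mean_over_queries(recall_at_k, run, qrels, k=2)
print(f"MRR@10 = {mrr_10:.4f}, R@2 = {recall_2:.4f}")
```

In this toy run, the first relevant passage for `q1` sits at rank 2 and for `q2` at rank 3, so MRR@10 averages 1/2 and 1/3; R@2 only credits `q1`, whose single relevant passage is within the top 2.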
 ## Training
 ***
 - a smaller dev set of 6,980 queries (which is actually used for evaluation in most published works).
 Link: [https://ir-datasets.com/mmarco.html#mmarco/v2/fr/](https://ir-datasets.com/mmarco.html#mmarco/v2/fr/)
 ## Citation
 ```bibtex