rvo committed commit cdaa9fa (verified · 1 parent: 4ae24fd)

Upload README.md

Files changed (1):
  1. README.md +9 -5
README.md CHANGED
@@ -24,8 +24,9 @@ language:
  1. [Introduction](#introduction)
  2. [Technical Report](#technical-report)
  3. [Highlights](#highlights)
- 4. [Quickstart](#quickstart)
- 5. [Citation](#citation)
+ 4. [Benchmarks](#benchmark-comparison)
+ 5. [Quickstart](#quickstart)
+ 6. [Citation](#citation)
 
  # Introduction
 
@@ -48,12 +49,15 @@ A technical report detailing our proposed `LEAF` training procedure will be avai
  * **Flexible Architecture Support**: `mdbr-leaf-ir` supports asymmetric retrieval architectures enabling even greater retrieval results. [See below](#asymmetric-retrieval-setup) for more information.
  * **MRL and Quantization Support**: embedding vectors generated by `mdbr-leaf-ir` compress well when truncated (MRL) and can be stored using more efficient types like `int8` and `binary`. [See below](#mrl-truncation) for more information.
 
- ## Benchmark Comparison
+ ## Benchmark Comparison
 
  The table below shows the average BEIR benchmark scores (nDCG@10) for `mdbr-leaf-ir` compared to other retrieval models.
 
+ `mdbr-leaf-ir` ranks #1 on the BEIR public leaderboard, and when run in asymmetric "**(asym.)**" mode as described [here](#asymmetric-retrieval-setup), the results improve even further.
+
  | Model | Size | BEIR Avg. (nDCG@10) |
  |------------------------------------|------|----------------------|
+ | **mdbr-leaf-ir (asym.)** | 23M | **54.03** |
  | **mdbr-leaf-ir** | 23M | **53.55** |
  | snowflake-arctic-embed-s | 32M | 51.98 |
  | bge-small-en-v1.5 | 33M | 51.65 |
@@ -64,7 +68,7 @@ The table below shows the average BEIR benchmark scores (nDCG@10) for `mdbr-leaf
  | MiniLM-L6-v2 | 23M | 41.95 |
  | BM25 | – | 41.14 |
 
- [//]: # (| **mdbr-leaf-ir (asym.)** | 23M | **?** | )
+
 
 
  # Quickstart
@@ -114,7 +118,7 @@ for i, query in enumerate(queries):
 
  See full example notebook [here](https://huggingface.co/MongoDB/mdbr-leaf-ir/blob/main/transformers_example.ipynb).
 
- ## Asymmetric Retrieval Setup
+ ## Asymmetric Retrieval Setup
 
  `mdbr-leaf-ir` is *aligned* to [`snowflake-arctic-embed-m-v1.5`](https://huggingface.co/Snowflake/snowflake-arctic-embed-m-v1.5), the model it has been distilled from. This enables flexible architectures in which, for example, documents are encoded using the larger model, while queries can be encoded faster and more efficiently with the compact `leaf` model:
  ```python
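The Asymmetric Retrieval Setup introduced in this commit encodes documents with the larger `snowflake-arctic-embed-m-v1.5` model and queries with the compact `mdbr-leaf-ir` model; because the two models are aligned, retrieval reduces to ranking documents by cosine similarity across the two embedding spaces. A minimal sketch of that scoring step, with random vectors standing in for real model outputs (the function name and embedding dimension are illustrative assumptions, not taken from the README):

```python
import numpy as np

def rank_documents(query_emb: np.ndarray, doc_embs: np.ndarray) -> np.ndarray:
    """Rank documents by cosine similarity against one query embedding.

    In the asymmetric setup, query_emb would come from the compact
    mdbr-leaf-ir model and doc_embs from snowflake-arctic-embed-m-v1.5;
    alignment between the two models is what makes this cross-model
    dot product meaningful.
    """
    q = query_emb / np.linalg.norm(query_emb)
    d = doc_embs / np.linalg.norm(doc_embs, axis=1, keepdims=True)
    scores = d @ q                 # cosine similarity per document
    return np.argsort(-scores)     # indices, best match first

# Synthetic stand-ins for real embeddings (dimension chosen arbitrarily).
rng = np.random.default_rng(0)
query = rng.normal(size=256)
docs = rng.normal(size=(4, 256))
docs[2] = query + 0.01 * rng.normal(size=256)  # make doc 2 a near-duplicate

order = rank_documents(query, docs)
print(order[0])  # doc 2 ranks first
```

In practice the two `encode` calls would simply target different model checkpoints while the downstream similarity search stays unchanged.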