Add comprehensive model card for E2Rank

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +139 -0
README.md ADDED
@@ -0,0 +1,139 @@
---
license: apache-2.0
library_name: transformers
pipeline_tag: feature-extraction
---

# E2Rank: Your Text Embedding can Also be an Effective and Efficient Listwise Reranker

We introduce $\textrm{E}^2\text{Rank}$, meaning **E**fficient **E**mbedding-based **Rank**ing (also meaning **Embedding-to-Rank**), which extends a single text embedding model to perform both high-quality retrieval and listwise reranking, thereby achieving strong effectiveness with remarkable efficiency.

This model is presented in the paper [$\text{E}^2\text{Rank}$: Your Text Embedding can Also be an Effective and Efficient Listwise Reranker](https://huggingface.co/papers/2510.22733).

**Project Page**: https://alibaba-nlp.github.io/E2Rank/
**Code**: https://github.com/Alibaba-NLP/E2Rank

<div align="center">
<img src="https://github.com/Alibaba-NLP/E2Rank/raw/main/assets/cover.png" width="90%" height="auto" />
<p style="width: 70%; margin-left: auto; margin-right: auto">
<b>(a)</b> Overview of E2Rank. <b>(b)</b> Average reranking performance on the BEIR benchmark, where E2Rank outperforms other baselines. <b>(c)</b> Reranking latency per query on the Covid dataset, where E2Rank achieves a several-fold speedup over RankQwen3.
</p>
</div>

## Introduction

We introduce $\textrm{E}^2\text{Rank}$, meaning **E**fficient **E**mbedding-based **Rank**ing (also meaning **Embedding-to-Rank**), which extends a single text embedding model to perform both high-quality retrieval and listwise reranking, thereby achieving strong effectiveness with remarkable efficiency.

By applying cosine similarity between the query and document embeddings as a unified ranking function, the listwise ranking prompt, which is constructed from the original query and its candidate documents, serves as an enhanced query enriched with signals from the top-K documents, akin to pseudo-relevance feedback (PRF) in traditional retrieval models. This design preserves the efficiency and representational quality of the base embedding model while significantly improving its reranking performance.
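
As a toy illustration of this unified ranking function (the three-dimensional vectors below are made up, standing in for real embedding-model outputs), reranking reduces to sorting candidates by cosine similarity against the embedding of the enhanced query:

```python
from math import sqrt

def cosine(a, b):
    # Plain cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

# Toy vectors standing in for model embeddings: one for the listwise ranking
# prompt (the "enhanced query") and one per candidate document.
prompt_emb = [0.9, 0.1, 0.2]
doc_embs = {"doc_a": [0.8, 0.2, 0.1], "doc_b": [0.1, 0.9, 0.3]}

# The reranked order is simply the candidates sorted by similarity to the prompt.
scores = {name: cosine(prompt_emb, emb) for name, emb in doc_embs.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)  # ['doc_a', 'doc_b']
```

Because scoring is a single embedding pass plus dot products, the reranking cost stays close to that of the base embedding model.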

Empirically, E2Rank achieves state-of-the-art results on the BEIR reranking benchmark and demonstrates competitive performance on the reasoning-intensive BRIGHT benchmark, with very low reranking latency. We also show that the ranking training process improves embedding performance on the MTEB benchmark. Our findings indicate that a single embedding model can effectively unify retrieval and reranking, offering both computational efficiency and competitive ranking accuracy.

Our work highlights the potential of single embedding models to serve as unified retrieval-reranking engines, offering a practical, efficient, and accurate alternative to complex multi-stage ranking systems.

## Abstract

Text embedding models serve as a fundamental component in real-world search applications. By mapping queries and documents into a shared embedding space, they deliver competitive retrieval performance with high efficiency. However, their ranking fidelity remains limited compared to dedicated rerankers, especially recent LLM-based listwise rerankers, which capture fine-grained query-document and document-document interactions. In this paper, we propose a simple yet effective unified framework, $\text{E}^2\text{Rank}$, meaning Efficient Embedding-based Ranking (also meaning Embedding-to-Rank), which extends a single text embedding model to perform both high-quality retrieval and listwise reranking through continued training under a listwise ranking objective, thereby achieving strong effectiveness with remarkable efficiency. By applying cosine similarity between the query and document embeddings as a unified ranking function, the listwise ranking prompt, which is constructed from the original query and its candidate documents, serves as an enhanced query enriched with signals from the top-K documents, akin to pseudo-relevance feedback (PRF) in traditional retrieval models. This design preserves the efficiency and representational quality of the base embedding model while significantly improving its reranking performance. Empirically, $\textrm{E}^2\text{Rank}$ achieves state-of-the-art results on the BEIR reranking benchmark and demonstrates competitive performance on the reasoning-intensive BRIGHT benchmark, with very low reranking latency. We also show that the ranking training process improves embedding performance on the MTEB benchmark. Our findings indicate that a single embedding model can effectively unify retrieval and reranking, offering both computational efficiency and competitive ranking accuracy.

## Usage

### Embedding Model

The usage of E2Rank as an embedding model is similar to [Qwen3-Embedding](https://github.com/QwenLM/Qwen3-Embedding). The only difference is that Qwen3-Embedding automatically appends an EOS token, while E2Rank requires users to manually append the special token `<|endoftext|>` at the end of each input text.

The following code demonstrates how to use `Alibaba-NLP/E2Rank-0.6B` (or other E2Rank models) with the Hugging Face `transformers` library to obtain embeddings.

```python
# Requires transformers>=4.51.0
import torch
import torch.nn.functional as F

from torch import Tensor
from transformers import AutoTokenizer, AutoModel


def last_token_pool(last_hidden_states: Tensor, attention_mask: Tensor) -> Tensor:
    left_padding = (attention_mask[:, -1].sum() == attention_mask.shape[0])
    if left_padding:
        return last_hidden_states[:, -1]
    else:
        sequence_lengths = attention_mask.sum(dim=1) - 1
        batch_size = last_hidden_states.shape[0]
        return last_hidden_states[torch.arange(batch_size, device=last_hidden_states.device), sequence_lengths]


def get_detailed_instruct(task_description: str, query: str) -> str:
    return f'Instruct: {task_description}\nQuery:{query}'


# Each query must come with a one-sentence instruction that describes the task
task = 'Given a web search query, retrieve relevant passages that answer the query'

queries = [
    get_detailed_instruct(task, 'What is the capital of China?'),
    get_detailed_instruct(task, 'Explain gravity')
]
# No need to add instruction for retrieval documents
documents = [
    "The capital of China is Beijing.",
    "Gravity is a force that attracts two bodies towards each other. It gives weight to physical objects and is responsible for the movement of planets around the sun."
]
input_texts = queries + documents
# E2Rank requires the EOS token to be appended manually
input_texts = [t + "<|endoftext|>" for t in input_texts]

tokenizer = AutoTokenizer.from_pretrained('Alibaba-NLP/E2Rank-0.6B', padding_side='left')
model = AutoModel.from_pretrained('Alibaba-NLP/E2Rank-0.6B')

max_length = 8192

# Tokenize the input texts
batch_dict = tokenizer(
    input_texts,
    padding=True,
    truncation=True,
    max_length=max_length,
    return_tensors="pt",
)
batch_dict.to(model.device)
with torch.no_grad():
    outputs = model(**batch_dict)
    embeddings = last_token_pool(outputs.last_hidden_state, batch_dict['attention_mask'])

# Normalize embeddings, then score query-document pairs by cosine similarity
embeddings = F.normalize(embeddings, p=2, dim=1)
scores = (embeddings[:2] @ embeddings[2:].T)

print(scores.tolist())
# [[0.5950675010681152, 0.030417663976550102], [0.061970409005880356, 0.562691330909729]]
```
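
For listwise reranking, the same embedding interface is reused: the query and its top-K candidate documents are packed into a single listwise ranking prompt, which is embedded like any other input (again with `<|endoftext|>` appended manually) and scored against each candidate embedding by cosine similarity. The template below is a hypothetical sketch of that interface only; the official prompt format is defined in the [E2Rank repository](https://github.com/Alibaba-NLP/E2Rank) and should be used in practice.

```python
# Hypothetical listwise-prompt builder -- the exact template used by E2Rank
# is defined in the official repository; this only illustrates the interface.
def build_listwise_prompt(task, query, docs):
    header = f"Instruct: {task}\nQuery:{query}"
    body = "\n".join(f"[{i}] {doc}" for i, doc in enumerate(docs, 1))
    # As with plain embedding inputs, the EOS token is appended manually.
    return f"{header}\n{body}<|endoftext|>"

prompt = build_listwise_prompt(
    "Rank the passages by relevance to the query",
    "What is the capital of China?",
    ["The capital of China is Beijing.", "Gravity is a force that attracts two bodies."],
)
print(prompt.endswith("<|endoftext|>"))  # True
```

The resulting string would be embedded with the same tokenizer/model calls shown above, and candidates reranked by their similarity to this enhanced-query embedding.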

## Citation

If you find this work helpful, please cite:

```bibtex
@misc{liu2025e2rank,
      title={E2Rank: Your Text Embedding can Also be an Effective and Efficient Listwise Reranker},
      author={Qi Liu and Yanzhao Zhang and Mingxin Li and Dingkun Long and Pengjun Xie and Jiaxin Mao},
      year={2025},
      eprint={2510.22733},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2510.22733},
}
```