---
language:
- en
base_model:
- google/electra-base-discriminator
pipeline_tag: text-ranking
---

monoELECTRA is a highly effective cross-encoder reranker built on `google/electra-base-discriminator` and trained on MS MARCO passage data for 300K steps with a batch size of 16.
It uses hard negatives from strong first-stage retrievers and the Localized Contrastive Estimation (LCE) loss with large group sizes (up to 31 negatives per positive).
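The LCE objective is a softmax cross-entropy computed within each group of one positive passage and its hard negatives. Here is a minimal NumPy sketch of that computation (the function name and array shapes are illustrative, not taken from the released training code):

```python
import numpy as np

def lce_loss(scores: np.ndarray) -> float:
    """Localized Contrastive Estimation over score groups.

    `scores` has shape (num_queries, group_size); column 0 holds the
    positive passage's score, and the remaining columns hold the scores
    of the hard negatives mined for that query.
    """
    # Numerically stable log-softmax over each group.
    shifted = scores - scores.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Cross-entropy against the positive at index 0, averaged over queries.
    return float(-log_probs[:, 0].mean())

# Two queries, each with one positive and three hard negatives (group size 4;
# the setup described above uses groups of up to 32: 1 positive + 31 negatives).
scores = np.array([[5.0, 1.0, 0.5, -2.0],
                   [3.0, 2.9, 0.0, -1.0]])
print(lce_loss(scores))
```

The loss goes to zero as the positive's score dominates its group, which is what pushes the reranker to separate relevant passages from near-duplicate hard negatives.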
This setup consistently outperforms standard monoBERT and hinge- or cross-entropy-loss baselines, especially in the top-$k$ candidate pool, where near-duplicate passages matter.
If you want a compact, supervised reranker tuned to squeeze every last bit of signal from hard negatives, this is a strong choice.
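A typical way to use a cross-encoder reranker like this one is through `transformers`, scoring each query–passage pair jointly and sorting by score. A sketch (the repository id is a placeholder for this model's actual checkpoint name, and a single relevance logit per pair is assumed):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "<org>/monoelectra-base"  # placeholder: substitute the real repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

query = "what is the capital of norway"
passages = [
    "Oslo is the capital and most populous city of Norway.",
    "Stavanger is a city in southwestern Norway.",
]

# Encode each (query, passage) pair jointly, as a cross-encoder requires.
inputs = tokenizer([query] * len(passages), passages,
                   padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    # Assumes the head emits one relevance logit per pair.
    scores = model(**inputs).logits.squeeze(-1)

# Rerank: highest-scoring passage first.
ranked = sorted(zip(passages, scores.tolist()), key=lambda x: -x[1])
```

In a full pipeline, `passages` would be the top-$k$ candidates returned by a first-stage retriever such as BM25, and the reranker reorders only that pool.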
If you use the monoELECTRA model, please cite the following paper:

[Squeezing Water from a Stone: A Bag of Tricks for Further Improving Cross-Encoder Effectiveness for Reranking](https://arxiv.org/abs/2312.02724)
<!-- {% raw %} -->
```
@inproceedings{squeezemonoelectra2022,
  author = {Pradeep, Ronak and Liu, Yuqi and Zhang, Xinyu and Li, Yilin and Yates, Andrew and Lin, Jimmy},
  title = {Squeezing Water from a Stone: A Bag of Tricks for Further Improving Cross-Encoder Effectiveness for Reranking},
  year = {2022},
  publisher = {Springer-Verlag},
  address = {Berlin, Heidelberg},
  booktitle = {Advances in Information Retrieval: 44th European Conference on IR Research, ECIR 2022, Stavanger, Norway, April 10–14, 2022, Proceedings, Part I},
  pages = {655–670},
  numpages = {16},
  location = {Stavanger, Norway}
}
```
<!-- {% endraw %} -->