Part of the reproducing-cross-encoders collection: a set of cross-encoders trained from various backbones and losses for equal comparison.
This model is a cross-encoder based on google/electra-base-discriminator. It was trained on MS MARCO with the MarginMSE loss as part of a reproducibility paper on training cross-encoders, "Reproducing and Comparing Distillation Techniques for Cross-Encoders"; see the paper for more details.
This model is intended for re-ranking the top results returned by a first-stage retrieval system (such as BM25, a bi-encoder, or SPLADE).
Training can be easily reproduced using the associated repository. The exact training configuration used for this model is also detailed in config.yaml.
Quick Start:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("google/electra-base-discriminator")
model = AutoModelForSequenceClassification.from_pretrained("xpmir/cross-encoder-ELECTRA-MarginMSE")

# Encode the (query, passage) pair as a single input sequence
features = tokenizer(
    "What is experimaestro ?",
    "Experimaestro is a powerful framework for ML experiments management...",
    padding=True, truncation=True, return_tensors="pt",
)

model.eval()
with torch.no_grad():
    # A single relevance score (logit) per query-passage pair
    scores = model(**features).logits
print(scores)
```
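Since the model is meant to re-rank a candidate list rather than score a single pair, a small helper like the following can sort retrieved passages by their cross-encoder score. This is a minimal sketch, not part of the official repository; the `rerank` function name and the example passages are illustrative.

```python
import torch


def rerank(query, passages, tokenizer, model):
    """Score each (query, passage) pair and return passages sorted by descending score."""
    # Batch all pairs: the query is repeated once per candidate passage
    features = tokenizer(
        [query] * len(passages), passages,
        padding=True, truncation=True, return_tensors="pt",
    )
    model.eval()
    with torch.no_grad():
        scores = model(**features).logits.squeeze(-1)
    order = scores.argsort(descending=True)
    return [(passages[i], scores[i].item()) for i in order]


# Usage (downloads the model; requires network):
# from transformers import AutoTokenizer, AutoModelForSequenceClassification
# tokenizer = AutoTokenizer.from_pretrained("google/electra-base-discriminator")
# model = AutoModelForSequenceClassification.from_pretrained("xpmir/cross-encoder-ELECTRA-MarginMSE")
# for passage, score in rerank("What is experimaestro ?", candidate_passages, tokenizer, model):
#     print(f"{score:.3f}\t{passage}")
```

In practice `passages` would be the top results from the first-stage retriever (e.g. the top 1000 from SPLADE, as in the evaluation below), possibly scored in smaller batches to bound memory.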
The table below reports the performance of this cross-encoder when re-ranking the top 1000 documents retrieved by naver/splade-v3-distilbert.
| dataset | RR@10 | nDCG@10 |
|---|---|---|
| msmarco_dev | 40.82 | 47.45 |
| trec2019 | 94.91 | 73.72 |
| trec2020 | 94.86 | 74.64 |
| fever | 82.48 | 82.13 |
| arguana | 23.08 | 34.45 |
| climate_fever | 33.20 | 24.71 |
| dbpedia | 79.27 | 47.89 |
| fiqa | 47.44 | 39.70 |
| hotpotqa | 90.02 | 74.25 |
| nfcorpus | 57.98 | 35.19 |
| nq | 56.12 | 61.00 |
| quora | 79.66 | 81.97 |
| scidocs | 29.04 | 16.23 |
| scifact | 68.35 | 70.53 |
| touche | 60.35 | 35.11 |
| trec_covid | 90.59 | 67.41 |
| robust04 | 75.11 | 51.63 |
| lotte_writing | 72.25 | 63.10 |
| lotte_recreation | 63.83 | 57.79 |
| lotte_science | 48.95 | 40.69 |
| lotte_technology | 56.58 | 47.75 |
| lotte_lifestyle | 74.91 | 65.15 |
| Mean In Domain | 76.86 | 65.27 |
| BEIR 13 | 61.35 | 51.58 |
| LoTTE (OOD) | 65.27 | 54.35 |
Base model: google/electra-base-discriminator