Cross-Encoder for Text Ranking

This model is a port of the webis/monoelectra-base model from lightning-ir to Sentence Transformers and Transformers.

The original model was introduced in the paper A Systematic Investigation of Distilling Large Language Models into Cross-Encoders for Passage Re-ranking. See https://github.com/webis-de/rank-distillm for code used to train the original model.

The model can be used as a reranker in a 2-stage "retrieve-rerank" pipeline, where it reorders passages returned by a retriever model (e.g. an embedding model or BM25) given some query. See SBERT.net Retrieve & Re-rank for more details.
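The rerank step described above can be sketched without downloading anything. In this minimal sketch, `rerank` and `toy_overlap_scorer` are hypothetical helpers written for illustration, not part of any library. The toy scorer only counts shared words and stands in for the cross-encoder so the example runs offline; in practice you would pass the `predict` method of a loaded `CrossEncoder` instead.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def rerank(
    query: str,
    passages: Sequence[str],
    score_pairs: Callable[[List[Tuple[str, str]]], Sequence[float]],
    top_k: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Score every (query, passage) pair and return passages sorted best-first."""
    pairs = [(query, passage) for passage in passages]
    scores = [float(s) for s in score_pairs(pairs)]
    ranked = sorted(zip(passages, scores), key=lambda item: item[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]


def toy_overlap_scorer(pairs: List[Tuple[str, str]]) -> List[float]:
    """Stand-in scorer (NOT the model): number of lowercase words shared by both texts."""
    return [
        float(len(set(q.lower().split()) & set(p.lower().split())))
        for q, p in pairs
    ]


query = "Which planet is known as the Red Planet?"
# Pretend these passages came back from a first-stage retriever such as BM25.
passages = [
    "Venus is often called Earth's twin because of its similar size and proximity.",
    "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
    "Jupiter, the largest planet in our solar system, has a prominent red spot.",
    "Saturn, famous for its rings, is sometimes mistaken for the Red Planet.",
]

ranked = rerank(query, passages, toy_overlap_scorer)
for passage, score in ranked:
    print(f"{score:.1f}  {passage}")

# With the real model (assumption: loaded as in the usage section of this card):
# ranked = rerank(query, passages, model.predict)
```

Keeping the scorer as a plain callable means the same function works with any model that maps a list of text pairs to one score per pair. Recent versions of Sentence Transformers also appear to provide a `CrossEncoder.rank` method for this purpose; check the documentation of your installed version before relying on it.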

Usage with Sentence Transformers

Usage is straightforward once Sentence Transformers is installed:

pip install sentence-transformers

Then you can use the pre-trained model like this:

from sentence_transformers import CrossEncoder

model = CrossEncoder("cross-encoder/monoelectra-base", trust_remote_code=True)
scores = model.predict([
    ("How many people live in Berlin?", "Berlin had a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers."),
    ("How many people live in Berlin?", "Berlin is well known for its museums."),
])
print(scores)
# [ 8.122868 -4.292924]

Usage with Transformers

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model = AutoModelForSequenceClassification.from_pretrained("cross-encoder/monoelectra-base", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("cross-encoder/monoelectra-base")

features = tokenizer(
    [
        ("How many people live in Berlin?", "Berlin had a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers."),
        ("How many people live in Berlin?", "Berlin is well known for its museums."),
    ],
    padding=True,
    truncation=True,
    return_tensors="pt",
)

model.eval()
with torch.no_grad():
    scores = model(**features).logits.view(-1)
print(scores)
# tensor([ 8.1229, -4.2929])
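
The scores printed above are raw logits, so only their relative order matters for ranking. If a value between 0 and 1 is more convenient downstream, one option is to apply a logistic sigmoid, as in the sketch below. This is an assumption for illustration, not something this model card prescribes, and the result should not be read as a calibrated probability. The `sigmoid` helper is hypothetical, and the input values are the two scores shown above.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Numerically stable logistic function mapping a logit to the range (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


# Raw scores copied from the example output above.
raw_scores: List[float] = [8.1229, -4.2929]

# Squash each logit; the mapping is monotonic, so the ranking is unchanged.
bounded = [sigmoid(s) for s in raw_scores]

# Indices of the passages, most relevant first.
order = sorted(range(len(raw_scores)), key=lambda i: raw_scores[i], reverse=True)

print(bounded)
print(order)
```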