# search-reranker-broad-v1
search-reranker-broad-v1 is a broad fine-tune of `cross-encoder/mmarco-mMiniLMv2-L12-H384-v1` for search reranking.

It is intended for reranking a small candidate set of query-document pairs after retrieval. The repo contains both the fine-tuned root checkpoint and a standard ONNX Runtime CPU artifact at `onnx/model.onnx`.
## Artifacts

- Root files: Sentence Transformers / Transformers checkpoint for direct use with `CrossEncoder` or `AutoModelForSequenceClassification`
- `onnx/model.onnx`: dynamic int8 ONNX Runtime export using per-channel weight quantization
- Tokenizer files at the repo root, shared by both runtimes
## Training

- Base model: `cross-encoder/mmarco-mMiniLMv2-L12-H384-v1`
- Training rows: 1,837
- Epochs: 2
- Batch size: 16
- Learning rate: 2e-5
- Max length: 128
The fine-tune used a broad weakly supervised reranking dataset built from benchmark queries and competing candidate pages over public web content.
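A weak-label row set of this kind can be assembled without manual judgments, for example by pairing each benchmark query's known answer page as the positive against competing candidate pages as negatives. This is a minimal sketch of that idea; the pairing scheme and field names are assumptions, not the internal workflow:

```python
# Hypothetical sketch of weak-label row construction: per benchmark query,
# the known answer page gets label 1.0 and competing candidate pages get
# label 0.0, yielding (query, document, label) training rows.
def build_rows(benchmark):
    rows = []
    for item in benchmark:
        rows.append((item["query"], item["answer_page"], 1.0))
        for page in item["competing_pages"]:
            rows.append((item["query"], page, 0.0))
    return rows

benchmark = [
    {
        "query": "who is the taoiseach?",
        "answer_page": "Micheál Martin is the Taoiseach.",
        "competing_pages": ["Former Taoisigh list...", "Government of Ireland"],
    },
]
print(build_rows(benchmark))
```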
## Validation

Local validation on 11 March 2026:
| Slice | Runtime | Top-1 | Hit@5 | MRR |
|---|---|---|---|---|
| Broad weak-label eval | fp32 checkpoint | 0.7882 | 1.0000 | 0.8769 |
| Broad weak-label eval | ONNX qint8 per-channel | 0.7765 | 1.0000 | 0.8704 |
| Office-holder canary | fp32 checkpoint | 0.8333 | 1.0000 | 0.9028 |
| Office-holder canary | ONNX qint8 per-channel | 0.8333 | 1.0000 | 0.9074 |
The published ONNX artifact was selected from a small sweep of dynamic-int8 quantization profiles because it retained the most fp32 quality among the tested int8 variants.
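The table's metrics can be computed as follows (a sketch, assuming each query has a ranked candidate list with exactly one relevant document; the toy ranks below are illustrative, not the actual eval data):

```python
# Metric sketch: each query contributes the 0-based rank r of its single
# relevant document in the reranked candidate list.
def top1(ranks):
    # Fraction of queries whose relevant document is ranked first.
    return sum(r == 0 for r in ranks) / len(ranks)

def hit_at_k(ranks, k):
    # Fraction of queries whose relevant document appears in the top k.
    return sum(r < k for r in ranks) / len(ranks)

def mrr(ranks):
    # Mean reciprocal rank of the relevant document.
    return sum(1.0 / (r + 1) for r in ranks) / len(ranks)

ranks = [0, 0, 1, 0, 2]  # toy example
print(top1(ranks), hit_at_k(ranks, 5), mrr(ranks))
```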
## Dataset

The training dataset is not yet published with this model, for two reasons:

- the supervision was derived from an internal benchmark workflow
- it still needs additional curation before it is suitable as a standalone public dataset release
## Usage

### Sentence Transformers

```python
from sentence_transformers import CrossEncoder

model = CrossEncoder("temsa/search-reranker-broad-v1")
scores = model.predict([
    ["who is the taoiseach?", "Micheál Martin is the Taoiseach."],
    ["who is the taoiseach?", "Former Taoisigh list..."],
])
print(scores)
```
### Transformers

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("temsa/search-reranker-broad-v1")
model = AutoModelForSequenceClassification.from_pretrained("temsa/search-reranker-broad-v1")
inputs = tokenizer("who is the taoiseach?", "Micheál Martin is the Taoiseach.", return_tensors="pt")
score = model(**inputs).logits
```
### ONNX Runtime

```python
from huggingface_hub import hf_hub_download
import onnxruntime as ort

model_path = hf_hub_download("temsa/search-reranker-broad-v1", "onnx/model.onnx")
session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
print(session.get_providers())
```

To score pairs, tokenize with the repo-root tokenizer and pass arrays matching the names reported by `session.get_inputs()` to `session.run`.
## Limitations

- The broad training signal is weakly supervised, not fully manually judged.
- This model is a candidate general reranker, not a guaranteed drop-in quality win for every domain.
- Validate against your own canaries before replacing an existing reranker globally.
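The canary check in the last bullet can be automated as a simple gate. This is a sketch under stated assumptions: scores arrive as per-query `(score, is_relevant)` lists, and the zero-regression threshold is a placeholder you would tune:

```python
# Hypothetical canary gate: only swap in the new reranker if it does not
# regress Top-1 on a small set of hand-picked canary queries.
def canary_top1(scored_candidates):
    # scored_candidates: per query, a list of (score, is_relevant) pairs.
    hits = 0
    for candidates in scored_candidates:
        best = max(candidates, key=lambda c: c[0])
        hits += best[1]
    return hits / len(scored_candidates)

def safe_to_swap(old_scores, new_scores, tolerance=0.0):
    # Require the new model to match or beat the old model's canary Top-1.
    return canary_top1(new_scores) >= canary_top1(old_scores) - tolerance

old = [[(0.9, 1), (0.2, 0)], [(0.3, 0), (0.8, 1)]]  # toy canaries
new = [[(0.7, 1), (0.1, 0)], [(0.6, 0), (0.5, 1)]]
print(safe_to_swap(old, new))
```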