# mdts-circuit-full-bm25

Circuit-Full fine-tuned cross-encoder trained with BM25 hard negatives: only the attention and MLP modules in the IE-ranked layers 8-11 are updated (4.73M trainable parameters). This is Strategy D from the MDTS circuit fine-tuning experiments on SciFact, and it has the best parameter efficiency of the five strategies (0.0057 NDCG@10 gained per million trainable parameters).
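As a rough illustration of what "Circuit-Full" means in practice, the sketch below freezes the base cross-encoder and unfreezes only the attention and MLP parameters in layers 8-11, assuming standard BERT-style parameter names. It is not the exact training script: the IE ranking selects a subset of modules totalling 4.73M trainable parameters (unfreezing all four layers outright gives ~7.1M, Strategy B's count), and that subset is not reproduced here.

```python
import re
from transformers import AutoModelForSequenceClassification

# Hypothetical sketch of the Strategy D setup: freeze everything, then
# unfreeze attention + MLP parameters in the IE-ranked layers 8-11.
# Assumes BERT-style names (bert.encoder.layer.{i}.attention /
# .intermediate / .output); the exact IE-selected subset that yields
# 4.73M trainable params may differ from this blanket selection.
model = AutoModelForSequenceClassification.from_pretrained(
    "cross-encoder/ms-marco-MiniLM-L-12-v2"
)

circuit = re.compile(
    r"bert\.encoder\.layer\.(?:8|9|10|11)\.(?:attention|intermediate|output)\."
)
for name, param in model.named_parameters():
    param.requires_grad = bool(circuit.search(name))

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Trainable: {trainable / 1e6:.2f}M parameters")
```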
## Base Model

`cross-encoder/ms-marco-MiniLM-L-12-v2` (itself a fine-tune of `microsoft/MiniLM-L12-H384-uncased`).
## Training Data

SciFact (from the BEIR benchmark) with BM25-mined hard negatives.
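Hard negatives are BM25-mined: for each training query, top-ranked passages that are not gold-labeled serve as negatives. Below is a hypothetical sketch using the `rank_bm25` package; the actual MDTS mining pipeline over BEIR is not specified here, and the corpus and IDs are placeholders.

```python
from rank_bm25 import BM25Okapi

# Toy corpus standing in for the SciFact passage collection.
corpus = {
    "d1": "Wheat requires nitrogen-rich fertilizer during early growth stages.",
    "d2": "Barley tolerates saline soils better than most cereals.",
    "d3": "Nitrogen fixation in legumes reduces fertilizer demand.",
}
doc_ids = list(corpus)
bm25 = BM25Okapi([corpus[d].lower().split() for d in doc_ids])

query = "What fertilizer is best for wheat?"
positives = {"d1"}  # gold passages for this query

# Rank the corpus by BM25 score; high-scoring non-gold passages
# become hard negatives for fine-tuning.
scores = bm25.get_scores(query.lower().split())
ranked = sorted(zip(doc_ids, scores), key=lambda x: -x[1])
hard_negatives = [d for d, _ in ranked if d not in positives][:1]
print(hard_negatives)
```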
## Results on SciFact (NDCG@10)

| Strategy | Trainable params | NDCG@10 | Δ NDCG@10 |
|---|---|---|---|
| A: Circuit MLP-only | 2.36M | 0.6545 | +0.0110 |
| B: Last-4 Layers | 7.10M | 0.6686 | +0.0251 |
| C: Full Fine-Tuning | 33.36M | 0.6879 | +0.0444 |
| D: Circuit-Full (BM25) | 4.73M | 0.6707 | +0.0272 |
| E: Circuit-Full (Mixed) | 4.73M | 0.6622 | +0.0187 |
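All deltas are relative to the unmodified base cross-encoder, which scores NDCG@10 = 0.6435 on SciFact (each row's NDCG@10 minus its delta). The efficiency score cited in the summary is the delta divided by trainable parameters in millions; a quick check against the table:

```python
# Recompute NDCG@10 gain per million trainable parameters from the table.
strategies = {
    "A: Circuit MLP-only":     (2.36,  0.0110),
    "B: Last-4 Layers":        (7.10,  0.0251),
    "C: Full Fine-Tuning":     (33.36, 0.0444),
    "D: Circuit-Full (BM25)":  (4.73,  0.0272),
    "E: Circuit-Full (Mixed)": (4.73,  0.0187),
}
for name, (params_m, delta) in strategies.items():
    print(f"{name}: {delta / params_m:.5f} NDCG per M params")
# Strategy D is highest: 0.0272 / 4.73 = 0.00575, quoted as 0.0057 above.
```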
## Usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("JAYADIR/mdts-circuit-full-bm25")
model = AutoModelForSequenceClassification.from_pretrained("JAYADIR/mdts-circuit-full-bm25")

query = "What fertilizer is best for wheat?"
passage = "Wheat requires nitrogen-rich fertilizer during early growth stages."

# Cross-encoders score the (query, passage) pair jointly in one forward pass.
inputs = tokenizer(query, passage, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    score = model(**inputs).logits.squeeze().item()
print(f"Relevance score: {score:.4f}")
```