---
language: en
license: apache-2.0
tags:
  - cross-encoder
  - reranking
  - information-retrieval
  - circuit-fine-tuning
  - beir
base_model: cross-encoder/ms-marco-MiniLM-L-12-v2
---

# mdts-last4-layers

A cross-encoder with only its last four layers fine-tuned: attention and MLP weights are unfrozen for layers 8-11 (7.10M trainable parameters). This is Strategy B from the MDTS circuit fine-tuning experiments on SciFact.

## Base Model

[cross-encoder/ms-marco-MiniLM-L-12-v2](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-12-v2)

## Training Data

SciFact (BEIR benchmark) with BM25 hard negatives.
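Hard negatives are passages that BM25 ranks highly for a query but that are not labeled relevant, which makes them more informative training signal than random negatives. A minimal sketch of that mining step in pure Python (the toy corpus, `bm25_scores` helper, and relevance labels below are illustrative, not the actual SciFact pipeline):

```python
import math
from collections import Counter

def bm25_scores(query_tokens, corpus_tokens, k1=1.5, b=0.75):
    """Okapi BM25 score of one query against every document in the corpus."""
    n = len(corpus_tokens)
    avgdl = sum(len(d) for d in corpus_tokens) / n
    df = Counter()                      # document frequency per term
    for doc in corpus_tokens:
        df.update(set(doc))
    scores = []
    for doc in corpus_tokens:
        tf = Counter(doc)               # term frequency within this document
        score = 0.0
        for term in query_tokens:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] * (k1 + 1) / (tf[term] + k1 * (1 - b + b * len(doc) / avgdl))
            score += idf * norm
        scores.append(score)
    return scores

# Toy corpus: doc 0 is the known-relevant passage; the rest are candidates.
corpus = [
    "nitrogen fertilizer improves wheat yield",   # relevant (in qrels)
    "wheat fertilizer trials in dry climates",    # lexically close: hard negative
    "phosphate fertilizer for maize crops",       # lexically close: hard negative
    "a history of medieval trade routes",         # easy negative, never retrieved
]
tokenized = [doc.split() for doc in corpus]
query = "best fertilizer for wheat".split()
relevant = {0}

scores = bm25_scores(query, tokenized)
ranked = sorted(range(len(corpus)), key=lambda i: -scores[i])
hard_negatives = [i for i in ranked if i not in relevant][:2]
print(hard_negatives)
```

The top-ranked non-relevant documents become the negatives paired with the query during fine-tuning.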

## Results on SciFact (NDCG@10)

| Strategy | Params | NDCG@10 | Delta |
|---|---:|---:|---:|
| A: Circuit MLP-only | 2.36M | 0.6545 | +0.0110 |
| **B: Last-4 Layers (this model)** | **7.10M** | **0.6686** | **+0.0251** |
| C: Full Fine-Tuning | 33.36M | 0.6879 | +0.0444 |
| D: Circuit-Full (BM25) | 4.73M | 0.6707 | +0.0272 |
| E: Circuit-Full (Mixed) | 4.73M | 0.6622 | +0.0187 |

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("JAYADIR/mdts-last4-layers")
model = AutoModelForSequenceClassification.from_pretrained("JAYADIR/mdts-last4-layers")
model.eval()

query = "What fertilizer is best for wheat?"
passage = "Wheat requires nitrogen-rich fertilizer during early growth stages."

# Cross-encoders score the (query, passage) pair jointly in a single forward pass.
inputs = tokenizer(query, passage, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    score = model(**inputs).logits.squeeze().item()
print(f"Relevance score: {score:.4f}")
```
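The snippet above scores a single pair; in practice a reranker scores every passage in a candidate list against the query and sorts by score. A minimal sketch of that loop (the `rerank` helper and the token-overlap `score_pairs` stand-in are illustrative; swap in a batched cross-encoder forward pass like the one above):

```python
def rerank(query, passages, score_pairs):
    """Score all (query, passage) pairs and return passages sorted best-first."""
    scores = score_pairs([(query, p) for p in passages])
    return sorted(zip(passages, scores), key=lambda x: -x[1])

# Stand-in scorer: case-insensitive token overlap.
# Replace with the cross-encoder logits from the snippet above.
def score_pairs(pairs):
    def toks(s):
        return {w.strip(".,?!").lower() for w in s.split()}
    return [len(toks(q) & toks(p)) for q, p in pairs]

query = "What fertilizer is best for wheat?"
passages = [
    "Wheat requires nitrogen-rich fertilizer during early growth stages.",
    "The history of wheat cultivation dates back millennia.",
    "Nitrogen fertilizer application rates vary by soil type.",
]
for passage, score in rerank(query, passages, score_pairs):
    print(f"{score:5.2f}  {passage}")
```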