newlantern-deberta-sk
DeBERTa-v3-large fine-tuned for radiology prior study relevance classification. Given a current and a prior radiology study description, predicts whether the prior is relevant to the current read.
Built for the New Lantern challenge: relevant-priors-v1.
Usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("sh3hryarkhan/newlantern-deberta-sk")
model = AutoModelForSequenceClassification.from_pretrained("sh3hryarkhan/newlantern-deberta-sk")
model.eval()

text_a = "delta: 365d (<= 1y) | cur: CT HEAD WITHOUT CONTRAST | norm: CT/HEAD/BILATERAL/WITHOUT"
text_b = "prior: MRI BRAIN ROUTINE | norm: MRI/HEAD/BILATERAL/WITHOUT"

enc = tokenizer(text_a, text_b, truncation=True, max_length=128, return_tensors="pt")
with torch.no_grad():
    prob = torch.softmax(model(**enc).logits, dim=-1)[0, 1].item()

print(f"P(relevant) = {prob:.3f}")
```
Label 1 = relevant, 0 = not relevant. Decision threshold: 0.5.
Input Format
text_a: delta: {days}d ({bucket}) | cur: {current_description} | norm: {mod}/{region}/{lat}/{con}
text_b: prior: {prior_description} | norm: {mod}/{region}/{lat}/{con}
The norm field is a 4-tuple (modality, region, laterality, contrast) parsed from the raw description string. The delta bucket maps days to one of: same day, <= 1m, <= 3m, <= 6m, <= 1y, <= 2y, <= 5y, > 5y.
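The templates above can be sketched as plain string formatting. Note that the exact day boundaries for each delta bucket are not stated on this card, so the cutoffs below (30/90/180/365/730/1825 days, inclusive) are assumptions; only the bucket names and the `365d -> <= 1y` example are taken from the card.

```python
def delta_bucket(days: int) -> str:
    """Map a day delta to its bucket name. Boundary values are assumptions."""
    if days == 0:
        return "same day"
    for limit, name in [(30, "<= 1m"), (90, "<= 3m"), (180, "<= 6m"),
                        (365, "<= 1y"), (730, "<= 2y"), (1825, "<= 5y")]:
        if days <= limit:
            return name
    return "> 5y"

def format_text_a(days: int, cur_desc: str, norm: tuple) -> str:
    """Build text_a from the delta, current description, and norm 4-tuple."""
    mod, region, lat, con = norm
    return f"delta: {days}d ({delta_bucket(days)}) | cur: {cur_desc} | norm: {mod}/{region}/{lat}/{con}"

def format_text_b(prior_desc: str, norm: tuple) -> str:
    """Build text_b from the prior description and its norm 4-tuple."""
    mod, region, lat, con = norm
    return f"prior: {prior_desc} | norm: {mod}/{region}/{lat}/{con}"
```

With the example from the Usage section, `format_text_a(365, "CT HEAD WITHOUT CONTRAST", ("CT", "HEAD", "BILATERAL", "WITHOUT"))` reproduces the `text_a` string shown above.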
Training
- Base model: microsoft/deberta-v3-large
- Task: binary sequence classification (relevant / not relevant)
- Data: public split of the New Lantern relevant-priors-v1 challenge (~13k labeled pairs)
- Epochs: 4 (early stopped; epoch 5 overfit)
- Phase 1 (epochs 1-2): standard cross-entropy training; AdamW lr=2e-5, cosine schedule, fp16
- Phase 2 (epochs 3-4): hard-negative mining via WeightedRandomSampler (3x weight on misclassified samples)
- Hardware: A100
- Best val accuracy: 96.44%
System Context
This model is Tier 2 in a three-tier cascade. Tier 1 is a lookup table over seen (cur, prior) description pairs with Laplace smoothing; the encoder only fires when the lookup abstains (novel pairs). In production on the full public split, ~5-10% of pairs reach the encoder. See the full system at sh3hryarkhan/newlantern-deberta-sk.
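A minimal sketch of the Tier 1 lookup described above, assuming a standard add-alpha (Laplace) formulation over binary labels; the class name, `alpha` default, and abstain convention are illustrative, not the production implementation.

```python
from collections import Counter

class PairLookup:
    """Laplace-smoothed lookup over seen (cur, prior) description pairs.
    Abstains (returns None) on novel pairs so Tier 2, the encoder,
    can handle them."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.pos = Counter()    # times a pair was labeled relevant
        self.total = Counter()  # times a pair was seen at all

    def update(self, cur, prior, label):
        self.pos[(cur, prior)] += label
        self.total[(cur, prior)] += 1

    def predict(self, cur, prior):
        """Return smoothed P(relevant) for seen pairs, or None to abstain."""
        key = (cur, prior)
        if key not in self.total:
            return None  # novel pair: fall through to the encoder
        return (self.pos[key] + self.alpha) / (self.total[key] + 2 * self.alpha)
```

For example, a pair seen twice and labeled relevant both times yields (2 + 1) / (2 + 2) = 0.75 with `alpha=1.0`, while an unseen pair abstains.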