newlantern-deberta-sk

DeBERTa-v3-large fine-tuned for radiology prior-study relevance classification. Given the descriptions of a current and a prior radiology study, it predicts whether the prior is relevant to the current read.

Built for the New Lantern challenge: relevant-priors-v1.

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("sh3hryarkhan/newlantern-deberta-sk")
model = AutoModelForSequenceClassification.from_pretrained("sh3hryarkhan/newlantern-deberta-sk")
model.eval()

# text_a carries the current study plus the time delta; text_b carries the prior.
text_a = "delta: 365d (<= 1y) | cur: CT HEAD WITHOUT CONTRAST | norm: CT/HEAD/BILATERAL/WITHOUT"
text_b = "prior: MRI BRAIN ROUTINE | norm: MRI/HEAD/BILATERAL/WITHOUT"

# Encode as a sentence pair and take the softmax probability of class 1 (relevant).
enc = tokenizer(text_a, text_b, truncation=True, max_length=128, return_tensors="pt")
with torch.no_grad():
    prob = torch.softmax(model(**enc).logits, dim=-1)[0, 1].item()

print(f"P(relevant) = {prob:.3f}")

Label 1 = relevant, 0 = not relevant. Decision threshold: 0.5.
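
Continuing the snippet above, the hard label follows directly from that threshold:

# Apply the 0.5 decision threshold: 1 = relevant, 0 = not relevant.
label = int(prob >= 0.5)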

Input Format

text_a: delta: {days}d ({bucket}) | cur: {current_description} | norm: {mod}/{region}/{lat}/{con}
text_b: prior: {prior_description} | norm: {mod}/{region}/{lat}/{con}

The norm field is a 4-tuple (modality, region, laterality, contrast) parsed from the raw description string. The delta bucket maps days to one of: same day, <= 1m, <= 3m, <= 6m, <= 1y, <= 2y, <= 5y, > 5y.
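
A minimal sketch of assembling these inputs. delta_bucket and build_pair are hypothetical helpers, not part of this repo; the exact day cutoff for each bucket is an assumption (only 365d falling in "<= 1y" is confirmed by the Usage example), and the norm strings are assumed to come from an upstream parser that this card does not include.

def delta_bucket(days: int) -> str:
    # Assumed cutoffs for the buckets listed above.
    if days == 0:
        return "same day"
    for limit, name in [(31, "<= 1m"), (92, "<= 3m"), (183, "<= 6m"),
                        (365, "<= 1y"), (730, "<= 2y"), (1826, "<= 5y")]:
        if days <= limit:
            return name
    return "> 5y"

def build_pair(days, cur_desc, norm_cur, prior_desc, norm_prior):
    # norm_cur / norm_prior: precomputed "{mod}/{region}/{lat}/{con}" strings.
    text_a = f"delta: {days}d ({delta_bucket(days)}) | cur: {cur_desc} | norm: {norm_cur}"
    text_b = f"prior: {prior_desc} | norm: {norm_prior}"
    return text_a, text_b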

Training

  • Base model: microsoft/deberta-v3-large
  • Task: binary sequence classification (relevant / not relevant)
  • Data: public split of the New Lantern relevant-priors-v1 challenge (~13k labeled pairs)
  • Epochs: 4 (early stopped; epoch 5 overfit)
  • Phase 1 (epochs 1-2): standard cross-entropy training; AdamW lr=2e-5, cosine schedule, fp16
  • Phase 2 (epochs 3-4): hard-negative mining via WeightedRandomSampler, 3x weight on misclassified samples (see the sketch after this list)
  • Hardware: A100
  • Best val accuracy: 96.44%
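
A minimal sketch of the Phase 2 sampling scheme, assuming the Phase 1 checkpoint has been evaluated on the training set to produce a misclassification mask. The dataset, mask, and batch size below are illustrative stand-ins; only the 3x upweighting and the use of WeightedRandomSampler come from the list above.

import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

# Stand-ins: the real inputs are the tokenized training pairs and a boolean
# mask marking which of them the Phase 1 checkpoint got wrong.
train_dataset = TensorDataset(torch.randn(1000, 128), torch.randint(0, 2, (1000,)))
misclassified = torch.rand(1000) < 0.1

weights = torch.ones(len(train_dataset))
weights[misclassified] = 3.0  # 3x weight on Phase 1 mistakes

sampler = WeightedRandomSampler(weights, num_samples=len(weights), replacement=True)
loader = DataLoader(train_dataset, batch_size=16, sampler=sampler)

Sampling with replacement means hard examples show up roughly three times as often per epoch while easy ones are occasionally skipped, making this a resampling form of hard-negative mining rather than a loss reweighting.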

System Context

This model is Tier 2 in a three-tier cascade. Tier 1 is a lookup table over seen (cur, prior) description pairs with Laplace smoothing; the encoder only fires when the lookup abstains (novel pairs). In production on the full public split, ~5-10% of pairs reach the encoder. See the full system at sh3hryarkhan/newlantern-deberta-sk.
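
A minimal sketch of the routing between Tiers 1 and 2, under stated assumptions: the counts structure, the abstention rule (min_count), and the pseudocount alpha are illustrative, and Tier 3 is omitted; only the overall shape (Laplace-smoothed lookup first, encoder on novel pairs) comes from the paragraph above.

def predict_relevance(cur_desc, prior_desc, counts, encoder_fn,
                      alpha=1.0, min_count=1):
    # counts: {(cur_desc, prior_desc): (n_relevant, n_total)} over seen pairs.
    n_rel, n_total = counts.get((cur_desc, prior_desc), (0, 0))
    if n_total >= min_count:
        # Tier 1: Laplace-smoothed relevance rate for a seen pair.
        return (n_rel + alpha) / (n_total + 2 * alpha)
    # Tier 1 abstains on novel pairs; Tier 2 scores them with the encoder,
    # e.g. the Usage snippet above wrapped in a function.
    return encoder_fn(cur_desc, prior_desc)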
