Dataset: KingTechnician/yahoo-answers-osmosis-labeled
DeBERTa-v3 cross-encoder for binary response-sufficiency classification.
Given (objective, response), predicts ADDR (response addresses the objective)
or NOADDR (response does not).
First stage of a cascaded pipeline. Confident binary predictions are used directly; low-confidence cases should route to an LLM judge for fine-grained classification.
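The confidence-based routing described above can be sketched in plain Python. This is a minimal illustration, not part of the released model: the function name, the softmax-over-logits confidence measure, and the 0.9 threshold are all assumptions — in practice the threshold should be tuned on a dev set against the LLM judge's cost.

```python
import math

def route(logits, threshold=0.9):
    """Route one example: logits = [addr_logit, noaddr_logit] from the
    cross-encoder. Keep the binary prediction when the softmax confidence
    clears the (illustrative) threshold; otherwise defer to the LLM judge."""
    exps = [math.exp(x) for x in logits]
    probs = [e / sum(exps) for e in exps]
    conf = max(probs)
    pred = probs.index(conf)
    if conf >= threshold:
        return ["ADDR", "NOADDR"][pred]
    return "DEFER_TO_LLM"

print(route([4.0, -2.0]))  # large margin, ~0.998 confidence -> "ADDR"
print(route([0.2, 0.1]))   # near-uniform, ~0.52 confidence -> "DEFER_TO_LLM"
```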
Trained on Sonnet-4.6-generated labels (flat prompt, echo-stripped responses), validated against 254 human-reviewed labels for deployment-grade evaluation.
Accuracy: 0.806 | Macro F1: 0.669
| Class | Precision | Recall | F1 | Support |
|---|---|---|---|---|
| ADDR | 0.857 | 0.909 | 0.882 | 798 |
| NOADDR | 0.526 | 0.401 | 0.455 | 202 |
Accuracy: 0.995 | Macro F1: 0.994
| Class | Precision | Recall | F1 | Support |
|---|---|---|---|---|
| ADDR | 1.000 | 0.987 | 0.993 | 150 |
| NOADDR | 0.991 | 1.000 | 0.996 | 223 |
On the 254 human-reviewed labels — Accuracy: 0.776 | Macro F1: 0.680
| Class | Precision | Recall | F1 | Support |
|---|---|---|---|---|
| ADDR | 0.824 | 0.889 | 0.855 | 189 |
| NOADDR | 0.580 | 0.446 | 0.504 | 65 |
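The summary numbers follow from the per-class rows in the table above; a quick sanity check (values copied from the table, so agreement is only up to rounding):

```python
# Recompute the human-eval summary metrics from the per-class table.
f1 = {"ADDR": 0.855, "NOADDR": 0.504}
recall = {"ADDR": 0.889, "NOADDR": 0.446}
support = {"ADDR": 189, "NOADDR": 65}

# Macro F1 is the unweighted mean of per-class F1 scores.
macro_f1 = sum(f1.values()) / len(f1)
# Accuracy equals support-weighted recall (correct predictions / total).
accuracy = sum(recall[c] * support[c] for c in support) / sum(support.values())
print(round(macro_f1, 3), round(accuracy, 3))  # approx. 0.680 and 0.776
```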
Fine-tuned from MoritzLaurer/deberta-v3-base-zeroshot-v2.0 (NLI-pretrained).

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model = AutoModelForSequenceClassification.from_pretrained("KingTechnician/osmosis-crossencoder-binary")
tokenizer = AutoTokenizer.from_pretrained("KingTechnician/osmosis-crossencoder-binary")

# Encode the (objective, response) pair as a single cross-encoder input.
inputs = tokenizer(
    "What causes rain?",
    "Rain forms when water vapor condenses into droplets.",
    return_tensors="pt", truncation=True, max_length=512,
)
with torch.no_grad():
    logits = model(**inputs).logits
pred = logits.argmax(dim=-1).item()
print(["ADDR", "NOADDR"][pred])
```
Base model: microsoft/deberta-v3-base