thejoephase/crosstalk

Text Classification


Joe Cooper committed on Oct 13, 2025

Commit 83535a6 · 1 Parent(s): 5da31cb

updated model card

Files changed (1)

README.md +39 -0

@@ -3,6 +3,11 @@ language: en
 tags:
   - deberta
   - deberta-v3
+  - text-classification
+  - conversational
+  - dialogue
+  - reranking
+pipeline_tag: text-classification
 license: mit
 ---
@@ -52,6 +57,40 @@ concatenated together using a `|` pipe (token id `1540`), like so:
 Output is a single logit.
+Line speaker is _not_ labeled, and there is no "assistant" / "user" distinction.
+The text under consideration is the outgoing text, and the last line of context
+is the proximate stimulus.
+## Usage
+Model is loaded exactly like DeBERTa-v3, but with model id `thejoephase/crosstalk`.
+```python
+from transformers import AutoTokenizer, DebertaV2ForSequenceClassification
+import torch
+tokenizer = AutoTokenizer.from_pretrained("thejoephase/crosstalk", use_fast=False)
+model = DebertaV2ForSequenceClassification.from_pretrained(
+    "thejoephase/crosstalk",
+    num_labels=1)
+history = [
+    "I'm out of coffee.",
+    "What will you do about it?",
+    "I guess I'll buy more."
+]
+context = '|'.join(history)
+candidate = "Is it expensive?"
+inputs = tokenizer(context, candidate, return_tensors="pt")
+with torch.no_grad():
+    logit = model(**inputs).logits.item()
+    score = torch.sigmoid(torch.tensor(logit)).item()
+print(f"Score: {score:.4f}")  # Higher = better fit
+```
 ## Training
 Model was trained for 40 hours on a single Nvidia 3090, on 130m tokens of content from Open Subtitles with some pruning and processing of the data. The model scores 92% on the test set, derived from content unrelated to the training set.
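
The usage example added in this commit scores a single candidate, while the new `reranking` tag suggests comparing several candidates against the same context. The following is a minimal sketch of that idea, not the author's method. The names `build_context`, `sigmoid`, `rank_candidates`, and `logit_fn` are hypothetical. `logit_fn` stands in for any callable that returns the model's single logit for a `(context, candidate)` pair, for example a thin wrapper around the tokenizer and model calls shown in the README. The toy function at the bottom is only a placeholder so the sketch runs without downloading the model.

```python
import math
from typing import Callable, List, Tuple


def build_context(history: List[str], sep: str = "|") -> str:
    """Join prior dialogue lines with the pipe separator described in the README."""
    return sep.join(history)


def sigmoid(x: float) -> float:
    """Numerically stable logistic function mapping a logit to (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def rank_candidates(
    history: List[str],
    candidates: List[str],
    logit_fn: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Score every candidate against the same context, best fit first.

    logit_fn is a hypothetical stand-in for the real model call.
    """
    context = build_context(history)
    scored = [(c, sigmoid(logit_fn(context, c))) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Placeholder scorer (NOT the model): favors candidates that share
    # words with the last line of context.
    def toy_logit(context: str, candidate: str) -> float:
        last_line = context.split("|")[-1].lower().split()
        return float(sum(w in last_line for w in candidate.lower().split()))

    history = ["I'm out of coffee.", "What will you do about it?", "I guess I'll buy more."]
    for text, score in rank_candidates(history, ["Is it expensive?", "Buy more of what?"], toy_logit):
        print(f"{score:.4f}  {text}")
```

To use the real model, `logit_fn` could wrap `model(**tokenizer(context, candidate, return_tensors="pt")).logits.item()` inside `torch.no_grad()`, as in the README example.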