Joe Cooper committed
Commit 83535a6 · 1 Parent(s): 5da31cb

updated model card

Files changed (1): README.md +39 -0
README.md CHANGED
@@ -3,6 +3,11 @@ language: en
 tags:
 - deberta
 - deberta-v3
+- text-classification
+- conversational
+- dialogue
+- reranking
+pipeline_tag: text-classification
 license: mit
 ---
 
@@ -52,6 +57,40 @@ concatenated together using a `|` pipe (token id `1540`), like so:
 
 Output is a single logit.
 
+Line speaker is _not_ labeled, and there is no "assistant" / "user" distinction.
+The text under consideration is the outgoing text, and the last line of context
+is the proximate stimulus.
+
+## Usage
+
+Model is loaded exactly like DeBERTa-v3, but with model id `thejoephase/crosstalk`.
+
+```python
+from transformers import AutoTokenizer, DebertaV2ForSequenceClassification
+import torch
+
+tokenizer = AutoTokenizer.from_pretrained("thejoephase/crosstalk", use_fast=False)
+model = DebertaV2ForSequenceClassification.from_pretrained(
+    "thejoephase/crosstalk",
+    num_labels=1)
+
+history = [
+    "I'm out of coffee.",
+    "What will you do about it?",
+    "I guess I'll buy more."
+]
+context = '|'.join(history)
+candidate = "Is it expensive?"
+
+inputs = tokenizer(context, candidate, return_tensors="pt")
+
+with torch.no_grad():
+    logit = model(**inputs).logits.item()
+    score = torch.sigmoid(torch.tensor(logit)).item()
+
+print(f"Score: {score:.4f}")  # Higher = better fit
+```
+
 ## Training
 
 Model was trained for 40 hours on a single Nvidia 3090, on 130M tokens of content from Open Subtitles with some pruning and processing of the data. The model scores 92% on the test set, derived from content unrelated to the training set.
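The new tags advertise the model for reranking, and the single-candidate snippet in the diff extends naturally to scoring several candidate replies and sorting them. Below is a minimal sketch of that loop; note that `score` here is a hypothetical stand-in stubbed with a toy word-overlap heuristic so the sketch runs without the model weights — it is not the model's actual behavior.

```python
# Reranking sketch. NOTE: score() is a hypothetical stand-in for the real
# forward pass (tokenize the (context, candidate) pair, run the model,
# sigmoid the logit); it is stubbed with a toy word-overlap heuristic here.
import math

def score(context: str, candidate: str) -> float:
    """Toy stand-in scorer: overlap between the candidate and the last
    context line (the 'proximate stimulus'), squashed through a sigmoid."""
    last_line = context.split("|")[-1].lower()
    overlap = sum(word in last_line for word in candidate.lower().split())
    return 1.0 / (1.0 + math.exp(-(overlap - 1)))

def rerank(history: list[str], candidates: list[str]) -> list[tuple[float, str]]:
    """Join history with the `|` pipe (as the card specifies), score each
    candidate reply, and return (score, reply) pairs sorted best-first."""
    context = "|".join(history)
    return sorted(((score(context, c), c) for c in candidates),
                  key=lambda pair: pair[0], reverse=True)

history = ["I'm out of coffee.", "What will you do about it?", "I guess I'll buy more."]
candidates = ["Is it expensive?", "I like trains.", "I'll buy more too."]
for s, reply in rerank(history, candidates):
    print(f"{s:.4f}  {reply}")
```

In real use, `score` would wrap the tokenizer/model call from the usage snippet, ideally batching all candidates into one forward pass.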