grammarly/detexd-roberta-base

syavnyi committed on Mar 30, 2023

Commit a0e06f3 · 1 parent: d7abbfc

Model card WIP

Files changed (1)

README.md +54 -1

@@ -3,4 +3,57 @@ license: apache-2.0
 language:
 - en
 pipeline_tag: text-classification
----

+datasets:
+- grammarly/detexd-eval
+---
+# DeTexD-RoBERTa-base delicate text detection
+This is a baseline RoBERTa-base model for the delicate text detection task.
+* Paper: [DeTexD: A Benchmark Dataset for Delicate Text Detection](TODO)
+* [GitHub repository](https://github.com/grammarly/detexd)
+## Classification example code
+The following example uses PyTorch and `transformers` to score a text with the multiclass model and convert the result into a binary delicate/not-delicate prediction:
+```python
+from transformers import AutoTokenizer, AutoModelForSequenceClassification
+import torch
+tokenizer = AutoTokenizer.from_pretrained("grammarly/detexd-roberta-base")
+model = AutoModelForSequenceClassification.from_pretrained("grammarly/detexd-roberta-base")
+model.eval()
+def predict_binary_score(text: str, break_class_ix=3):
+    with torch.no_grad():
+        # get multiclass probability scores
+        logits = model(**tokenizer(text, return_tensors='pt'))[0]
+        probs = torch.nn.functional.softmax(logits, dim=-1)
+        # convert to a binary prediction by summing the probability scores
+        # for the higher-index classes, as defined by break_class_ix
+        bin_score = probs[..., break_class_ix:].sum(dim=-1)
+        return bin_score.item()
+def predict_delicate(text: str, threshold=0.72496545):
+    return predict_binary_score(text) > threshold
+print(predict_delicate("Time flies like an arrow. Fruit flies like a banana."))
+```
+Expected output:
+```
+False
+```
+## BibTeX entry and citation info
+Please cite [our paper](TODO) if you use this model.
+```bibtex
+TODO
+```
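
The binarization step in `predict_binary_score` can be checked without downloading the model. The sketch below mirrors that logic in pure Python: it applies a softmax to a vector of logits and sums the probabilities of the classes at or above `break_class_ix`. The logit values and the six-class layout are made up for illustration and are assumptions, not outputs of the actual model; `softmax`, `binary_score`, `low_risk`, and `high_risk` are hypothetical names.

```python
import math

THRESHOLD = 0.72496545  # default threshold from predict_delicate in the README


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def binary_score(logits, break_class_ix=3):
    """Sum the probabilities of classes with index >= break_class_ix."""
    return sum(softmax(logits)[break_class_ix:])


# Made-up logits over six classes (assumed class count, illustration only).
low_risk = [2.0, 1.0, 0.5, -1.0, -2.0, -3.0]   # mass on the lower-index classes
high_risk = [-3.0, -2.0, -1.0, 0.5, 1.0, 2.0]  # mass on the higher-index classes

for name, logits in [("low_risk", low_risk), ("high_risk", high_risk)]:
    score = binary_score(logits)
    print(name, round(score, 4), score > THRESHOLD)
```

With these inputs the first score is about 0.045 and the second about 0.955, so only the second exceeds the threshold. Raising `break_class_ix` narrows the set of classes counted as delicate, which lowers the binary score for the same logits.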