---
license: apache-2.0
language:
- en
pipeline_tag: text-classification
---

# DeTexD-RoBERTa-base delicate text detection

This is a baseline RoBERTa-base model for the delicate text detection task.

* Paper: [DeTexD: A Benchmark Dataset for Delicate Text Detection](TODO)
* [GitHub repository](https://github.com/grammarly/detexd)

## Classification example code

Here's a short usage example with the `transformers` library for a binary classification task:

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="grammarly/detexd-roberta-base")

def predict_binary_score(text: str):
    # get multiclass probability scores
    scores = classifier(text, top_k=None)

    # convert to a single score by summing the probability scores
    # for the higher-index classes
    return sum(score['score']
               for score in scores
               if score['label'] in ('LABEL_3', 'LABEL_4', 'LABEL_5'))

def predict_delicate(text: str, threshold=0.72496545):
    return predict_binary_score(text) > threshold

print(predict_delicate("Time flies like an arrow. Fruit flies like a banana."))
```

Expected output:

```
False
```
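
The aggregation step in `predict_binary_score` can be illustrated without downloading the model. The `scores` list below mimics the shape of the `classifier(text, top_k=None)` output (one dict per class with `label` and `score` keys); the probability values themselves are made up for illustration:

```python
# Mock of the list returned by classifier(text, top_k=None).
# The probability values are hypothetical, not real model output.
scores = [
    {'label': 'LABEL_0', 'score': 0.50},
    {'label': 'LABEL_1', 'score': 0.20},
    {'label': 'LABEL_2', 'score': 0.10},
    {'label': 'LABEL_3', 'score': 0.10},
    {'label': 'LABEL_4', 'score': 0.06},
    {'label': 'LABEL_5', 'score': 0.04},
]

# Sum the probabilities of the higher-index ("delicate") classes.
binary_score = sum(s['score'] for s in scores
                   if s['label'] in ('LABEL_3', 'LABEL_4', 'LABEL_5'))

# Here the summed score (0.2) falls below the 0.72496545 threshold,
# so the text would be classified as not delicate.
print(binary_score > 0.72496545)  # False
```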

## BibTeX entry and citation info

Please cite [our paper](TODO) if you use this model.

```bibtex
TODO
```