# Euphemism Detector (V1 - English)
An updated multilingual version is available: hasancanbiyik/euphemism-detector-multilingual, fine-tuned on 7 languages (EN/TR/ZH/ES/YO/PL/UK) with 0.808 macro-F1 and zero-shot transfer to 22 additional languages.
Fine-tuned XLM-RoBERTa-base for euphemism disambiguation on English PETs (Potentially Euphemistic Terms). Given a sentence with a marked phrase, the model predicts whether the phrase is used euphemistically or literally.
This model was fine-tuned with the English PETs dataset created by the NLP Lab at Montclair State University, U.S.A.
## Performance (English)
| Class | Precision | Recall | F1 |
|---|---|---|---|
| Literal | 0.81 | 0.83 | 0.82 |
| Euphemistic | 0.88 | 0.86 | 0.87 |
| Macro avg | 0.84 | 0.84 | 0.84 |
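The macro row is the unweighted mean of the two class rows; a quick arithmetic check in plain Python (the reported 0.84 is averaged from unrounded per-class scores, so the mean of the rounded table values lands slightly higher):

```python
# Macro average = unweighted mean of the per-class F1 scores in the table above.
literal_f1, euphemistic_f1 = 0.82, 0.87
macro_f1 = (literal_f1 + euphemistic_f1) / 2
print(macro_f1)  # ~0.845, reported as 0.84 from unrounded scores
```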
## Usage

The model expects input text with `[PET_BOUNDARY]` tokens marking the target phrase:
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import torch.nn.functional as F

tokenizer = AutoTokenizer.from_pretrained("hasancanbiyik/euphemism-detector")
model = AutoModelForSequenceClassification.from_pretrained("hasancanbiyik/euphemism-detector")
model.eval()

# Wrap the target phrase in [PET_BOUNDARY] tokens
text = "My grandmother [PET_BOUNDARY]passed away[PET_BOUNDARY] last Tuesday."
inputs = tokenizer(text, return_tensors="pt", max_length=256, truncation=True)

with torch.no_grad():
    probs = F.softmax(model(**inputs).logits, dim=1).squeeze()

# Index 0 = literal, index 1 = euphemistic
print(f"Euphemistic: {probs[1].item():.1%}")
print(f"Literal: {probs[0].item():.1%}")
```
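The two probabilities can be reduced to a single label by taking the larger one. A minimal helper, sketched on plain floats so it runs without the model loaded (the function name and the threshold-free argmax decision are illustrative assumptions, not part of the model card):

```python
def decode_prediction(probs):
    """Map [p_literal, p_euphemistic] to a (label, confidence) pair.

    `probs` is any two-element sequence of probabilities, e.g. the
    squeezed softmax output from the snippet above (index 0 = literal,
    index 1 = euphemistic). Ties go to "euphemistic" by convention here.
    """
    label = "euphemistic" if probs[1] >= probs[0] else "literal"
    return label, max(probs)

# Works on plain floats, no model required:
print(decode_prediction([0.12, 0.88]))  # ('euphemistic', 0.88)
print(decode_prediction([0.70, 0.30]))  # ('literal', 0.7)
```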
## Updated Version
For multilingual support (7 training languages + zero-shot transfer to 22 additional languages), batch prediction, and improved performance, see the V2 model:
hasancanbiyik/euphemism-detector-multilingual
## Research Context
- Biyik, H. C., Lee, P., & Feldman, A. (2024). Turkish Delights: A Dataset on Turkish Euphemisms. SIGTURK at ACL 2024. arXiv:2407.13040
- Biyik, H. C., Barak, L., Peng, J., & Feldman, A. (2026). When Semantic Overlap Is Not Enough: Cross-Lingual Euphemism Transfer Between Turkish and English. SIGTURK at EACL 2026. arXiv:2602.16957
## License
MIT