RoBERTa-Contextual-Sarcasm-Hybrid

Model Description

This model is a fine-tuned version of cardiffnlp/twitter-roberta-base-irony, optimized for detecting sarcasm in modern narrative dialogue. Unlike standard sentiment-based irony detectors, it uses a Relational Attention mechanism enabled by a [PREVIOUS_CONTEXT] [SEP] [DIALOGUE] input schema.
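The input schema can be sketched as a simple string join; the helper below is illustrative (the function name and example text are not from the model card), and at inference time you may instead pass the two segments as `text`/`text_pair` so the tokenizer inserts RoBERTa's native separator:

```python
def build_input(previous_context: str, dialogue: str, sep: str = " [SEP] ") -> str:
    """Join situational context and the target utterance into the
    [PREVIOUS_CONTEXT] [SEP] [DIALOGUE] layout the model expects."""
    return previous_context.strip() + sep + dialogue.strip()

text = build_input(
    "The build server has been down all morning.",
    "Wonderful. Truly my favorite way to start a Monday.",
)
```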

Training Data & Methodology

The model was trained on a balanced hybrid corpus designed to minimize "classifier paranoia" in modern conversational agents:

  • Contextual JSON (152 samples): Primary high-quality dialogue with situational context.
  • Joshi Snippets (50 samples): Targeted sarcastic signals for Class 1 (Sarcastic) expansion.
  • Gutenberg Anchoring (100 samples): Formal Victorian prose used for Class 0 (Sincere) stabilization.
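The corpus composition above can be sketched as a merge of three labeled sources. Everything below (field layout, example texts, the shuffle seed) is a hypothetical reconstruction, not the actual training script:

```python
import random

SARCASTIC, SINCERE = 1, 0

def build_corpus(contextual, joshi, gutenberg, seed=42):
    """Combine the three sources into one shuffled training set.
    `contextual` carries its own labels; Joshi snippets are all
    Class 1 (Sarcastic), Gutenberg prose is all Class 0 (Sincere)."""
    corpus = list(contextual)
    corpus += [(text, SARCASTIC) for text in joshi]
    corpus += [(text, SINCERE) for text in gutenberg]
    random.Random(seed).shuffle(corpus)
    return corpus

corpus = build_corpus(
    contextual=[("You broke prod. [SEP] Nice job, hero.", SARCASTIC)] * 152,
    joshi=["Oh, I just love waiting in line."] * 50,
    gutenberg=["It is a truth universally acknowledged..."] * 100,
)
# 152 + 50 + 100 = 302 samples total
```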

Performance & Calibration

The model achieves high statistical recall but demonstrates specific behavioral biases:

  • Modern Narrative: High calibration; successfully distinguishes between sincere frustration and ironic punchlines.
  • Literary Irony: Exhibits a "Politeness Bias" where formal syntax is strongly correlated with sincerity (Class 0), leading to potential false negatives in classical irony.
| Metric            | Score       |
|-------------------|-------------|
| Golden Set F1     | 0.8889      |
| Human Set F1      | 0.8000      |
| Optimal Threshold | 0.60 – 0.75 |
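Applying the reported operating range is a one-line comparison. The 0.65 default below is an arbitrary value inside the documented 0.60–0.75 band, not a number from the model card:

```python
def classify(sarcasm_prob: float, threshold: float = 0.65) -> int:
    """Map the model's Class-1 probability to a hard label.
    Any threshold inside the reported 0.60-0.75 band is reasonable;
    0.65 is an assumed choice for illustration."""
    return 1 if sarcasm_prob >= threshold else 0
```

Raising the threshold toward 0.75 trades recall for precision, which may help offset the "Politeness Bias" false negatives on formal text.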

Intended Use

This model is intended for use in hybrid LLM systems and conversational agents where distinguishing between sincere user complaints and situational irony is critical for deterministic routing.
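One way to wire the classifier into deterministic routing is sketched below; the handler names and the threshold are illustrative assumptions, not part of the model:

```python
def route(sarcasm_prob: float, threshold: float = 0.65) -> str:
    """Route a user turn based on the model's sarcasm probability.
    Sincere frustration escalates to support; situational irony
    stays on the conversational path. Threshold is assumed, chosen
    from the reported 0.60-0.75 band."""
    if sarcasm_prob >= threshold:
        return "smalltalk_handler"  # ironic punchline: keep it conversational
    return "support_queue"          # sincere complaint: escalate

destination = route(0.91)
```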

Model size: 0.1B params · Tensor type: F32 · Format: Safetensors

Dataset used to train sennatitcomb/sarcasm-detector-json-joshi-gutenberg-final
