Text Classification
Transformers
English
multi-label-classification
dialogue
conversational-ai
gricean-maxims
cooperative-communication
deberta
nlp
pragmatics
Eval Results (legacy)
How to use Pushkar27/GriceBench-Detector with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="Pushkar27/GriceBench-Detector")

# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("Pushkar27/GriceBench-Detector", dtype="auto")
```
Upload README.md with huggingface_hub
README.md
ADDED
@@ -0,0 +1,34 @@
# Model Card: GriceBench Violation Detector

## Model Details
- **Architecture**: DeBERTa-v3-base with 4 binary classification heads.
- **Parameters**: 184M
- **Task**: Multi-label classification of Gricean Maxim violations (Quantity, Quality, Relation, Manner).
- **Language**: English
- **Release Date**: March 2026
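Because each maxim has its own binary head, a single turn can be flagged for several maxims at once. A minimal sketch of that decoding step follows, assuming the heads emit one logit per maxim in the order listed above and that a 0.5 threshold is applied. The order, the threshold, and the helper name `decode_violations` are illustrative assumptions; the model card does not specify them.

```python
import math

# Order assumed to match the task description; not confirmed by the model card.
MAXIMS = ["Quantity", "Quality", "Relation", "Manner"]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function mapping a logit to (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_violations(logits, threshold=0.5):
    """Return the maxims whose independent sigmoid probability exceeds the threshold.

    `threshold=0.5` is an illustrative default, not a documented setting.
    """
    if len(logits) != len(MAXIMS):
        raise ValueError(f"expected {len(MAXIMS)} logits, got {len(logits)}")
    return [m for m, z in zip(MAXIMS, logits) if sigmoid(z) > threshold]


# Hypothetical logits for one dialogue turn.
print(decode_violations([2.1, -3.0, 0.4, -0.2]))
```

Each head is thresholded independently, so the result can be empty (no violation) or contain any combination of the four maxims.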

## Performance

Evaluated on 1,000 held-out Topical-Chat dialogue turns.

| Maxim | F1 Score | Precision | Recall | AUC |
|-------|----------|-----------|--------|-----|
| Quantity | 1.000 | 1.000 | 1.000 | 1.000 |
| Quality | 0.928 | 0.866 | 1.000 | 0.999 |
| Relation | 1.000 | 1.000 | 1.000 | 1.000 |
| Manner | 0.891 | 0.864 | 0.919 | 0.979 |
| **Macro Avg** | **0.955** | -- | -- | -- |
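The F1 column is the harmonic mean of precision and recall, F1 = 2PR / (P + R), and the macro average is the unweighted mean of the four per-maxim F1 scores. The short check below recomputes both from the precision and recall values in the table; the helper and variable names are illustrative.

```python
# Precision and recall per maxim, copied from the table above.
reported = {
    "Quantity": (1.000, 1.000),
    "Quality": (0.866, 1.000),
    "Relation": (1.000, 1.000),
    "Manner": (0.864, 0.919),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


f1_by_maxim = {m: f1_score(p, r) for m, (p, r) in reported.items()}
macro_f1 = sum(f1_by_maxim.values()) / len(f1_by_maxim)

for maxim, f1 in f1_by_maxim.items():
    print(f"{maxim}: {f1:.3f}")
print(f"Macro Avg: {macro_f1:.3f}")
```

Rounded to three decimals, the recomputed values agree with the reported F1 scores and the 0.955 macro average.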

## Intended Use
- **Primary Use**: Detecting cooperative failures in AI dialogue systems.
- **Out-of-Scope**: Detection of hate speech, toxic content, or PII.

## Training Data
- **Source**: Topical-Chat dataset (50,000+ turns).
- **Labeling**: Two-stage pipeline (Weak Supervision -> Gold Fine-tuning).

## Calibration
The model uses temperature scaling for probability calibration.
- **Quantity Temp**: 0.90
- **Quality Temp**: 0.55
- **Relation Temp**: 0.75
- **Manner Temp**: 0.45
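In its standard form, temperature scaling divides a head's logit by its temperature before the sigmoid, so a temperature below 1 moves the probability further from 0.5. A minimal sketch, assuming that standard form p = sigmoid(z / T) is applied separately to each head; the model card does not state the exact formula, and `calibrated_probability` is a hypothetical helper.

```python
import math

# Per-maxim temperatures as listed above.
TEMPERATURES = {
    "Quantity": 0.90,
    "Quality": 0.55,
    "Relation": 0.75,
    "Manner": 0.45,
}


def sigmoid(x: float) -> float:
    """Numerically stable logistic function mapping a logit to (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def calibrated_probability(logit: float, maxim: str) -> float:
    """Apply temperature scaling to one head: sigmoid(logit / T)."""
    return sigmoid(logit / TEMPERATURES[maxim])


# The same hypothetical raw logit yields a different probability per head.
raw_logit = 1.0
for maxim in TEMPERATURES:
    print(f"{maxim}: {calibrated_probability(raw_logit, maxim):.3f}")
```

Under this assumption a logit of 0 stays at probability 0.5 for every head, and the head with the smallest temperature (Manner) is sharpened the most.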