---
language: en
license: apache-2.0
base_model: microsoft/deberta-v3-large
tags:
- logical-fallacy-detection
- deberta-v3-large
- text-classification
- argumentation
- contrastive-learning
- adversarial-training
- robust-classification
datasets:
- logic
- cocoLoFa
- Navy0067/contrastive-pairs-for-logical-fallacy
metrics:
- f1
- accuracy
model-index:
- name: fallacy-detector-binary
results:
- task:
type: text-classification
name: Logical Fallacy Detection
metrics:
- type: f1
value: 0.908
name: F1 Score
- type: accuracy
value: 0.911
name: Accuracy
---
# Logical Fallacy Detector (Binary)
A binary classifier distinguishing **valid reasoning** from **fallacious arguments**, trained with contrastive adversarial examples to handle subtle boundary cases.
**Key Innovation:** Contrastive learning with 703 adversarial argument pairs where similar wording masks critical reasoning differences.
**96% accuracy on diverse real-world test cases** | **Handles edge cases** | **91% F1**
---
## ✨ Capabilities
### Detects Common Fallacies
- ✅ **Ad Hominem** (attacking the person, not the argument)
- ✅ **Slippery Slope** (exaggerated chain reactions)
- ✅ **False Dilemma** (only two options presented)
- ✅ **Appeal to Authority** (irrelevant credentials)
- ✅ **Hasty Generalization** (insufficient evidence)
- ✅ **Post Hoc Ergo Propter Hoc** (correlation ≠ causation)
- ✅ **Circular Reasoning** (begging the question)
- ✅ **Straw Man** arguments
### Validates Logical Reasoning
- ✅ **Formal syllogisms** ("All A are B, X is A, therefore X is B")
- ✅ **Mathematical proofs** (deductive reasoning, arithmetic)
- ✅ **Scientific explanations** (gravity, photosynthesis, chemistry)
- ✅ **Legal arguments** (precedent, policy application)
- ✅ **Conditional statements** (if-then logic)
### Edge Case Handling
- ✅ **Distinguishes relevant vs. irrelevant credential attacks**
  - Valid: "Color-blind witness can't testify about color"
  - Fallacy: "Witness shoplifted as a kid, so can't testify about color"
- ✅ **True dichotomies vs. false dilemmas**
  - Valid: "The alarm is either armed or disarmed"
  - Fallacy: "Either ban all cars or accept pollution forever"
- ✅ **Valid authority citations vs. fallacious appeals**
  - Valid: "Structural engineers agree based on data"
  - Fallacy: "Pop star wore these shoes, so they're best"
- ✅ **Causal relationships vs. mere correlation**
  - Valid: "Recalibrating machines increased output"
  - Fallacy: "Playing Mozart increased output"
### Limitations
- ⚠️ **Very short statements** (<10 words) may be misclassified as fallacies
  - Example: "I like pizza" incorrectly flagged (not an argument)
- ⚠️ **Circular reasoning** occasionally missed (e.g., "healing essences promote healing")
- ⚠️ **Context-dependent arguments** may need human review
- ⚠️ **Domain-specific jargon** may affect accuracy
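Given the short-statement failure mode above, callers may want a lightweight pre-filter that routes non-arguments to human review instead of the classifier. This is a hypothetical client-side helper (the `MIN_WORDS` threshold and function name are assumptions, not part of the released model):

```python
MIN_WORDS = 10  # assumed threshold, mirroring the "<10 words" caveat above

def should_classify(text: str) -> bool:
    """Return True if the text is long enough to be treated as an argument.

    Very short inputs (e.g. "I like pizza") are not arguments, and the
    model tends to flag them as fallacies, so we skip them entirely.
    """
    return len(text.split()) >= MIN_WORDS

print(should_classify("I like pizza"))  # → False: too short, skip
print(should_classify(
    "All mammals have backbones. Whales are mammals. "
    "Therefore whales have backbones."))  # → True: classify normally
```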
---
## Model Description
Fine-tuned **DeBERTa-v3-large** for binary classification using contrastive learning.
### Training Data
**Total training examples**: 6,529
- 5,335 examples from LOGIC and CoCoLoFa datasets
- 1,194 contrastive pairs (oversampled 3x = 3,582 effective examples)
**Contrastive learning approach**: High-quality argument pairs where one is valid and one contains a fallacy. The pairs differ only in reasoning quality, teaching the model to distinguish subtle boundaries.
**Test set**: 1,130 examples (918 original + 212 contrastive pairs oversampled 2x)
---
## Performance
### Validation Metrics (1,130 examples)
| Metric | Score |
|--------|-------|
| **F1 Score** | 90.8% |
| **Accuracy** | 91.1% |
| **Precision** | 92.1% |
| **Recall** | 89.6% |
| **Specificity** | 92.5% |
**Error Analysis:**
- False Positive Rate: 7.5% (valid arguments incorrectly flagged)
- False Negative Rate: 10.4% (fallacies missed)
**Confusion Matrix:**
- True Negatives: 529 ✅ (Valid → Valid)
- False Positives: 43 ❌ (Valid → Fallacy)
- False Negatives: 58 ❌ (Fallacy → Valid)
- True Positives: 500 ✅ (Fallacy → Fallacy)
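As a sanity check, the headline metrics can be recomputed directly from the confusion matrix above with plain Python (no dependencies):

```python
# Confusion-matrix counts from the table above.
tn, fp, fn, tp = 529, 43, 58, 500

precision   = tp / (tp + fp)
recall      = tp / (tp + fn)
f1          = 2 * precision * recall / (precision + recall)
accuracy    = (tp + tn) / (tp + tn + fp + fn)
specificity = tn / (tn + fp)

print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
print(f"accuracy={accuracy:.3f} specificity={specificity:.3f}")
# precision=0.921 recall=0.896 f1=0.908
# accuracy=0.911 specificity=0.925
```

The results match the reported validation metrics to three decimal places.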
### Real-World Testing (55 diverse manual cases)
**Accuracy: ~96%** (53/55 correct)
**Perfect performance on:**
- Formal syllogisms and deductive logic
- Mathematical/arithmetic statements
- Scientific principles (conservation of mass, photosynthesis, aerodynamics)
- Legal reasoning (contract terms, building codes, citizenship)
- Policy arguments with evidence
**Correctly identifies edge cases:**
- ✅ Color-blind witness (relevant) vs. shoplifted-as-a-kid witness (irrelevant)
- ✅ Structural engineers on bridges (valid authority) vs. physicist on supplements (opinion)
- ✅ Supply-demand economics (valid principle) vs. Mozart improving machines (false cause)
- ✅ Large-sample generalization vs. anecdotal evidence
**Known errors (2/55):**
- ❌ "I like pizza" → Flagged as fallacy (not an argument)
- ❌ "Natural essences promote healing" → Classified as valid (circular reasoning)
---
## Usage
```python
from transformers import pipeline

# Load the model
classifier = pipeline(
    "text-classification",
    model="Navy0067/Fallacy-detector-binary"
)

# Example 1: Valid reasoning (formal logic)
text1 = "All mammals have backbones. Whales are mammals. Therefore whales have backbones."
result = classifier(text1)
# Output: {'label': 'LABEL_0', 'score': 1.00}  # LABEL_0 = Valid

# Example 2: Fallacy (ad hominem)
text2 = "His economic proposal is wrong because he didn't graduate from college."
result = classifier(text2)
# Output: {'label': 'LABEL_1', 'score': 1.00}  # LABEL_1 = Fallacy

# Example 3: Fallacy (slippery slope)
text3 = "If we allow one streetlamp, they'll install them every five feet and destroy our view of the stars."
result = classifier(text3)
# Output: {'label': 'LABEL_1', 'score': 1.00}

# Example 4: Valid (evidence-based)
text4 = "The data shows 95% of patients following physical therapy regained mobility, thus the regimen increases recovery chances."
result = classifier(text4)
# Output: {'label': 'LABEL_0', 'score': 1.00}

# Example 5: Edge case - relevant credential attack (Valid)
text5 = "The witness's color testimony should be questioned because he was diagnosed with total color blindness."
result = classifier(text5)
# Output: {'label': 'LABEL_0', 'score': 1.00}

# Example 6: Edge case - irrelevant credential attack (Fallacy)
text6 = "The witness's testimony should be questioned because he shoplifted a candy bar at age twelve."
result = classifier(text6)
# Output: {'label': 'LABEL_1', 'score': 1.00}
```
---
## Label Mapping
- `LABEL_0` = Valid reasoning (no fallacy detected)
- `LABEL_1` = Contains a fallacy
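Since the pipeline emits raw `LABEL_0`/`LABEL_1` strings, downstream code usually maps them to readable names. A minimal client-side sketch (the mapping dict and helper name are assumptions, not baked into the model config):

```python
# Assumed client-side mapping; the model itself only emits LABEL_0 / LABEL_1.
LABEL2NAME = {"LABEL_0": "valid", "LABEL_1": "fallacy"}

def readable(prediction: dict) -> str:
    """Convert one pipeline prediction dict into a 'name (score)' string."""
    name = LABEL2NAME[prediction["label"]]
    return f"{name} ({prediction['score']:.2f})"

# Example with a mocked pipeline output:
print(readable({"label": "LABEL_1", "score": 0.97}))  # → fallacy (0.97)
```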
### Training Details
**Base Model:** microsoft/deberta-v3-large

**Training Configuration:**
- Epochs: 6
- Batch size: 4 (effective 16 with gradient accumulation)
- Learning rate: 1e-5
- Optimizer: AdamW with weight decay 0.01
- Scheduler: Cosine with 10% warmup
- Max sequence length: 256 tokens
- FP16 training enabled
- Hardware: Kaggle P100 GPU (~82 minutes training time)

**Data Strategy:**
- Original LOGIC/CoCoLoFa data (81.7% of the training set)
- Contrastive pairs oversampled 3x (emphasizes boundary learning)
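The 3x oversampling of contrastive pairs amounts to replicating them in the training list before shuffling. This sketch is illustrative only, using the counts reported in this card (the placeholder texts and shuffle seed are assumptions):

```python
import random

# Illustrative dataset composition; counts taken from this card.
original = [{"text": f"logic/cocolofa example {i}"} for i in range(5335)]
pairs    = [{"text": f"contrastive pair example {i}"} for i in range(1194)]

# Oversample contrastive pairs 3x to emphasize boundary learning,
# then shuffle the combined training set.
train = original + pairs * 3
random.Random(42).shuffle(train)  # assumed seed

print(len(train))  # → 8917 effective examples (5335 + 3 * 1194)
```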
## Dataset
The contrastive training pairs used for fine-tuning this model are available at:
[Navy0067/contrastive-pairs-for-logical-fallacy](https://huggingface.co/datasets/Navy0067/contrastive-pairs-for-logical-fallacy)
## Contact
- Author: Navyansh Singh
- Hugging Face: @Navy0067
- Email: Navyansh24102@iiitnr.edu.in
## Citation
If you use this model in your research, please cite it as:
```bibtex
@misc{singh2026fallacy,
  author    = {Navyansh Singh},
  title     = {Logical Fallacy Detector: Binary Classification with Contrastive Learning},
  year      = {2026},
  publisher = {Hugging Face},
  journal   = {Hugging Face Model Hub},
  url       = {https://huggingface.co/Navy0067/Fallacy-detector-binary}
}
```