---
language: en
license: apache-2.0
base_model: microsoft/deberta-v3-large
tags:
- logical-fallacy-detection
- deberta-v3-large
- text-classification
- argumentation
- contrastive-learning
- adversarial-training
- robust-classification
datasets:
- logic
- cocoLoFa
- Navy0067/contrastive-pairs-for-logical-fallacy
metrics:
- f1
- accuracy
model-index:
- name: fallacy-detector-binary
  results:
  - task:
      type: text-classification
      name: Logical Fallacy Detection
    metrics:
    - type: f1
      value: 0.908
      name: F1 Score
    - type: accuracy
      value: 0.911
      name: Accuracy
---
# Logical Fallacy Detector (Binary)

A binary classifier distinguishing **valid reasoning** from **fallacious arguments**, trained with contrastive adversarial examples to handle subtle boundary cases.

**Key Innovation:** Contrastive learning with 703 adversarial argument pairs where similar wording masks critical reasoning differences.

**96% accuracy on diverse real-world test cases** | **Handles edge cases** | **91% F1**

---

## ✨ Capabilities

### Detects Common Fallacies
- βœ… **Ad Hominem** (attacking person, not argument)
- βœ… **Slippery Slope** (exaggerated chain reactions)
- βœ… **False Dilemma** (only two options presented)
- βœ… **Appeal to Authority** (irrelevant credentials)
- βœ… **Hasty Generalization** (insufficient evidence)
- βœ… **Post Hoc Ergo Propter Hoc** (correlation β‰  causation)
- βœ… **Circular Reasoning** (begging the question)
- βœ… **Straw Man** arguments

### Validates Logical Reasoning
- βœ… **Formal syllogisms** ("All A are B, X is A, therefore X is B")
- βœ… **Mathematical proofs** (deductive reasoning, arithmetic)
- βœ… **Scientific explanations** (gravity, photosynthesis, chemistry)
- βœ… **Legal arguments** (precedent, policy application)
- βœ… **Conditional statements** (if-then logic)

### Edge Case Handling
- βœ… **Distinguishes relevant vs irrelevant credential attacks**
  - Valid: "Color-blind witness can't testify about color"
  - Fallacy: "Witness shoplifted as a kid, so can't testify about color"
- βœ… **True dichotomies vs false dilemmas**
  - Valid: "The alarm is either armed or disarmed"
  - Fallacy: "Either ban all cars or accept pollution forever"
- βœ… **Valid authority citations vs fallacious appeals**
  - Valid: "Structural engineers agree based on data"
  - Fallacy: "Pop star wore these shoes, so they're best"
- βœ… **Causal relationships vs correlation**
  - Valid: "Recalibrating machines increased output"
  - Fallacy: "Playing Mozart increased output"

### Limitations
- ⚠️ **Very short statements** (<10 words) may be misclassified as fallacies
  - Example: "I like pizza" incorrectly flagged (not an argument)
- ⚠️ **Circular reasoning** occasionally missed (e.g., "healing essences promote healing")
- ⚠️ **Context-dependent arguments** may need human review
- ⚠️ **Domain-specific jargon** may affect accuracy

---

## Model Description

Fine-tuned **DeBERTa-v3-large** for binary classification using contrastive learning.

### Training Data

**Total training examples**: 6,529
- 5,335 examples from LOGIC and CoCoLoFa datasets
- 1,194 contrastive pairs (oversampled 3x = 3,582 effective examples)

**Contrastive learning approach**: High-quality argument pairs where one is valid and one contains a fallacy. The pairs differ only in reasoning quality, teaching the model to distinguish subtle boundaries.

**Test set**: 1,130 examples (918 original + 212 contrastive pairs oversampled 2x)
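
As an illustration, each contrastive pair expands into two labeled training examples. The field names below are hypothetical (see the linked dataset for the actual schema); the texts are adapted from the edge-case examples above.

```python
# Illustrative contrastive pair: similar wording on both sides,
# but only one side reasons validly. Field names are hypothetical.
pair = {
    "valid":   "After recalibrating the machines, output increased.",
    "fallacy": "After playing Mozart to the machines, output increased.",
}

def expand(pair):
    """Turn one contrastive pair into two labeled training examples."""
    return [
        {"text": pair["valid"],   "label": 0},  # 0 = valid reasoning
        {"text": pair["fallacy"], "label": 1},  # 1 = fallacy
    ]

examples = expand(pair)
print(len(examples))  # 2
```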

---

## Performance

### Validation Metrics (1,130 examples)

| Metric | Score |
|--------|-------|
| **F1 Score** | 90.8% |
| **Accuracy** | 91.1% |
| **Precision** | 92.1% |
| **Recall** | 89.6% |
| **Specificity** | 92.5% |

**Error Analysis:**
- False Positive Rate: 7.5% (valid arguments incorrectly flagged)
- False Negative Rate: 10.4% (fallacies missed)

**Confusion Matrix:**
- True Negatives: 529 βœ“ (Valid β†’ Valid)
- False Positives: 43 βœ— (Valid β†’ Fallacy)
- False Negatives: 58 βœ— (Fallacy β†’ Valid)
- True Positives: 500 βœ“ (Fallacy β†’ Fallacy)
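
The metrics above can be recomputed directly from the confusion-matrix counts; a quick sanity check:

```python
# Recompute the reported metrics from the confusion-matrix counts above.
TN, FP, FN, TP = 529, 43, 58, 500

accuracy    = (TP + TN) / (TP + TN + FP + FN)                # 91.1%
precision   = TP / (TP + FP)                                 # 92.1%
recall      = TP / (TP + FN)                                 # 89.6%
specificity = TN / (TN + FP)                                 # 92.5%
f1          = 2 * precision * recall / (precision + recall)  # 90.8%

print(f"F1 {f1:.1%}, accuracy {accuracy:.1%}")  # F1 90.8%, accuracy 91.1%
```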

### Real-World Testing (55 diverse manual cases)

**Accuracy: ~96%** (53/55 correct)

**Perfect performance on:**
- Formal syllogisms and deductive logic
- Mathematical/arithmetic statements
- Scientific principles (conservation of mass, photosynthesis, aerodynamics)
- Legal reasoning (contract terms, building codes, citizenship)
- Policy arguments with evidence

**Correctly identifies edge cases:**
- βœ… Color-blind witness (relevant) vs. shoplifted-as-kid witness (irrelevant)
- βœ… Structural engineers on bridges (valid authority) vs. physicist on supplements (opinion)
- βœ… Supply-demand economics (valid principle) vs. Mozart improving machines (false cause)
- βœ… Large sample generalization vs. anecdotal evidence

**Known errors (2/55):**
- ❌ "I like pizza" β†’ Flagged as fallacy (not an argument)
- ❌ "Natural essences promote healing" β†’ Classified as valid (circular reasoning)

---

## Usage

```python
from transformers import pipeline

# Load the model
classifier = pipeline(
    "text-classification",
    model="Navy0067/Fallacy-detector-binary"
)

# Example 1: Valid reasoning (formal logic)
text1 = "All mammals have backbones. Whales are mammals. Therefore whales have backbones."
result = classifier(text1)
# Output: [{'label': 'LABEL_0', 'score': 1.00}]  # LABEL_0 = Valid

# Example 2: Fallacy (ad hominem)
text2 = "His economic proposal is wrong because he didn't graduate from college."
result = classifier(text2)
# Output: [{'label': 'LABEL_1', 'score': 1.00}]  # LABEL_1 = Fallacy

# Example 3: Fallacy (slippery slope)
text3 = "If we allow one streetlamp, they'll install them every five feet and destroy our view of the stars."
result = classifier(text3)
# Output: [{'label': 'LABEL_1', 'score': 1.00}]

# Example 4: Valid (evidence-based)
text4 = "The data shows 95% of patients following physical therapy regained mobility, thus the regimen increases recovery chances."
result = classifier(text4)
# Output: [{'label': 'LABEL_0', 'score': 1.00}]

# Example 5: Edge case - relevant credential attack (Valid)
text5 = "The witness's color testimony should be questioned because he was diagnosed with total color blindness."
result = classifier(text5)
# Output: [{'label': 'LABEL_0', 'score': 1.00}]

# Example 6: Edge case - irrelevant credential attack (Fallacy)
text6 = "The witness's testimony should be questioned because he shoplifted a candy bar at age twelve."
result = classifier(text6)
# Output: [{'label': 'LABEL_1', 'score': 1.00}]
```

Note that the pipeline returns a list with one `{'label', 'score'}` dict per input text.

---

## Label Mapping

- LABEL_0 = Valid reasoning (no fallacy detected)

- LABEL_1 = Contains fallacy
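
A small helper can translate the raw pipeline labels into these names. The mapping dict below is written out by hand from the table above, not read from the model config:

```python
# Map the model's raw output labels to human-readable names.
LABEL_NAMES = {"LABEL_0": "valid", "LABEL_1": "fallacy"}

def readable(prediction):
    """Convert one pipeline output dict into a (name, score) tuple."""
    return LABEL_NAMES[prediction["label"]], prediction["score"]

name, score = readable({"label": "LABEL_1", "score": 0.99})
print(name)  # fallacy
```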

## Training Details

**Base model:** microsoft/deberta-v3-large

**Training configuration:**

- Epochs: 6
- Batch size: 4 (effective 16 with gradient accumulation)
- Learning rate: 1e-5
- Optimizer: AdamW with weight decay 0.01
- Scheduler: cosine with 10% warmup
- Max sequence length: 256 tokens
- FP16 training enabled
- Hardware: Kaggle P100 GPU (~82 minutes training time)

**Data strategy:**

- Original LOGIC/CoCoLoFa data (81.7% of the training set)
- Contrastive pairs oversampled 3x to emphasize boundary learning
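
The hyperparameters above can be restated in one place. The dict below simply collects the reported values; key names mirror the `transformers` `TrainingArguments` fields, but the actual training script is not published:

```python
# Reported training hyperparameters, restated as a plain dict.
# Key names mirror transformers.TrainingArguments fields.
training_config = {
    "num_train_epochs": 6,
    "per_device_train_batch_size": 4,
    "gradient_accumulation_steps": 4,   # 4 * 4 = 16 effective batch size
    "learning_rate": 1e-5,
    "weight_decay": 0.01,               # AdamW
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.10,
    "fp16": True,
}
MAX_SEQ_LENGTH = 256  # tokens, applied at tokenization time

effective_batch = (training_config["per_device_train_batch_size"]
                   * training_config["gradient_accumulation_steps"])
print(effective_batch)  # 16
```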

## Dataset

The contrastive training pairs used for fine-tuning this model are available at:
[Navy0067/contrastive-pairs-for-logical-fallacy](https://huggingface.co/datasets/Navy0067/contrastive-pairs-for-logical-fallacy)

## Contact
Author: Navyansh Singh

Hugging Face: @Navy0067

Email: Navyansh24102@iiitnr.edu.in


## Citation

If you use this model in your research, please cite it as:

```bibtex
@misc{singh2026fallacy,
  author       = {Navyansh Singh},
  title        = {Logical Fallacy Detector: Binary Classification with Contrastive Learning},
  year         = {2026},
  publisher    = {Hugging Face},
  journal      = {Hugging Face Model Hub},
  url          = {https://huggingface.co/Navy0067/Fallacy-detector-binary}
}
```