---
language: en
license: apache-2.0
base_model: microsoft/deberta-v3-large
tags:
- logical-fallacy-detection
- deberta-v3-large
- text-classification
- argumentation
- contrastive-learning
- adversarial-training
- robust-classification
datasets:
- logic
- cocoLoFa
- Navy0067/contrastive-pairs-for-logical-fallacy
metrics:
- f1
- accuracy
model-index:
- name: fallacy-detector-binary
  results:
  - task:
      type: text-classification
      name: Logical Fallacy Detection
    metrics:
    - type: f1
      value: 0.908
      name: F1 Score
    - type: accuracy
      value: 0.911
      name: Accuracy
widget:
- text: "All mammals have backbones. Whales are mammals. Therefore whales have backbones."
  example_title: "Valid - Syllogism"
- text: "His economic proposal is wrong because he didn't graduate from college."
  example_title: "Fallacy - Ad Hominem"
- text: "If we allow one streetlamp, they'll install them every five feet and destroy our view of stars."
  example_title: "Fallacy - Slippery Slope"
- text: "The witness's color testimony should be questioned because he was diagnosed with color blindness."
  example_title: "Valid - Relevant Credential"
- text: "The witness's testimony should be questioned because he shoplifted as a kid."
  example_title: "Fallacy - Irrelevant Attack"
- text: "95% of patients following physical therapy regained mobility, thus the regimen increases recovery."
  example_title: "Valid - Evidence-Based"
- text: "I met two lazy students from that university, so the entire student body must be unmotivated."
  example_title: "Fallacy - Hasty Generalization"
- text: "Every time I wear red socks, the team wins; I must wear them tomorrow to ensure victory."
  example_title: "Fallacy - False Cause"
---
# Logical Fallacy Detector (Binary)

A binary classifier that distinguishes **valid reasoning** from **fallacious arguments**, trained with contrastive adversarial examples to handle subtle boundary cases.

**Key innovation:** Contrastive learning with 703 adversarial argument pairs in which similar wording masks critical differences in reasoning quality.

**96% accuracy on diverse real-world test cases** | **Handles edge cases** | **91% F1**

---
## ✨ Capabilities

### Detects Common Fallacies
- ✅ **Ad Hominem** (attacking the person, not the argument)
- ✅ **Slippery Slope** (exaggerated chain reactions)
- ✅ **False Dilemma** (only two options presented)
- ✅ **Appeal to Authority** (irrelevant credentials)
- ✅ **Hasty Generalization** (insufficient evidence)
- ✅ **Post Hoc Ergo Propter Hoc** (correlation ≠ causation)
- ✅ **Circular Reasoning** (begging the question)
- ✅ **Straw Man** (misrepresenting the opposing argument)
### Validates Logical Reasoning
- ✅ **Formal syllogisms** ("All A are B, X is A, therefore X is B")
- ✅ **Mathematical proofs** (deductive reasoning, arithmetic)
- ✅ **Scientific explanations** (gravity, photosynthesis, chemistry)
- ✅ **Legal arguments** (precedent, policy application)
- ✅ **Conditional statements** (if-then logic)
### Edge Case Handling
- ✅ **Distinguishes relevant vs. irrelevant credential attacks**
  - Valid: "A color-blind witness can't testify about color"
  - Fallacy: "The witness shoplifted as a kid, so he can't testify about color"
- ✅ **True dichotomies vs. false dilemmas**
  - Valid: "The alarm is either armed or disarmed"
  - Fallacy: "Either ban all cars or accept pollution forever"
- ✅ **Valid authority citations vs. fallacious appeals**
  - Valid: "Structural engineers agree, based on data"
  - Fallacy: "A pop star wore these shoes, so they're the best"
- ✅ **Causal relationships vs. mere correlation**
  - Valid: "Recalibrating the machines increased output"
  - Fallacy: "Playing Mozart increased output"
### Limitations
- ⚠️ **Very short statements** (<10 words) may be misclassified as fallacies
  - Example: "I like pizza" is incorrectly flagged (it is not an argument)
- ⚠️ **Circular reasoning** is occasionally missed (e.g., "healing essences promote healing")
- ⚠️ **Context-dependent arguments** may need human review
- ⚠️ **Domain-specific jargon** may affect accuracy
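Given the short-statement limitation above, inputs can be screened with a lightweight word-count pre-filter before classification. A minimal sketch; the 10-word threshold and the function name are illustrative, not part of the model:

```python
def looks_like_argument(text: str, min_words: int = 10) -> bool:
    """Heuristic pre-filter: statements under ~10 words are usually not
    arguments, and the model tends to flag them as fallacies."""
    return len(text.split()) >= min_words

# Route only argument-like inputs to the classifier.
texts = [
    "I like pizza",  # not an argument; skip it
    "All mammals have backbones. Whales are mammals. Therefore whales have backbones.",
]
to_classify = [t for t in texts if looks_like_argument(t)]
# Only the syllogism survives the filter.
```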
---

## Model Description

Fine-tuned **DeBERTa-v3-large** (304M backbone parameters) for binary classification using contrastive learning.

### Training Data

**Total training examples**: 6,529 unique
- 5,335 examples from the LOGIC and CoCoLoFa datasets
- 1,194 contrastive-pair examples (oversampled 3x = 3,582 effective examples)

**Contrastive learning approach**: High-quality argument pairs in which one argument is valid and the other contains a fallacy. The pairs differ only in reasoning quality, teaching the model to distinguish subtle boundaries.

**Test set**: 1,130 examples (918 original + 212 contrastive-pair examples oversampled 2x)
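The oversampling described above is plain replication. A minimal sketch with the stated counts; the list contents are placeholders standing in for real examples:

```python
base = ["logic/cocolofa"] * 5335   # original LOGIC + CoCoLoFa examples
pairs = ["contrastive"] * 1194     # contrastive-pair examples

# 6,529 unique examples in total ...
unique_total = len(base) + len(pairs)

# ... with contrastive pairs replicated 3x to emphasize boundary learning,
# giving 3,582 effective contrastive rows and 8,917 effective training rows.
train_rows = base + pairs * 3

print(unique_total, len(pairs) * 3, len(train_rows))
# → 6529 3582 8917
```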
---

## Performance

### Validation Metrics (1,130 examples)

| Metric | Score |
|--------|-------|
| **F1 Score** | 90.8% |
| **Accuracy** | 91.1% |
| **Precision** | 92.1% |
| **Recall** | 89.6% |
| **Specificity** | 92.5% |

**Error Analysis:**
- False positive rate: 7.5% (valid arguments incorrectly flagged)
- False negative rate: 10.4% (fallacies missed)

**Confusion Matrix:**
- True negatives: 529 ✓ (Valid → Valid)
- False positives: 43 ✗ (Valid → Fallacy)
- False negatives: 58 ✗ (Fallacy → Valid)
- True positives: 500 ✓ (Fallacy → Fallacy)
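The reported metrics follow directly from the confusion matrix; a quick arithmetic check in plain Python:

```python
tn, fp, fn, tp = 529, 43, 58, 500  # confusion matrix counts from above

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)            # a.k.a. sensitivity
specificity = tn / (tn + fp)
f1 = 2 * precision * recall / (precision + recall)

print(f"acc={accuracy:.3f} p={precision:.3f} r={recall:.3f} "
      f"spec={specificity:.3f} f1={f1:.3f}")
# → acc=0.911 p=0.921 r=0.896 spec=0.925 f1=0.908
```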
### Real-World Testing (55 diverse manual cases)

**Accuracy: ~96%** (53/55 correct)

**Perfect performance on:**
- Formal syllogisms and deductive logic
- Mathematical/arithmetic statements
- Scientific principles (conservation of mass, photosynthesis, aerodynamics)
- Legal reasoning (contract terms, building codes, citizenship)
- Policy arguments with evidence

**Correctly identifies edge cases:**
- ✅ Color-blind witness (relevant) vs. shoplifted-as-a-kid witness (irrelevant)
- ✅ Structural engineers on bridges (valid authority) vs. a physicist on supplements (opinion)
- ✅ Supply-and-demand economics (valid principle) vs. Mozart improving machines (false cause)
- ✅ Large-sample generalization vs. anecdotal evidence

**Known errors (2/55):**
- ❌ "I like pizza" → flagged as a fallacy (it is not an argument)
- ❌ "Natural essences promote healing" → classified as valid (circular reasoning missed)

---
## Usage

```python
from transformers import pipeline

# Load the model
classifier = pipeline(
    "text-classification",
    model="Navy0067/Fallacy-detector-binary",
)

# Example 1: Valid reasoning (formal logic)
text1 = "All mammals have backbones. Whales are mammals. Therefore whales have backbones."
result = classifier(text1)
# Output: [{'label': 'LABEL_0', 'score': 1.00}]  # LABEL_0 = Valid

# Example 2: Fallacy (ad hominem)
text2 = "His economic proposal is wrong because he didn't graduate from college."
result = classifier(text2)
# Output: [{'label': 'LABEL_1', 'score': 1.00}]  # LABEL_1 = Fallacy

# Example 3: Fallacy (slippery slope)
text3 = "If we allow one streetlamp, they'll install them every five feet and destroy our view of the stars."
result = classifier(text3)
# Output: [{'label': 'LABEL_1', 'score': 1.00}]

# Example 4: Valid (evidence-based)
text4 = "The data shows 95% of patients following physical therapy regained mobility, thus the regimen increases recovery chances."
result = classifier(text4)
# Output: [{'label': 'LABEL_0', 'score': 1.00}]

# Example 5: Edge case - relevant credential attack (Valid)
text5 = "The witness's color testimony should be questioned because he was diagnosed with total color blindness."
result = classifier(text5)
# Output: [{'label': 'LABEL_0', 'score': 1.00}]

# Example 6: Edge case - irrelevant credential attack (Fallacy)
text6 = "The witness's testimony should be questioned because he shoplifted a candy bar at age twelve."
result = classifier(text6)
# Output: [{'label': 'LABEL_1', 'score': 1.00}]
```

---
## Label Mapping

- `LABEL_0` = Valid reasoning (no fallacy detected)
- `LABEL_1` = Contains a fallacy
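Since the pipeline returns the raw `LABEL_0`/`LABEL_1` names, a small post-processing helper can translate predictions into readable verdicts. A minimal sketch; the helper name and output format are illustrative:

```python
LABEL_NAMES = {"LABEL_0": "valid", "LABEL_1": "fallacy"}

def readable(prediction: dict) -> str:
    """Turn one pipeline prediction dict into a human-readable verdict."""
    verdict = LABEL_NAMES[prediction["label"]]
    return f"{verdict} ({prediction['score']:.1%} confidence)"

print(readable({"label": "LABEL_1", "score": 0.998}))
# → fallacy (99.8% confidence)
```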
### Training Details

**Base model**: microsoft/deberta-v3-large

**Training configuration:**
- Epochs: 6
- Batch size: 4 (effective 16 with gradient accumulation)
- Learning rate: 1e-5
- Optimizer: AdamW with weight decay 0.01
- Scheduler: cosine with 10% warmup
- Max sequence length: 256 tokens
- FP16 training enabled
- Hardware: Kaggle P100 GPU (~82 minutes of training)

**Data strategy:**
- Original LOGIC/CoCoLoFa data (81.7% of the training set)
- Contrastive pairs oversampled 3x (emphasizes boundary learning)
- Final balance: 50.3% fallacies, 49.7% valid
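The configuration above maps onto Hugging Face `TrainingArguments` roughly as follows. This is a sketch, not the authors' actual training script; `output_dir` and anything not listed in the card are placeholders:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="fallacy-detector-binary",  # placeholder path
    num_train_epochs=6,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,   # 4 x 4 = effective batch size 16
    learning_rate=1e-5,
    weight_decay=0.01,               # AdamW is the Trainer default optimizer
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,                # 10% warmup
    fp16=True,
)
```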