---
language: en
license: apache-2.0
base_model: microsoft/deberta-v3-large
tags:
- logical-fallacy-detection
- deberta-v3-large
- text-classification
- argumentation
- contrastive-learning
- adversarial-training
- robust-classification
datasets:
- logic
- cocoLoFa
- Navy0067/contrastive-pairs-for-logical-fallacy
metrics:
- f1
- accuracy
model-index:
- name: fallacy-detector-binary
  results:
  - task:
      type: text-classification
      name: Logical Fallacy Detection
    metrics:
    - type: f1
      value: 0.908
      name: F1 Score
    - type: accuracy
      value: 0.911
      name: Accuracy
widget:
- text: "All mammals have backbones. Whales are mammals. Therefore whales have backbones."
  example_title: "Valid - Syllogism"
- text: "His economic proposal is wrong because he didn't graduate from college."
  example_title: "Fallacy - Ad Hominem"
- text: "If we allow one streetlamp, they'll install them every five feet and destroy our view of stars."
  example_title: "Fallacy - Slippery Slope"
- text: "The witness's color testimony should be questioned because he was diagnosed with color blindness."
  example_title: "Valid - Relevant Credential"
- text: "The witness's testimony should be questioned because he shoplifted as a kid."
  example_title: "Fallacy - Irrelevant Attack"
- text: "95% of patients following physical therapy regained mobility, thus the regimen increases recovery."
  example_title: "Valid - Evidence-Based"
- text: "I met two lazy students from that university, so the entire student body must be unmotivated."
  example_title: "Fallacy - Hasty Generalization"
- text: "Every time I wear red socks, the team wins; I must wear them tomorrow to ensure victory."
  example_title: "Fallacy - False Cause"
---
# Logical Fallacy Detector (Binary)

A binary classifier that distinguishes **valid reasoning** from **fallacious arguments**, trained with contrastive adversarial examples to handle subtle boundary cases.

**Key innovation:** Contrastive learning with 703 adversarial argument pairs in which similar wording masks critical differences in reasoning quality.

**96% accuracy on diverse real-world test cases** | **Handles edge cases** | **91% F1**

---
## ✨ Capabilities

### Detects Common Fallacies
- ✅ **Ad Hominem** (attacking the person, not the argument)
- ✅ **Slippery Slope** (exaggerated chain reactions)
- ✅ **False Dilemma** (only two options presented)
- ✅ **Appeal to Authority** (irrelevant credentials)
- ✅ **Hasty Generalization** (insufficient evidence)
- ✅ **Post Hoc Ergo Propter Hoc** (correlation ≠ causation)
- ✅ **Circular Reasoning** (begging the question)
- ✅ **Straw Man** (misrepresenting the opposing argument)
### Validates Logical Reasoning
- ✅ **Formal syllogisms** ("All A are B, X is A, therefore X is B")
- ✅ **Mathematical proofs** (deductive reasoning, arithmetic)
- ✅ **Scientific explanations** (gravity, photosynthesis, chemistry)
- ✅ **Legal arguments** (precedent, policy application)
- ✅ **Conditional statements** (if-then logic)
### Edge Case Handling
- ✅ **Distinguishes relevant vs. irrelevant credential attacks**
  - Valid: "A color-blind witness can't testify about color"
  - Fallacy: "The witness shoplifted as a kid, so he can't testify about color"
- ✅ **True dichotomies vs. false dilemmas**
  - Valid: "The alarm is either armed or disarmed"
  - Fallacy: "Either ban all cars or accept pollution forever"
- ✅ **Valid authority citations vs. fallacious appeals**
  - Valid: "Structural engineers agree, based on data"
  - Fallacy: "A pop star wore these shoes, so they're the best"
- ✅ **Causal relationships vs. mere correlation**
  - Valid: "Recalibrating the machines increased output"
  - Fallacy: "Playing Mozart increased output"
### Limitations
- ⚠️ **Very short statements** (<10 words) may be misclassified as fallacies
  - Example: "I like pizza" is incorrectly flagged (it is not an argument)
- ⚠️ **Circular reasoning** is occasionally missed (e.g., "healing essences promote healing")
- ⚠️ **Context-dependent arguments** may need human review
- ⚠️ **Domain-specific jargon** may affect accuracy
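Given the short-statement limitation above, inputs can be screened with a lightweight word-count pre-filter before classification. A minimal sketch; the 10-word threshold and the function name are illustrative, not part of the model:

```python
def looks_like_argument(text: str, min_words: int = 10) -> bool:
    """Heuristic pre-filter: statements under ~10 words are usually not
    arguments, and the model tends to flag them as fallacies."""
    return len(text.split()) >= min_words

# Route only argument-like inputs to the classifier.
texts = [
    "I like pizza",  # not an argument; skip it
    "All mammals have backbones. Whales are mammals. Therefore whales have backbones.",
]
to_classify = [t for t in texts if looks_like_argument(t)]
# Only the syllogism survives the filter.
```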
---

## Model Description

Fine-tuned **DeBERTa-v3-large** (304M backbone parameters) for binary classification using contrastive learning.

### Training Data

**Total training examples**: 6,529 unique
- 5,335 examples from the LOGIC and CoCoLoFa datasets
- 1,194 contrastive-pair examples (oversampled 3x = 3,582 effective examples)

**Contrastive learning approach**: High-quality argument pairs in which one argument is valid and the other contains a fallacy. The pairs differ only in reasoning quality, teaching the model to distinguish subtle boundaries.

**Test set**: 1,130 examples (918 original + 212 contrastive-pair examples oversampled 2x)
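The oversampling described above is plain replication. A minimal sketch with the stated counts; the list contents are placeholders standing in for real examples:

```python
base = ["logic/cocolofa"] * 5335   # original LOGIC + CoCoLoFa examples
pairs = ["contrastive"] * 1194     # contrastive-pair examples

# 6,529 unique examples in total ...
unique_total = len(base) + len(pairs)

# ... with contrastive pairs replicated 3x to emphasize boundary learning,
# giving 3,582 effective contrastive rows and 8,917 effective training rows.
train_rows = base + pairs * 3

print(unique_total, len(pairs) * 3, len(train_rows))
# → 6529 3582 8917
```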
---

## Performance

### Validation Metrics (1,130 examples)

| Metric | Score |
|--------|-------|
| **F1 Score** | 90.8% |
| **Accuracy** | 91.1% |
| **Precision** | 92.1% |
| **Recall** | 89.6% |
| **Specificity** | 92.5% |

**Error Analysis:**
- False positive rate: 7.5% (valid arguments incorrectly flagged)
- False negative rate: 10.4% (fallacies missed)

**Confusion Matrix:**
- True negatives: 529 ✓ (Valid → Valid)
- False positives: 43 ✗ (Valid → Fallacy)
- False negatives: 58 ✗ (Fallacy → Valid)
- True positives: 500 ✓ (Fallacy → Fallacy)
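The reported metrics follow directly from the confusion matrix; a quick arithmetic check in plain Python:

```python
tn, fp, fn, tp = 529, 43, 58, 500  # confusion matrix counts from above

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)            # a.k.a. sensitivity
specificity = tn / (tn + fp)
f1 = 2 * precision * recall / (precision + recall)

print(f"acc={accuracy:.3f} p={precision:.3f} r={recall:.3f} "
      f"spec={specificity:.3f} f1={f1:.3f}")
# → acc=0.911 p=0.921 r=0.896 spec=0.925 f1=0.908
```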
### Real-World Testing (55 diverse manual cases)

**Accuracy: ~96%** (53/55 correct)

**Perfect performance on:**
- Formal syllogisms and deductive logic
- Mathematical/arithmetic statements
- Scientific principles (conservation of mass, photosynthesis, aerodynamics)
- Legal reasoning (contract terms, building codes, citizenship)
- Policy arguments with evidence

**Correctly identifies edge cases:**
- ✅ Color-blind witness (relevant) vs. shoplifted-as-a-kid witness (irrelevant)
- ✅ Structural engineers on bridges (valid authority) vs. a physicist on supplements (opinion)
- ✅ Supply-and-demand economics (valid principle) vs. Mozart improving machines (false cause)
- ✅ Large-sample generalization vs. anecdotal evidence

**Known errors (2/55):**
- ❌ "I like pizza" → flagged as a fallacy (it is not an argument)
- ❌ "Natural essences promote healing" → classified as valid (circular reasoning missed)

---
## Usage

```python
from transformers import pipeline

# Load the model
classifier = pipeline(
    "text-classification",
    model="Navy0067/Fallacy-detector-binary",
)

# Example 1: Valid reasoning (formal logic)
text1 = "All mammals have backbones. Whales are mammals. Therefore whales have backbones."
result = classifier(text1)
# Output: [{'label': 'LABEL_0', 'score': 1.00}]  # LABEL_0 = Valid

# Example 2: Fallacy (ad hominem)
text2 = "His economic proposal is wrong because he didn't graduate from college."
result = classifier(text2)
# Output: [{'label': 'LABEL_1', 'score': 1.00}]  # LABEL_1 = Fallacy

# Example 3: Fallacy (slippery slope)
text3 = "If we allow one streetlamp, they'll install them every five feet and destroy our view of the stars."
result = classifier(text3)
# Output: [{'label': 'LABEL_1', 'score': 1.00}]

# Example 4: Valid (evidence-based)
text4 = "The data shows 95% of patients following physical therapy regained mobility, thus the regimen increases recovery chances."
result = classifier(text4)
# Output: [{'label': 'LABEL_0', 'score': 1.00}]

# Example 5: Edge case - relevant credential attack (Valid)
text5 = "The witness's color testimony should be questioned because he was diagnosed with total color blindness."
result = classifier(text5)
# Output: [{'label': 'LABEL_0', 'score': 1.00}]

# Example 6: Edge case - irrelevant credential attack (Fallacy)
text6 = "The witness's testimony should be questioned because he shoplifted a candy bar at age twelve."
result = classifier(text6)
# Output: [{'label': 'LABEL_1', 'score': 1.00}]
```

---
## Label Mapping

- `LABEL_0` = Valid reasoning (no fallacy detected)
- `LABEL_1` = Contains a fallacy
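Since the pipeline returns the raw `LABEL_0`/`LABEL_1` names, a small post-processing helper can translate predictions into readable verdicts. A minimal sketch; the helper name and output format are illustrative:

```python
LABEL_NAMES = {"LABEL_0": "valid", "LABEL_1": "fallacy"}

def readable(prediction: dict) -> str:
    """Turn one pipeline prediction dict into a human-readable verdict."""
    verdict = LABEL_NAMES[prediction["label"]]
    return f"{verdict} ({prediction['score']:.1%} confidence)"

print(readable({"label": "LABEL_1", "score": 0.998}))
# → fallacy (99.8% confidence)
```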
### Training Details

**Base model**: microsoft/deberta-v3-large

**Training configuration:**
- Epochs: 6
- Batch size: 4 (effective 16 with gradient accumulation)
- Learning rate: 1e-5
- Optimizer: AdamW with weight decay 0.01
- Scheduler: cosine with 10% warmup
- Max sequence length: 256 tokens
- FP16 training enabled
- Hardware: Kaggle P100 GPU (~82 minutes of training)

**Data strategy:**
- Original LOGIC/CoCoLoFa data (81.7% of the training set)
- Contrastive pairs oversampled 3x (emphasizes boundary learning)
- Final balance: 50.3% fallacies, 49.7% valid
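The configuration above maps onto Hugging Face `TrainingArguments` roughly as follows. This is a sketch, not the authors' actual training script; `output_dir` and anything not listed in the card are placeholders:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="fallacy-detector-binary",  # placeholder path
    num_train_epochs=6,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,   # 4 x 4 = effective batch size 16
    learning_rate=1e-5,
    weight_decay=0.01,               # AdamW is the Trainer default optimizer
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,                # 10% warmup
    fp16=True,
)
```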