---
language: en
license: apache-2.0
base_model: microsoft/deberta-v3-large
tags:
- logical-fallacy-detection
- deberta-v3-large
- text-classification
- argumentation
- contrastive-learning
- adversarial-training
- robust-classification
datasets:
- logic
- cocoLoFa
- Navy0067/contrastive-pairs-for-logical-fallacy
metrics:
- f1
- accuracy
model-index:
- name: fallacy-detector-binary
  results:
  - task:
      type: text-classification
      name: Logical Fallacy Detection
    metrics:
    - type: f1
      value: 0.908
      name: F1 Score
    - type: accuracy
      value: 0.911
      name: Accuracy
---

# Logical Fallacy Detector (Binary)

A binary classifier distinguishing **valid reasoning** from **fallacious arguments**, trained with contrastive adversarial examples to handle subtle boundary cases.

**Key Innovation:** Contrastive learning with 703 adversarial argument pairs where similar wording masks critical reasoning differences.

**96% accuracy on diverse real-world test cases** | **Handles edge cases** | **91% F1**

---

## ✨ Capabilities

### Detects Common Fallacies

- ✅ **Ad Hominem** (attacking the person, not the argument)
- ✅ **Slippery Slope** (exaggerated chain reactions)
- ✅ **False Dilemma** (only two options presented)
- ✅ **Appeal to Authority** (irrelevant credentials)
- ✅ **Hasty Generalization** (insufficient evidence)
- ✅ **Post Hoc Ergo Propter Hoc** (correlation ≠ causation)
- ✅ **Circular Reasoning** (begging the question)
- ✅ **Straw Man** arguments

### Validates Logical Reasoning

- ✅ **Formal syllogisms** ("All A are B, X is A, therefore X is B")
- ✅ **Mathematical proofs** (deductive reasoning, arithmetic)
- ✅ **Scientific explanations** (gravity, photosynthesis, chemistry)
- ✅ **Legal arguments** (precedent, policy application)
- ✅ **Conditional statements** (if-then logic)

### Edge Case Handling

- ✅ **Distinguishes relevant vs. irrelevant credential attacks**
  - Valid: "Color-blind witness can't testify about color"
  - Fallacy: "Witness shoplifted as a kid, so can't testify about color"
- ✅ **True dichotomies vs. false dilemmas**
  - Valid: "The alarm is either armed or disarmed"
  - Fallacy: "Either ban all cars or accept pollution forever"
- ✅ **Valid authority citations vs. fallacious appeals**
  - Valid: "Structural engineers agree based on data"
  - Fallacy: "Pop star wore these shoes, so they're best"
- ✅ **Causal relationships vs. correlation**
  - Valid: "Recalibrating machines increased output"
  - Fallacy: "Playing Mozart increased output"

### Limitations

- ⚠️ **Very short statements** (<10 words) may be misclassified as fallacies
  - Example: "I like pizza" incorrectly flagged (not an argument)
- ⚠️ **Circular reasoning** occasionally missed (e.g., "healing essences promote healing")
- ⚠️ **Context-dependent arguments** may need human review
- ⚠️ **Domain-specific jargon** may affect accuracy

---

## Model Description

Fine-tuned **DeBERTa-v3-large** for binary classification using contrastive learning.

### Training Data

**Total training examples**: 6,529

- 5,335 examples from the LOGIC and CoCoLoFa datasets
- 1,194 contrastive-pair examples (oversampled 3x = 3,582 effective examples)

**Contrastive learning approach**: High-quality argument pairs where one is valid and one contains a fallacy. The pairs differ only in reasoning quality, teaching the model to distinguish subtle boundaries.
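The 3x oversampling of contrastive pairs can be sketched as follows. This is a minimal illustration assuming a simple list-of-dicts representation; the field names and the `build_training_set` helper are assumptions, not the model's actual (unpublished) preprocessing script.

```python
# Minimal sketch of the data strategy described above (illustrative
# assumption, not the model's actual preprocessing code).
def build_training_set(base_examples, contrastive_examples, oversample=3):
    """Repeat each contrastive example `oversample` times so that
    boundary cases carry more weight during fine-tuning."""
    return list(base_examples) + list(contrastive_examples) * oversample

# 5,335 LOGIC/CoCoLoFa examples + 1,194 contrastive-pair examples
base = [{"text": f"example {i}", "label": i % 2} for i in range(5335)]
pairs = [{"text": f"pair {i}", "label": i % 2} for i in range(1194)]

train = build_training_set(base, pairs)
print(len(train))  # 5335 + 3 * 1194 = 8917 effective examples
```

With 3x oversampling, the 1,194 contrastive examples contribute 3,582 of the 8,917 effective training examples, matching the counts above.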
**Test set**: 1,130 examples (918 original + 212 contrastive-pair examples, oversampled 2x)

---

## Performance

### Validation Metrics (1,130 examples)

| Metric | Score |
|--------|-------|
| **F1 Score** | 90.8% |
| **Accuracy** | 91.1% |
| **Precision** | 92.1% |
| **Recall** | 89.6% |
| **Specificity** | 92.5% |

**Error Analysis:**

- False Positive Rate: 7.5% (valid arguments incorrectly flagged)
- False Negative Rate: 10.4% (fallacies missed)

**Confusion Matrix:**

- True Negatives: 529 ✓ (Valid → Valid)
- False Positives: 43 ✗ (Valid → Fallacy)
- False Negatives: 58 ✗ (Fallacy → Valid)
- True Positives: 500 ✓ (Fallacy → Fallacy)

### Real-World Testing (55 diverse manual cases)

**Accuracy: ~96%** (53/55 correct)

**Perfect performance on:**

- Formal syllogisms and deductive logic
- Mathematical/arithmetic statements
- Scientific principles (conservation of mass, photosynthesis, aerodynamics)
- Legal reasoning (contract terms, building codes, citizenship)
- Policy arguments with evidence

**Correctly identifies edge cases:**

- ✅ Color-blind witness (relevant) vs. shoplifted-as-a-kid witness (irrelevant)
- ✅ Structural engineers on bridges (valid authority) vs. physicist on supplements (opinion)
- ✅ Supply-demand economics (valid principle) vs. Mozart improving machines (false cause)
- ✅ Large-sample generalization vs. anecdotal evidence

**Known errors (2/55):**

- ❌ "I like pizza" → flagged as fallacy (not an argument)
- ❌ "Natural essences promote healing" → classified as valid (circular reasoning)

---

## Usage

```python
from transformers import pipeline

# Load the model
classifier = pipeline(
    "text-classification",
    model="Navy0067/Fallacy-detector-binary"
)

# Example 1: Valid reasoning (formal logic)
text1 = "All mammals have backbones. Whales are mammals. Therefore whales have backbones."
result = classifier(text1)
# Output: {'label': 'LABEL_0', 'score': 1.00}  # LABEL_0 = Valid

# Example 2: Fallacy (ad hominem)
text2 = "His economic proposal is wrong because he didn't graduate from college."
result = classifier(text2)
# Output: {'label': 'LABEL_1', 'score': 1.00}  # LABEL_1 = Fallacy

# Example 3: Fallacy (slippery slope)
text3 = "If we allow one streetlamp, they'll install them every five feet and destroy our view of the stars."
result = classifier(text3)
# Output: {'label': 'LABEL_1', 'score': 1.00}

# Example 4: Valid (evidence-based)
text4 = "The data shows 95% of patients following physical therapy regained mobility, thus the regimen increases recovery chances."
result = classifier(text4)
# Output: {'label': 'LABEL_0', 'score': 1.00}

# Example 5: Edge case - relevant credential attack (Valid)
text5 = "The witness's color testimony should be questioned because he was diagnosed with total color blindness."
result = classifier(text5)
# Output: {'label': 'LABEL_0', 'score': 1.00}

# Example 6: Edge case - irrelevant credential attack (Fallacy)
text6 = "The witness's testimony should be questioned because he shoplifted a candy bar at age twelve."
result = classifier(text6)
# Output: {'label': 'LABEL_1', 'score': 1.00}
```

---

## Label Mapping

- `LABEL_0` = Valid reasoning (no fallacy detected)
- `LABEL_1` = Contains fallacy

## Training Details

**Base model:** microsoft/deberta-v3-large

**Training configuration:**

- Epochs: 6
- Batch size: 4 (effective 16 with gradient accumulation)
- Learning rate: 1e-5
- Optimizer: AdamW with weight decay 0.01
- Scheduler: cosine with 10% warmup
- Max sequence length: 256 tokens
- FP16 training enabled
- Hardware: Kaggle P100 GPU (~82 minutes training time)

**Data strategy:**

- Original LOGIC/CoCoLoFa data (81.7% of the training set)
- Contrastive pairs oversampled 3x (emphasizes boundary learning)

## Dataset

The contrastive training pairs used for fine-tuning this model are available at:
[Navy0067/contrastive-pairs-for-logical-fallacy](https://huggingface.co/datasets/Navy0067/contrastive-pairs-for-logical-fallacy)

## Contact

- **Author:** Navyansh Singh
- **Hugging Face:** @Navy0067
- **Email:** Navyansh24102@iiitnr.edu.in

## Citation

If you use this model in your research, please cite it as:

```bibtex
@misc{singh2026fallacy,
  author = {Navyansh Singh},
  title = {Logical Fallacy Detector: Binary Classification with Contrastive Learning},
  year = {2026},
  publisher = {Hugging Face},
  journal = {Hugging Face Model Hub},
  url = {https://huggingface.co/Navy0067/Fallacy-detector-binary}
}
```
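For downstream use, the raw `LABEL_0`/`LABEL_1` outputs described in the Label Mapping section can be translated into readable names with a small helper. This is a hypothetical convenience wrapper, not shipped with the model; `LABEL_NAMES` and `readable` are names introduced here for illustration.

```python
# Hypothetical helper (not part of the model): converts one pipeline
# output dict into a (name, score) tuple per the card's label mapping.
LABEL_NAMES = {"LABEL_0": "valid", "LABEL_1": "fallacy"}

def readable(prediction):
    """E.g. {'label': 'LABEL_1', 'score': 0.98} -> ('fallacy', 0.98)."""
    return LABEL_NAMES[prediction["label"]], prediction["score"]

# With the `classifier` pipeline from the Usage section (returns a list):
# name, score = readable(classifier("Either ban all cars or accept pollution forever.")[0])

print(readable({"label": "LABEL_0", "score": 0.97}))  # ('valid', 0.97)
```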