---
language: en
license: apache-2.0
base_model: microsoft/deberta-v3-large
tags:
- logical-fallacy-detection
- deberta-v3-large
- text-classification
- argumentation
- contrastive-learning
- adversarial-training
- robust-classification
datasets:
- logic
- cocoLoFa
- Navy0067/contrastive-pairs-for-logical-fallacy
metrics:
- f1
- accuracy
model-index:
- name: fallacy-detector-binary
results:
- task:
type: text-classification
name: Logical Fallacy Detection
metrics:
- type: f1
value: 0.908
name: F1 Score
- type: accuracy
value: 0.911
name: Accuracy
---
# Logical Fallacy Detector (Binary)
A binary classifier distinguishing **valid reasoning** from **fallacious arguments**, trained with contrastive adversarial examples to handle subtle boundary cases.
**Key Innovation:** Contrastive learning with 703 adversarial argument pairs where similar wording masks critical reasoning differences.
**96% accuracy on diverse real-world test cases** | **91% F1** | **Handles edge cases**
---
## ✨ Capabilities
### Detects Common Fallacies
- βœ… **Ad Hominem** (attacking person, not argument)
- βœ… **Slippery Slope** (exaggerated chain reactions)
- βœ… **False Dilemma** (only two options presented)
- βœ… **Appeal to Authority** (irrelevant credentials)
- βœ… **Hasty Generalization** (insufficient evidence)
- βœ… **Post Hoc Ergo Propter Hoc** (correlation β‰  causation)
- βœ… **Circular Reasoning** (begging the question)
- βœ… **Straw Man** arguments
### Validates Logical Reasoning
- βœ… **Formal syllogisms** ("All A are B, X is A, therefore X is B")
- βœ… **Mathematical proofs** (deductive reasoning, arithmetic)
- βœ… **Scientific explanations** (gravity, photosynthesis, chemistry)
- βœ… **Legal arguments** (precedent, policy application)
- βœ… **Conditional statements** (if-then logic)
### Edge Case Handling
- βœ… **Distinguishes relevant vs irrelevant credential attacks**
- Valid: "Color-blind witness can't testify about color"
- Fallacy: "Witness shoplifted as a kid, so can't testify about color"
- βœ… **True dichotomies vs false dilemmas**
- Valid: "The alarm is either armed or disarmed"
- Fallacy: "Either ban all cars or accept pollution forever"
- βœ… **Valid authority citations vs fallacious appeals**
- Valid: "Structural engineers agree based on data"
- Fallacy: "Pop star wore these shoes, so they're best"
- βœ… **Causal relationships vs correlation**
- Valid: "Recalibrating machines increased output"
- Fallacy: "Playing Mozart increased output"
### Limitations
- ⚠️ **Very short statements** (<10 words) may be misclassified as fallacies
- Example: "I like pizza" incorrectly flagged (not an argument)
- ⚠️ **Circular reasoning** occasionally missed (e.g., "healing essences promote healing")
- ⚠️ **Context-dependent arguments** may need human review
- ⚠️ **Domain-specific jargon** may affect accuracy
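Given the short-statement limitation above, a lightweight pre-filter can route very short inputs to human review instead of the classifier. A minimal sketch (the helper name and 10-word threshold mirror the limitation stated above; they are illustrative, not part of the model):

```python
# Hypothetical pre-filter for the short-statement limitation:
# inputs under 10 words are often not arguments at all, so we
# skip classification and flag them for review instead.
MIN_WORDS = 10

def should_classify(text: str) -> bool:
    """Return True only for inputs long enough to be an argument."""
    return len(text.split()) >= MIN_WORDS

print(should_classify("I like pizza"))  # False: too short to be an argument
print(should_classify(
    "All mammals have backbones. Whales are mammals. "
    "Therefore whales have backbones."
))  # True
```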
---
## Model Description
Fine-tuned **DeBERTa-v3-large** for binary classification using contrastive learning.
### Training Data
**Total training examples**: 6,529
- 5,335 examples from LOGIC and CoCoLoFa datasets
- 1,194 contrastive pairs (oversampled 3x = 3,582 effective examples)
**Contrastive learning approach**: High-quality argument pairs where one is valid and one contains a fallacy. The pairs differ only in reasoning quality, teaching the model to distinguish subtle boundaries.
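To illustrate the idea, here is a hypothetical contrastive pair and how its two sides would be labeled (field names are illustrative; see the linked dataset for the actual schema):

```python
# Illustrative contrastive pair: similar surface wording, but the two
# texts differ only in reasoning quality (valid authority vs. celebrity appeal).
pair = {
    "valid": "Structural engineers agree the bridge is safe based on load-test data.",
    "fallacy": "A famous pop star says the bridge is safe, so it must be.",
}

# During training, each side of the pair is labeled independently:
examples = [
    {"text": pair["valid"], "label": 0},    # LABEL_0 = valid reasoning
    {"text": pair["fallacy"], "label": 1},  # LABEL_1 = contains fallacy
]
print(len(examples))  # 2
```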
**Test set**: 1,130 examples (918 original + 212 contrastive pairs oversampled 2x)
---
## Performance
### Validation Metrics (1,130 examples)
| Metric | Score |
|--------|-------|
| **F1 Score** | 90.8% |
| **Accuracy** | 91.1% |
| **Precision** | 92.1% |
| **Recall** | 89.6% |
| **Specificity** | 92.5% |
**Error Analysis:**
- False Positive Rate: 7.5% (valid arguments incorrectly flagged)
- False Negative Rate: 10.4% (fallacies missed)
**Confusion Matrix:**
- True Negatives: 529 βœ“ (Valid β†’ Valid)
- False Positives: 43 βœ— (Valid β†’ Fallacy)
- False Negatives: 58 βœ— (Fallacy β†’ Valid)
- True Positives: 500 βœ“ (Fallacy β†’ Fallacy)
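The headline metrics above can be re-derived directly from the confusion matrix:

```python
# Confusion matrix counts from the table above.
tn, fp, fn, tp = 529, 43, 58, 500

accuracy    = (tp + tn) / (tp + tn + fp + fn)
precision   = tp / (tp + fp)
recall      = tp / (tp + fn)          # sensitivity
specificity = tn / (tn + fp)
f1          = 2 * precision * recall / (precision + recall)

print(f"Accuracy:    {accuracy:.1%}")     # 91.1%
print(f"Precision:   {precision:.1%}")    # 92.1%
print(f"Recall:      {recall:.1%}")       # 89.6%
print(f"Specificity: {specificity:.1%}")  # 92.5%
print(f"F1:          {f1:.1%}")           # 90.8%
```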
### Real-World Testing (55 diverse manual cases)
**Accuracy: ~96%** (53/55 correct)
**Perfect performance on:**
- Formal syllogisms and deductive logic
- Mathematical/arithmetic statements
- Scientific principles (conservation of mass, photosynthesis, aerodynamics)
- Legal reasoning (contract terms, building codes, citizenship)
- Policy arguments with evidence
**Correctly identifies edge cases:**
- βœ… Color-blind witness (relevant) vs. shoplifted-as-kid witness (irrelevant)
- βœ… Structural engineers on bridges (valid authority) vs. physicist on supplements (opinion)
- βœ… Supply-demand economics (valid principle) vs. Mozart improving machines (false cause)
- βœ… Large sample generalization vs. anecdotal evidence
**Known errors (2/55):**
- ❌ "I like pizza" β†’ Flagged as fallacy (not an argument)
- ❌ "Natural essences promote healing" β†’ Classified as valid (circular reasoning)
---
## Usage
```python
from transformers import pipeline
# Load model
classifier = pipeline(
    "text-classification",
    model="Navy0067/Fallacy-detector-binary",
)
# Example 1: Valid reasoning (formal logic)
text1 = "All mammals have backbones. Whales are mammals. Therefore whales have backbones."
result = classifier(text1)
# Output: {'label': 'LABEL_0', 'score': 1.00} # LABEL_0 = Valid
# Example 2: Fallacy (ad hominem)
text2 = "His economic proposal is wrong because he didn't graduate from college."
result = classifier(text2)
# Output: {'label': 'LABEL_1', 'score': 1.00} # LABEL_1 = Fallacy
# Example 3: Fallacy (slippery slope)
text3 = "If we allow one streetlamp, they'll install them every five feet and destroy our view of the stars."
result = classifier(text3)
# Output: {'label': 'LABEL_1', 'score': 1.00}
# Example 4: Valid (evidence-based)
text4 = "The data shows 95% of patients following physical therapy regained mobility, thus the regimen increases recovery chances."
result = classifier(text4)
# Output: {'label': 'LABEL_0', 'score': 1.00}
# Example 5: Edge case - Relevant credential attack (Valid)
text5 = "The witness's color testimony should be questioned because he was diagnosed with total color blindness."
result = classifier(text5)
# Output: {'label': 'LABEL_0', 'score': 1.00}
# Example 6: Edge case - Irrelevant credential attack (Fallacy)
text6 = "The witness's testimony should be questioned because he shoplifted a candy bar at age twelve."
result = classifier(text6)
# Output: {'label': 'LABEL_1', 'score': 1.00}
```
---
## Label Mapping
- LABEL_0 = Valid reasoning (no fallacy detected)
- LABEL_1 = Contains fallacy
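The label mapping above can be applied to raw pipeline output with a small helper (the mapping follows this card; the helper name is illustrative):

```python
# Convert the pipeline's raw labels into readable names,
# using the LABEL_0/LABEL_1 mapping documented above.
LABELS = {"LABEL_0": "valid", "LABEL_1": "fallacy"}

def readable(prediction: dict) -> str:
    """Format a prediction like {'label': 'LABEL_1', 'score': 0.99}."""
    return f"{LABELS[prediction['label']]} ({prediction['score']:.0%} confidence)"

print(readable({"label": "LABEL_1", "score": 0.99}))  # fallacy (99% confidence)
```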
## Training Details
**Base model:** microsoft/deberta-v3-large

**Training configuration:**
- Epochs: 6
- Batch size: 4 (effective 16 with gradient accumulation)
- Learning rate: 1e-5
- Optimizer: AdamW with weight decay 0.01
- Scheduler: cosine with 10% warmup
- Max sequence length: 256 tokens
- FP16 training enabled
- Hardware: Kaggle P100 GPU (~82 minutes training time)

**Data strategy:**
- Original LOGIC/CoCoLoFa data (81.7% of training set)
- Contrastive pairs oversampled 3x (emphasizes boundary learning)
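The hyperparameters listed above map naturally onto `transformers.TrainingArguments`. A sketch under those settings (the `output_dir` is an assumption, not from this card; the 256-token max length is applied at tokenization time, not here):

```python
from transformers import TrainingArguments

# Sketch of the training configuration above; output_dir is assumed.
args = TrainingArguments(
    output_dir="fallacy-detector-binary",  # assumption, not from the card
    num_train_epochs=6,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,   # 4 x 4 = effective batch size 16
    learning_rate=1e-5,
    weight_decay=0.01,               # AdamW is the default optimizer
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,                # 10% warmup
    fp16=True,
)
```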
## Dataset
The contrastive training pairs used for fine-tuning this model are available at:
[Navy0067/contrastive-pairs-for-logical-fallacy](https://huggingface.co/datasets/Navy0067/contrastive-pairs-for-logical-fallacy)
## Contact
**Author:** Navyansh Singh
**Hugging Face:** @Navy0067
**Email:** Navyansh24102@iiitnr.edu.in
## Citation
If you use this model in your research, please cite it as:
```bibtex
@misc{singh2026fallacy,
  author    = {Navyansh Singh},
  title     = {Logical Fallacy Detector: Binary Classification with Contrastive Learning},
  year      = {2026},
  publisher = {Hugging Face},
  journal   = {Hugging Face Model Hub},
  url       = {https://huggingface.co/Navy0067/Fallacy-detector-binary}
}
```