---
language: en
license: apache-2.0
base_model: microsoft/deberta-v3-large
tags:
- logical-fallacy-detection
- deberta-v3-large
- text-classification
- argumentation
- contrastive-learning
- adversarial-training
- robust-classification
datasets:
- logic
- cocoLoFa
- Navy0067/contrastive-pairs-for-logical-fallacy
metrics:
- f1
- accuracy
model-index:
- name: fallacy-detector-binary
  results:
  - task:
      type: text-classification
      name: Logical Fallacy Detection
    metrics:
    - type: f1
      value: 0.908
      name: F1 Score
    - type: accuracy
      value: 0.911
      name: Accuracy
---
# Logical Fallacy Detector (Binary)

A binary classifier distinguishing **valid reasoning** from **fallacious arguments**, trained with contrastive adversarial examples to handle subtle boundary cases.

**Key Innovation:** Contrastive learning with 703 adversarial argument pairs in which similar wording masks critical reasoning differences.

**96% accuracy on diverse real-world test cases** | **Handles edge cases** | **91% F1**

---
|
|
## ✨ Capabilities

### Detects Common Fallacies
- ✅ **Ad Hominem** (attacking the person, not the argument)
- ✅ **Slippery Slope** (exaggerated chain reactions)
- ✅ **False Dilemma** (only two options presented)
- ✅ **Appeal to Authority** (irrelevant credentials)
- ✅ **Hasty Generalization** (insufficient evidence)
- ✅ **Post Hoc Ergo Propter Hoc** (correlation ≠ causation)
- ✅ **Circular Reasoning** (begging the question)
- ✅ **Straw Man** arguments

### Validates Logical Reasoning
- ✅ **Formal syllogisms** ("All A are B, X is A, therefore X is B")
- ✅ **Mathematical proofs** (deductive reasoning, arithmetic)
- ✅ **Scientific explanations** (gravity, photosynthesis, chemistry)
- ✅ **Legal arguments** (precedent, policy application)
- ✅ **Conditional statements** (if-then logic)

### Edge Case Handling
- ✅ **Distinguishes relevant vs. irrelevant credential attacks**
  - Valid: "Color-blind witness can't testify about color"
  - Fallacy: "Witness shoplifted as a kid, so can't testify about color"
- ✅ **True dichotomies vs. false dilemmas**
  - Valid: "The alarm is either armed or disarmed"
  - Fallacy: "Either ban all cars or accept pollution forever"
- ✅ **Valid authority citations vs. fallacious appeals**
  - Valid: "Structural engineers agree based on data"
  - Fallacy: "Pop star wore these shoes, so they're best"
- ✅ **Causal relationships vs. correlation**
  - Valid: "Recalibrating machines increased output"
  - Fallacy: "Playing Mozart increased output"

### Limitations
- ⚠️ **Very short statements** (<10 words) may be misclassified as fallacies
  - Example: "I like pizza" incorrectly flagged (not an argument)
- ⚠️ **Circular reasoning** occasionally missed (e.g., "healing essences promote healing")
- ⚠️ **Context-dependent arguments** may need human review
- ⚠️ **Domain-specific jargon** may affect accuracy
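Given the short-statement limitation, one option is to screen out non-arguments before they reach the classifier. The sketch below is not part of the released model; the 10-word threshold mirrors the limitation noted above and is purely illustrative.

```python
def should_classify(text: str, min_words: int = 10) -> bool:
    """Return True only if the text is long enough to be treated as an argument.

    Very short statements (e.g. "I like pizza") are not arguments and may be
    misclassified as fallacies, so the caller can skip them or route them to
    human review. The threshold is an illustrative assumption.
    """
    return len(text.split()) >= min_words

print(should_classify("I like pizza"))  # False: too short to be an argument
```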
|
|
---

## Model Description

Fine-tuned **DeBERTa-v3-large** for binary classification using contrastive learning.

### Training Data

**Total training examples**: 6,529
- 5,335 examples from the LOGIC and CoCoLoFa datasets
- 1,194 contrastive pairs (oversampled 3x = 3,582 effective examples)

**Contrastive learning approach**: High-quality argument pairs in which one member is valid and the other contains a fallacy. The pairs differ only in reasoning quality, teaching the model to distinguish subtle boundaries.
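A minimal sketch of this setup (the field names are illustrative assumptions, not the dataset's actual schema): each pair holds a valid and a fallacious text with near-identical wording, and contrastive items are repeated to reach the 1,194 × 3 = 3,582 effective count.

```python
# One contrastive pair: near-identical wording, opposite labels.
# Field names are illustrative, not the dataset's actual schema.
pair = {
    "valid":   {"text": "Engineers found cracks in two beams, so the bridge was closed.", "label": 0},
    "fallacy": {"text": "A black cat crossed the bridge before it closed, so the cat caused the closure.", "label": 1},
}

def oversample(examples: list, factor: int = 3) -> list:
    """Repeat contrastive examples so boundary cases are emphasized in training."""
    return examples * factor

# 1,194 contrastive items x 3 = 3,582 effective, matching the counts above.
print(len(oversample(list(range(1194)))))  # 3582
```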
|
|
**Test set**: 1,130 examples (918 original + 212 contrastive-pair examples oversampled 2x)

---
|
|
## Performance

### Validation Metrics (1,130 examples)

| Metric | Score |
|--------|-------|
| **F1 Score** | 90.8% |
| **Accuracy** | 91.1% |
| **Precision** | 92.1% |
| **Recall** | 89.6% |
| **Specificity** | 92.5% |

**Error Analysis:**
- False Positive Rate: 7.5% (valid arguments incorrectly flagged)
- False Negative Rate: 10.4% (fallacies missed)

**Confusion Matrix:**
- True Negatives: 529 (Valid → Valid)
- False Positives: 43 (Valid → Fallacy)
- False Negatives: 58 (Fallacy → Valid)
- True Positives: 500 (Fallacy → Fallacy)
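The reported scores follow directly from this confusion matrix; a quick consistency check:

```python
# Recompute the validation metrics from the confusion matrix above.
tn, fp, fn, tp = 529, 43, 58, 500

accuracy    = (tp + tn) / (tp + tn + fp + fn)
precision   = tp / (tp + fp)
recall      = tp / (tp + fn)
f1          = 2 * precision * recall / (precision + recall)
specificity = tn / (tn + fp)

# Matches the table: 0.911, 0.921, 0.896, 0.908, 0.925
print(round(accuracy, 3), round(precision, 3), round(recall, 3),
      round(f1, 3), round(specificity, 3))
```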
|
|
### Real-World Testing (55 diverse manual cases)

**Accuracy: ~96%** (53/55 correct)

**Perfect performance on:**
- Formal syllogisms and deductive logic
- Mathematical/arithmetic statements
- Scientific principles (conservation of mass, photosynthesis, aerodynamics)
- Legal reasoning (contract terms, building codes, citizenship)
- Policy arguments with evidence

**Correctly identifies edge cases:**
- ✅ Color-blind witness (relevant) vs. shoplifted-as-kid witness (irrelevant)
- ✅ Structural engineers on bridges (valid authority) vs. physicist on supplements (opinion)
- ✅ Supply-demand economics (valid principle) vs. Mozart improving machines (false cause)
- ✅ Large-sample generalization vs. anecdotal evidence

**Known errors (2/55):**
- ❌ "I like pizza" → Flagged as fallacy (not an argument)
- ❌ "Natural essences promote healing" → Classified as valid (missed circular reasoning)

---
|
|
## Usage

```python
from transformers import pipeline

# Load model
classifier = pipeline(
    "text-classification",
    model="Navy0067/Fallacy-detector-binary"
)

# Example 1: Valid reasoning (formal logic)
text1 = "All mammals have backbones. Whales are mammals. Therefore whales have backbones."
result = classifier(text1)
# Output: [{'label': 'LABEL_0', 'score': 1.00}]  # LABEL_0 = Valid

# Example 2: Fallacy (ad hominem)
text2 = "His economic proposal is wrong because he didn't graduate from college."
result = classifier(text2)
# Output: [{'label': 'LABEL_1', 'score': 1.00}]  # LABEL_1 = Fallacy

# Example 3: Fallacy (slippery slope)
text3 = "If we allow one streetlamp, they'll install them every five feet and destroy our view of the stars."
result = classifier(text3)
# Output: [{'label': 'LABEL_1', 'score': 1.00}]

# Example 4: Valid (evidence-based)
text4 = "The data shows 95% of patients following physical therapy regained mobility, thus the regimen increases recovery chances."
result = classifier(text4)
# Output: [{'label': 'LABEL_0', 'score': 1.00}]

# Example 5: Edge case - relevant credential attack (Valid)
text5 = "The witness's color testimony should be questioned because he was diagnosed with total color blindness."
result = classifier(text5)
# Output: [{'label': 'LABEL_0', 'score': 1.00}]

# Example 6: Edge case - irrelevant credential attack (Fallacy)
text6 = "The witness's testimony should be questioned because he shoplifted a candy bar at age twelve."
result = classifier(text6)
# Output: [{'label': 'LABEL_1', 'score': 1.00}]
```

---
|
|
## Label Mapping

- `LABEL_0` = Valid reasoning (no fallacy detected)
- `LABEL_1` = Contains fallacy
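If readable names are preferred over the raw ids, a small post-processing helper can rename them (assuming the model keeps the default `LABEL_0`/`LABEL_1` ids shown above):

```python
# Map raw pipeline labels to readable names.
LABELS = {"LABEL_0": "valid", "LABEL_1": "fallacy"}

def readable(prediction: dict) -> tuple:
    """Convert one pipeline prediction dict into a (name, score) tuple."""
    return LABELS[prediction["label"]], prediction["score"]

print(readable({"label": "LABEL_1", "score": 0.99}))  # ('fallacy', 0.99)
```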
|
|
### Training Details

**Base model:** microsoft/deberta-v3-large

**Training configuration:**
- Epochs: 6
- Batch size: 4 (effective 16 with gradient accumulation)
- Learning rate: 1e-5
- Optimizer: AdamW with weight decay 0.01
- Scheduler: cosine with 10% warmup
- Max sequence length: 256 tokens
- FP16 training enabled
- Hardware: Kaggle P100 GPU (~82 minutes training time)

**Data strategy:**
- Original LOGIC/CoCoLoFa data (81.7% of training set)
- Contrastive pairs oversampled 3x (emphasizes boundary learning)
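Collected as keyword arguments in the style of `transformers.TrainingArguments`, the configuration above looks roughly like this. It is a sketch, not the published training script, and the max sequence length is applied at tokenization time rather than as a trainer argument.

```python
# Sketch of the training configuration listed above (assumed argument names
# follow the transformers TrainingArguments convention).
training_config = dict(
    num_train_epochs=6,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,   # 4 x 4 = effective batch size 16
    learning_rate=1e-5,
    weight_decay=0.01,               # AdamW weight decay
    lr_scheduler_type="cosine",
    warmup_ratio=0.10,               # 10% warmup
    fp16=True,
)
max_seq_length = 256                 # applied when tokenizing, not a trainer arg

# Effective batch size = per-device batch * accumulation steps.
print(training_config["per_device_train_batch_size"]
      * training_config["gradient_accumulation_steps"])  # 16
```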
|
|
## Dataset

The contrastive training pairs used for fine-tuning this model are available at:
[Navy0067/contrastive-pairs-for-logical-fallacy](https://huggingface.co/datasets/Navy0067/contrastive-pairs-for-logical-fallacy)
|
|
## Contact

- Author: Navyansh Singh
- Hugging Face: @Navy0067
- Email: Navyansh24102@iiitnr.edu.in

## Citation

If you use this model in your research, please cite it as:

```bibtex
@misc{singh2026fallacy,
  author = {Navyansh Singh},
  title = {Logical Fallacy Detector: Binary Classification with Contrastive Learning},
  year = {2026},
  publisher = {Hugging Face},
  journal = {Hugging Face Model Hub},
  url = {https://huggingface.co/Navy0067/Fallacy-detector-binary}
}
```