---
license: apache-2.0
base_model: google/electra-base-discriminator
tags:
- text-classification
- fallacy-detection
- logical-fallacy
language:
- en
datasets:
- logical-fallacy-detection
metrics:
- accuracy
- f1
pipeline_tag: text-classification
---
# ELECTRA-base fine-tuned for logical fallacy classification
13-way classifier fine-tuned on the LOGIC dataset from Jin et al. 2022, *Logical Fallacy Detection* ([arXiv:2202.13758](https://arxiv.org/abs/2202.13758)).
Base model: [`google/electra-base-discriminator`](https://huggingface.co/google/electra-base-discriminator).
## Labels
```
ad hominem, ad populum, appeal to emotion, circular reasoning, equivocation,
fallacy of credibility, fallacy of extension, fallacy of logic,
fallacy of relevance, false causality, false dilemma, faulty generalization,
intentional
```
## Training
- **Data:** LOGIC train split, 1849 examples / 13 classes (zhijin/zhijingjin splits; dev 300, test 300).
- **Hyperparams:** 3 epochs, lr 5e-5, weight decay 0.01, warmup ratio 0.1, batch size 8, max_len 128, seed 42, best checkpoint selected by dev macro-F1.
- **Hardware:** CPU (4 threads), ~23 min wall-clock time.
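
To make the warmup setting concrete, the sketch below derives the step counts implied by the numbers above and evaluates the learning rate at a few points. It assumes linear warmup followed by linear decay to zero, no gradient accumulation, and a partial final batch that is kept; the card states only the warmup ratio, so the schedule shape is an assumption. `lr_at_step` is a hypothetical helper, not a library function.

```python
import math

# Values taken from the training summary above.
TRAIN_EXAMPLES = 1849
BATCH_SIZE = 8
EPOCHS = 3
PEAK_LR = 5e-5
WARMUP_RATIO = 0.1

# Assumption: the last, partial batch of each epoch is kept.
steps_per_epoch = math.ceil(TRAIN_EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def lr_at_step(step: int) -> float:
    """Learning rate at an optimizer step.

    Assumption: linear warmup to PEAK_LR, then linear decay to zero.
    """
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    remaining = total_steps - step
    return PEAK_LR * max(0.0, remaining / (total_steps - warmup_steps))


if __name__ == "__main__":
    print(steps_per_epoch, total_steps, warmup_steps)
    for step in (0, warmup_steps, total_steps // 2, total_steps):
        print(step, f"{lr_at_step(step):.2e}")
```

Under these assumptions the run has 232 steps per epoch, 696 steps in total, and 70 warmup steps.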
## Evaluation
### In-domain (LOGIC test, n=300)
| metric | value |
|---|---|
| accuracy | 0.643 |
| macro-F1 | 0.552 |
| weighted-F1 | 0.625 |

The macro-F1 here (0.552) is comparable to the paper's plain-ELECTRA baseline (~0.533 F1 in Table 3).
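
The gap between macro-F1 and weighted-F1 is what class imbalance typically produces: macro-F1 averages per-class F1 with equal weight, while weighted-F1 weights each class by its support. The sketch below illustrates this on toy labels, not the LOGIC data. `f1_report` is a hypothetical helper, and it is an assumption that the reported numbers were computed with these standard definitions.

```python
from collections import Counter


def f1_report(y_true, y_pred):
    """Return (accuracy, macro_f1, weighted_f1) for single-label predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of gold examples per class
    pairs = list(zip(y_true, y_pred))
    per_class_f1 = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        per_class_f1[label] = 2 * tp / denom if denom else 0.0
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    macro = sum(per_class_f1.values()) / len(labels)
    weighted = sum(per_class_f1[l] * support[l] for l in labels) / len(pairs)
    return accuracy, macro, weighted


# Toy example: the single rare-class item is misclassified.
toy_true = ["faulty generalization"] * 4 + ["ad hominem"] * 3 + ["equivocation"]
toy_pred = ["faulty generalization"] * 4 + ["ad hominem"] * 3 + ["faulty generalization"]

if __name__ == "__main__":
    acc, macro, weighted = f1_report(toy_true, toy_pred)
    print(f"accuracy={acc:.3f} macro-F1={macro:.3f} weighted-F1={weighted:.3f}")
```

On the toy data, one missed rare-class example pulls macro-F1 down to about 0.630 while weighted-F1 stays at about 0.819.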
### Zero-shot transfer (LOGICCLIMATE, n=1312)
| metric | value |
|---|---|
| accuracy | 0.210 |
| macro-F1 | 0.183 |

Performance drops sharply on out-of-domain transfer, in line with the paper's Table 4 findings (their best model drops from 0.588 to 0.272 F1).
## Usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
repo = "heavyhelium/electra-base-logic-fallacy"
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForSequenceClassification.from_pretrained(repo).eval()
text = "Everyone I know drives a Toyota, so Toyotas must be the best cars."
enc = tok(text, return_tensors="pt", truncation=True, max_length=128)
with torch.no_grad():
    pred_id = model(**enc).logits.argmax(-1).item()
print(model.config.id2label[pred_id]) # -> ad populum
```
## Limitations
- **Poor cross-domain generalization.** Macro-F1 drops by ~0.37 (0.552 to 0.183) from educational text (LOGIC) to climate-change news (LOGICCLIMATE). Do not trust predictions far from the training domain.
- **Data imbalance bias.** Rare classes (e.g. `equivocation`, ~2% of training data) are under-predicted; `equivocation` test F1 is 0.00 in both the in-domain and transfer settings.
- **Short-text bias.** Training examples are mostly 1-2 sentence educational quiz items (median ~100 characters). Accuracy may degrade on longer or structurally different text.
- **Single-label.** Each input is forced into exactly one of 13 classes; real-world text often exhibits multiple fallacies or none.
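
One partial workaround for the forced single-label output is to abstain when the model is not confident. The sketch below applies a softmax to the logits and returns no label when the top probability falls under a threshold. `predict_or_abstain` and the default threshold of 0.5 are hypothetical and untuned, softmax scores from this model are not known to be calibrated, and this approach does not handle inputs with several fallacies.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_or_abstain(logits, id2label, threshold=0.5):
    """Return (label, prob), or (None, prob) if the top probability is below threshold.

    Assumption: `threshold` is an illustrative value; tune it on held-out data.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return None, probs[best]
    return id2label[best], probs[best]


if __name__ == "__main__":
    # Toy logits and a three-label mapping, for illustration only.
    demo_labels = {0: "ad hominem", 1: "ad populum", 2: "false dilemma"}
    print(predict_or_abstain([0.1, 4.0, 0.3], demo_labels))
    print(predict_or_abstain([0.2, 0.1, 0.0], demo_labels))
```

With the model from the usage snippet, the inputs would be `model(**enc).logits[0].tolist()` and `model.config.id2label`.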
## Citation
```bibtex
@inproceedings{jin-etal-2022-logical,
title = "Logical Fallacy Detection",
author = "Jin, Zhijing and Lalwani, Abhinav and Vaidhya, Tejas and Shen, Xiaoyu and Ding, Yiwen and Lyu, Zhiheng and Sachan, Mrinmaya and Mihalcea, Rada and Sch{\"o}lkopf, Bernhard",
booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2022",
year = "2022",
url = "https://arxiv.org/abs/2202.13758",
}
```