---
license: apache-2.0
base_model: google/electra-base-discriminator
tags:
  - text-classification
  - fallacy-detection
  - logical-fallacy
language:
  - en
datasets:
  - logical-fallacy-detection
metrics:
  - accuracy
  - f1
pipeline_tag: text-classification
---

# ELECTRA-base fine-tuned for logical fallacy classification

A 13-way classifier fine-tuned on the LOGIC dataset from Jin et al. (2022), *Logical Fallacy Detection* ([arXiv:2202.13758](https://arxiv.org/abs/2202.13758)).

Base model: [`google/electra-base-discriminator`](https://huggingface.co/google/electra-base-discriminator).

## Labels

```
ad hominem, ad populum, appeal to emotion, circular reasoning, equivocation,
fallacy of credibility, fallacy of extension, fallacy of logic,
fallacy of relevance, false causality, false dilemma, faulty generalization,
intentional
```

## Training

- **Data:** LOGIC train split, 1849 examples / 13 classes (zhijin/zhijingjin splits; dev 300, test 300).
- **Hyperparams:** 3 epochs, lr 5e-5, weight decay 0.01, warmup ratio 0.1, batch size 8, max_len 128, seed 42, best checkpoint selected by validation macro-F1.
- **Hardware:** CPU (4 threads), ~23 min wall-clock time.
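
To make the hyperparameters above concrete, the sketch below works out how many optimizer steps they imply and what the learning rate looks like over training. It is illustrative only: the card does not name the scheduler, so linear warmup followed by linear decay is an assumption, as are a single device, no gradient accumulation, and keeping the last partial batch. `lr_at` is a hypothetical helper, not part of the training code.

```python
import math

# Values taken from the Training section of this card.
NUM_TRAIN_EXAMPLES = 1849
BATCH_SIZE = 8
NUM_EPOCHS = 3
PEAK_LR = 5e-5
WARMUP_RATIO = 0.1

# Assumption: last partial batch is kept, no gradient accumulation, one device.
steps_per_epoch = math.ceil(NUM_TRAIN_EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step.

    Assumed schedule (not confirmed by the card): linear warmup from 0 to
    PEAK_LR over warmup_steps, then linear decay back to 0 at total_steps.
    """
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / (total_steps - warmup_steps)


print(steps_per_epoch, total_steps, warmup_steps)  # 232 696 70
```

Under these assumptions the whole run is fewer than 700 updates, which is consistent with training finishing on CPU in tens of minutes.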

## Evaluation

### In-domain (LOGIC test, n=300)

| metric | value |
|---|---|
| accuracy | 0.643 |
| macro-F1 | 0.552 |
| weighted-F1 | 0.625 |

The macro-F1 of 0.552 is comparable to the paper's plain-ELECTRA baseline (~0.533 F1 in Table 3).
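
For readers unfamiliar with the three metrics in the table, here is a minimal, dependency-free sketch of how accuracy, macro-F1 and weighted-F1 differ. The function name `f1_report` and the toy labels are made up for illustration; this is not the evaluation script used for this card. Macro-F1 averages per-class F1 equally, so a class with F1 of 0 pulls it down more than it pulls down the support-weighted average.

```python
from collections import Counter


def f1_report(y_true, y_pred):
    """Return accuracy, macro-F1 and weighted-F1 for single-label predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of gold examples per class
    per_class = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        per_class[label] = 2 * tp / denom if denom else 0.0
    n = len(y_true)
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / n,
        "macro_f1": sum(per_class.values()) / len(labels),
        "weighted_f1": sum(per_class[l] * support[l] for l in labels) / n,
    }


# Toy data: the rare class "equivocation" is never predicted.
y_true = ["ad hominem", "ad hominem", "ad populum", "equivocation"]
y_pred = ["ad hominem", "ad populum", "ad populum", "ad populum"]
report = f1_report(y_true, y_pred)
print({k: round(v, 3) for k, v in report.items()})
# {'accuracy': 0.5, 'macro_f1': 0.389, 'weighted_f1': 0.458}
```

In the toy data the never-predicted class drags macro-F1 below weighted-F1, the same ordering seen in the table above.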

### Zero-shot transfer (LOGICCLIMATE, n=1312)

| metric | value |
|---|---|
| accuracy | 0.210 |
| macro-F1 | 0.183 |

Sharp drop on out-of-domain transfer, in line with the paper's Table 4 findings (their best model drops from 0.588 to 0.272 F1).

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

repo = "heavyhelium/electra-base-logic-fallacy"
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForSequenceClassification.from_pretrained(repo).eval()

text = "Everyone I know drives a Toyota, so Toyotas must be the best cars."
enc = tok(text, return_tensors="pt", truncation=True, max_length=128)
with torch.no_grad():
    pred_id = model(**enc).logits.argmax(-1).item()
print(model.config.id2label[pred_id])  # -> ad populum
```

## Limitations

- **Poor cross-domain generalization.** Macro-F1 drops by ~0.37 (0.552 to 0.183) from educational text (LOGIC) to climate-change news (LOGICCLIMATE). Do not trust predictions far from the training domain.
- **Class imbalance bias.** Rare classes (e.g. `equivocation`, ~2% of training data) are under-predicted; `equivocation` test F1 is 0.00 in both the in-domain and transfer settings.
- **Short-text bias.** Training examples are mostly 1-2 sentence educational quiz items (median ~100 characters). Performance may degrade on longer or structurally different text.
- **Single-label.** Each input is forced into exactly one of 13 classes; real-world text often exhibits multiple fallacies or none.
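
One rough way to soften the single-label limitation is to abstain when the top softmax probability is low, rather than always returning one of the 13 classes. The sketch below is a heuristic under stated assumptions, not a feature of this model: `predict_or_abstain` is a hypothetical helper, the 0.5 threshold is an arbitrary placeholder that would need tuning on held-out data, and softmax confidence is often poorly calibrated out of domain. The logits and the three-label map are toy values; in practice they would come from `model(**enc).logits[0].tolist()` and `model.config.id2label`.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_or_abstain(logits, id2label, threshold=0.5):
    """Return (label, confidence); label is None when confidence < threshold.

    The threshold is a hypothetical default, not a tuned value.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return None, probs[best]
    return id2label[best], probs[best]


# Toy 3-class example (the real model has 13 classes).
toy_id2label = {0: "ad hominem", 1: "ad populum", 2: "equivocation"}

label, conf = predict_or_abstain([0.1, 3.0, 0.2], toy_id2label)
print(label)  # ad populum

label_flat, conf_flat = predict_or_abstain([0.5, 0.6, 0.4], toy_id2label)
print(label_flat)  # None
```

Abstaining does not turn the model into a "no fallacy" detector; it only flags inputs where the classifier itself is undecided.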

## Citation

```bibtex
@inproceedings{jin-etal-2022-logical,
    title = "Logical Fallacy Detection",
    author = "Jin, Zhijing and Lalwani, Abhinav and Vaidhya, Tejas and Shen, Xiaoyu and Ding, Yiwen and Lyu, Zhiheng and Sachan, Mrinmaya and Mihalcea, Rada and Sch{\"o}lkopf, Bernhard",
    booktitle = "Findings of EMNLP 2022",
    year = "2022",
    url = "https://arxiv.org/abs/2202.13758",
}
```