---
language: en
library_name: transformers
pipeline_tag: text-classification
tags:
  - text-classification
  - sequence-classification
  - roberta
  - distilroberta
  - climate-change
  - logical-fallacy-detection
  - nlp
license: apache-2.0
model-index:
  - name: climate-fallacy-roberta
    results:
      - task:
          type: text-classification
          name: Climate logical fallacy classification
        dataset:
          name: Climate subset of Tariq60/fallacy-detection
          type: custom
          split: test
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.24
          - name: Macro F1
            type: f1
            value: 0.20
          - name: Weighted F1
            type: f1
            value: 0.24
---

# Climate Logical Fallacy Classifier (DistilRoBERTa)

This is a **DistilRoBERTa**-based text classifier fine-tuned to detect **logical fallacies in climate-related text**. It predicts one of 11 labels (10 fallacy types plus `NO_FALLACY`) for a given sentence or short paragraph.

The model was trained as part of an academic NLP project on _“Automated Detection of Logical Fallacies in Climate Change Social Media Posts using Small Language Models (SLMs)”_.

## Model Details

- **Base model**: `distilroberta-base`
- **Architecture**: DistilRoBERTa (Transformer encoder, 6 layers)
- **Task**: Multi-class text classification
- **Number of classes**: 11
- **Language**: English
- **Framework**: Hugging Face Transformers

### Label Set

The model is trained to predict the following labels:

1. `CHERRY_PICKING`
2. `EVADING_THE_BURDEN_OF_PROOF`
3. `FALSE_ANALOGY`
4. `FALSE_AUTHORITY`
5. `FALSE_CAUSE`
6. `HASTY_GENERALISATION`
7. `NO_FALLACY`
8. `POST_HOC`
9. `RED_HERRINGS`
10. `STRAWMAN`
11. `VAGUENESS`

`id2label` / `label2id` mappings are stored in the model config and are consistent with the training code.

## Training Data

The model was fine-tuned on the **climate subset** of the open-source dataset from:

> Tariq60 – *fallacy-detection* repository  
> https://github.com/Tariq60/fallacy-detection

Only the **climate** portion of the dataset was used, with the standard split:

- `train/` – training examples
- `dev/` – validation examples
- `test/` – held-out evaluation set

Each example includes:

- The climate-related text segment
- A manually assigned fallacy label (or `No fallacy`)

### Preprocessing

- Texts were lower-cased and cleaned using a light `basic_clean` function:
  - Stripping extra whitespace
  - Normalising some punctuation
- Some classes were **minority labels** (few examples), so basic **class balancing** was applied via up-sampling in the training set.
- NaN or empty texts were dropped before training.
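The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual code: the body of `basic_clean`, the `upsample` helper, the punctuation mapping, and the toy examples are all assumptions made for this sketch.

```python
import random
import re
from collections import Counter, defaultdict


def basic_clean(text):
    """Lower-case, normalise curly quotes, and collapse whitespace (assumed behaviour)."""
    text = text.lower()
    # Map a few typographic quote characters to plain ASCII equivalents.
    for src, dst in (("\u201c", '"'), ("\u201d", '"'), ("\u2018", "'"), ("\u2019", "'")):
        text = text.replace(src, dst)
    # Collapse runs of whitespace (including newlines) into single spaces.
    return re.sub(r"\s+", " ", text).strip()


def upsample(examples, seed=42):
    """Resample minority classes with replacement up to the majority class size."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))
    target = max(len(rows) for rows in by_label.values())
    balanced = []
    for rows in by_label.values():
        balanced.extend(rows)
        balanced.extend(rng.choices(rows, k=target - len(rows)))
    rng.shuffle(balanced)
    return balanced


# Toy (text, label) pairs, invented for illustration only.
raw = [
    ("  It SNOWED today,\nso warming is a hoax. ", "HASTY_GENERALISATION"),
    ("CO2 is plant food.", "RED_HERRINGS"),
    ("Scientists were wrong before.", "RED_HERRINGS"),
    ("", "NO_FALLACY"),
    (None, "NO_FALLACY"),
]
# Drop missing or empty texts first, then clean and balance.
cleaned = [(basic_clean(t), y) for t, y in raw if isinstance(t, str) and t.strip()]
balanced = upsample(cleaned)
print(Counter(label for _, label in balanced))
```

Up-sampling should only ever be applied to the training portion; duplicating examples in the validation or test data would inflate the reported scores.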

## Training Procedure

- **Base model**: `distilroberta-base`
- **Optimizer**: AdamW (via `Trainer`)
- **Learning rate**: 2e-5
- **Batch size**: 16
- **Max sequence length**: 128–256 tokens
- **Epochs**: 10
- **Weight decay**: 0.01
- **Loss function**: Cross-entropy, optionally with class weights to mitigate class imbalance
- **Validation split**: 80/20 stratified split of the training data
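One common way to derive the optional class weights mentioned above is the inverse-frequency ("balanced") heuristic. The sketch below is an assumption about how such weights could be computed, not a record of what the training run did; the helper name and the toy label ids are hypothetical. In a PyTorch setup, the resulting list could be converted to a tensor and passed as the `weight` argument of `torch.nn.CrossEntropyLoss`.

```python
from collections import Counter


def inverse_frequency_weights(labels, num_classes):
    """Return one weight per class id: n_samples / (num_classes * class_count)."""
    counts = Counter(labels)
    n = len(labels)
    # Classes absent from the data get weight 0.0 so they do not affect the loss.
    return [n / (num_classes * counts[c]) if counts[c] else 0.0 for c in range(num_classes)]


# Hypothetical label ids for a tiny 3-class training set.
train_label_ids = [0, 0, 0, 0, 1, 1, 2, 2]
weights = inverse_frequency_weights(train_label_ids, num_classes=3)
print(weights)  # the majority class 0 receives the smallest weight
```

Class weights and up-sampling both counteract imbalance, so applying both at full strength may over-correct; treating them as alternatives is a reasonable starting point.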

### Implementation

Training used the following classes from the Transformers library:

- `AutoTokenizer`
- `AutoModelForSequenceClassification`
- `TrainingArguments`
- `Trainer`

## Evaluation

Evaluation was done on the **held-out climate test set** from the dataset.

**Metrics (multi-class):**

- **Accuracy** ≈ 0.24  
- **Macro F1** ≈ 0.20  
- **Weighted F1** ≈ 0.24  

These values are **baseline experimental results** on a relatively small and imbalanced dataset. They should be interpreted as *preliminary research numbers*, not as production-ready performance.

Different random seeds, data balancing strategies, or more aggressive hyperparameter tuning can change these numbers.
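To make the difference between the two F1 variants concrete, here is a small pure-Python sketch. Macro F1 averages per-class F1 with equal weight per class, while weighted F1 weights each class by its number of true examples, so rare classes pull macro F1 down more. The function names and the toy labels are made up for illustration; the reported numbers were not produced with this code.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} over every label seen in y_true or y_pred."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_and_weighted_f1(y_true, y_pred):
    """Return (macro F1, support-weighted F1)."""
    f1 = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    macro = sum(f1.values()) / len(f1)
    weighted = sum(f1[c] * support[c] for c in f1) / len(y_true)
    return macro, weighted


# Toy predictions: the single POST_HOC example is missed entirely.
y_true = ["NO_FALLACY", "NO_FALLACY", "NO_FALLACY", "STRAWMAN", "POST_HOC"]
y_pred = ["NO_FALLACY", "NO_FALLACY", "STRAWMAN", "STRAWMAN", "NO_FALLACY"]
macro, weighted = macro_and_weighted_f1(y_true, y_pred)
print(f"macro F1 = {macro:.3f}, weighted F1 = {weighted:.3f}")
```

In this toy case the missed minority class lowers macro F1 more than weighted F1, which is the same pattern as the gap between the two scores reported above.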

## Intended Use

### Primary Use

- Research and experimentation on:
  - Automated detection of logical fallacies in climate discourse
  - Comparing traditional baselines (TF-IDF + SVM) vs. Transformer-based models
  - Building educational tools that flag potential fallacies in climate arguments

### Suitable Scenarios

- Analyzing **short climate-related social media posts**  
- Demonstration / teaching examples on:
  - Argumentation quality
  - Climate misinformation
  - Explainable NLP (combined with a small language model explainer, e.g. FLAN-T5)

## Limitations & Ethical Considerations

### Limitations

- **Small dataset**: Training data is limited in size, especially for rarer fallacy types.
- **Class imbalance**: Some fallacies occur far less frequently, which affects per-class F1 scores.
- **Modest performance**: Overall accuracy and macro F1 are relatively low. The model should be treated as an exploratory research artifact, not a production system.
- **Domain specificity**: The model is trained only on **climate** discourse; performance on other topics (e.g. politics, health) is unknown and likely poor.

### Ethical Considerations

- Predictions are **probabilistic**, not definitive judgments of truth or deception.
- The model can be **wrong or over-confident**, especially on borderline or nuanced arguments.
- It should **not** be used for automated moderation, censorship, or any high-stakes decision-making without strong human oversight.
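One way to keep a human in the loop is to route uncertain predictions to a reviewer instead of acting on them. The sketch below assumes per-label scores shaped like the list of `{"label", "score"}` dicts that a text-classification pipeline can return for one input. The function name, both thresholds, and the example scores are hypothetical and would need tuning on validation data.

```python
def needs_human_review(scores, min_top=0.5, min_margin=0.15):
    """scores: list of {"label": str, "score": float} dicts for one input."""
    ranked = sorted(scores, key=lambda s: s["score"], reverse=True)
    top = ranked[0]
    runner_up = ranked[1]["score"] if len(ranked) > 1 else 0.0
    # Flag the prediction when the model is unsure or two labels are close.
    uncertain = top["score"] < min_top or (top["score"] - runner_up) < min_margin
    return top["label"], uncertain


# Invented scores for a borderline input.
example_scores = [
    {"label": "CHERRY_PICKING", "score": 0.34},
    {"label": "HASTY_GENERALISATION", "score": 0.31},
    {"label": "NO_FALLACY", "score": 0.35},
]
label, review = needs_human_review(example_scores)
print(label, "-> send to a human reviewer" if review else "-> show as a tentative suggestion")
```

Because softmax scores can be over-confident, a high score alone does not guarantee a correct label; the check only filters the most obviously uncertain cases.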

## Integration with an Explanatory SLM

In the associated project, this classifier is combined with a small language model (e.g., `google/flan-t5-small`) to generate natural-language explanations of the predicted fallacy label:

- What the fallacy means in simple terms
- Why the input text might be an example

This setup is used in a Streamlit app:

1. Users enter a climate-related argument
2. The model predicts a fallacy label
3. FLAN-T5 generates a short explanation
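The explanation step could look roughly like the sketch below. It is an illustration only: the prompt wording, the `FALLACY_DEFINITIONS` dictionary, and the `build_explanation_prompt` and `explain` helpers are hypothetical and are not taken from the project's Streamlit app. Only the prompt builder runs at import time; `explain` loads `google/flan-t5-small` lazily and requires `transformers` and `torch`.

```python
# Hypothetical one-line definitions; extend to all 11 labels as needed.
FALLACY_DEFINITIONS = {
    "CHERRY_PICKING": "selecting only the evidence that supports a claim",
    "FALSE_CAUSE": "assuming one thing causes another without adequate evidence",
    "STRAWMAN": "misrepresenting an opposing position to make it easier to attack",
}


def build_explanation_prompt(text, label):
    """Compose an instruction-style prompt for a seq2seq explainer."""
    definition = FALLACY_DEFINITIONS.get(label, "a flaw in reasoning")
    readable = label.replace("_", " ").lower()
    return (
        f"The following climate argument was classified as '{readable}' "
        f"({definition}). Explain in two simple sentences why it might be "
        f"an example.\n\nArgument: {text}"
    )


def explain(text, label, model_id="google/flan-t5-small"):
    """Generate an explanation; downloads the model on first use."""
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    prompt = build_explanation_prompt(text, label)
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=64)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


prompt = build_explanation_prompt(
    "Temperatures dropped this winter, so global warming must be a hoax.",
    "CHERRY_PICKING",
)
print(prompt)
```

Supplying the definition inside the prompt gives a small model something concrete to paraphrase, which may help given how little world knowledge a model of this size can be expected to carry.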

## Citation

If you use this model in academic work, you can cite it as:

Kyeremeh, F. (2025). Climate Logical Fallacy Classifier (DistilRoBERTa). Hugging Face.
Model: SteadyHands/climate-fallacy-roberta.

And also consider citing the original dataset author(s):

Tariq60. fallacy-detection GitHub repository.
https://github.com/Tariq60/fallacy-detection

## Acknowledgements

- **Base model**: `distilroberta-base` by Hugging Face
- **Dataset**: Climate subset from Tariq60’s fallacy-detection repository
- **Libraries**: Transformers, Datasets, scikit-learn
- **Project context**: Master's-level NLP / Data Science coursework on Small Language Models and explainable NLP.

## How to Use

### Python Example (Logits → Label)

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "SteadyHands/climate-fallacy-roberta"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

text = "Climate has always changed in the past, so current warming can't be caused by humans."

inputs = tokenizer(
    text,
    return_tensors="pt",
    truncation=True,
    padding="max_length",
    max_length=256,
)

with torch.no_grad():
    outputs = model(**inputs)

logits = outputs.logits
probs = torch.softmax(logits, dim=-1)[0].tolist()
pred_id = int(torch.argmax(logits, dim=-1).item())

# `config.id2label` is a dict keyed by integer class id, so index it with the int directly.
pred_label = model.config.id2label[pred_id]

print("Text:", text)
print("Predicted label:", pred_label)
print("Probabilities:", probs)
```

### Using the Transformers Pipeline

```python
from transformers import pipeline

clf = pipeline(
    "text-classification",
    model="SteadyHands/climate-fallacy-roberta",
    top_k=None,  # return scores for all labels; use top_k=3 for only the three highest
)

text = "Temperatures dropped this winter, so global warming must be a hoax."
outputs = clf(text)

print(outputs)
```