---
language: en
library_name: transformers
pipeline_tag: text-classification
tags:
- text-classification
- sequence-classification
- roberta
- distilroberta
- climate-change
- logical-fallacy-detection
- nlp
license: apache-2.0
model-index:
- name: climate-fallacy-roberta
  results:
  - task:
      type: text-classification
      name: Climate logical fallacy classification
    dataset:
      name: Climate subset of Tariq60/fallacy-detection
      type: custom
      split: test
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.24
    - name: Macro F1
      type: f1
      value: 0.20
    - name: Weighted F1
      type: f1
      value: 0.24
---
# Climate Logical Fallacy Classifier (DistilRoBERTa)
This is a **DistilRoBERTa**-based text classifier fine-tuned to detect **logical fallacies in climate-related text**.
It predicts one of 11 logical fallacy labels (including “NO_FALLACY”) for a given sentence or short paragraph.
The model was trained as part of an academic NLP project on _“Automated Detection of Logical Fallacies in Climate Change Social Media Posts using Small Language Models (SLMs)”_.
## Model Details
- **Base model**: `distilroberta-base`
- **Architecture**: DistilRoBERTa (Transformer encoder, 6 layers)
- **Task**: Multi-class text classification
- **Number of classes**: 11
- **Language**: English
- **Framework**: Transformers
### Label Set
The model is trained to predict the following labels:
1. `CHERRY_PICKING`
2. `EVADING_THE_BURDEN_OF_PROOF`
3. `FALSE_ANALOGY`
4. `FALSE_AUTHORITY`
5. `FALSE_CAUSE`
6. `HASTY_GENERALISATION`
7. `NO_FALLACY`
8. `POST_HOC`
9. `RED_HERRINGS`
10. `STRAWMAN`
11. `VAGUENESS`
`id2label` / `label2id` mappings are stored in the model config and are consistent with the training code.
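
As a rough illustration of what such a mapping looks like, the sketch below builds `id2label` and `label2id` from the label list above. The 0-based ids following the listed order are an assumption made for illustration only; the authoritative mapping is whatever `model.config.id2label` contains.

```python
# Labels exactly as listed in this model card.
LABELS = [
    "CHERRY_PICKING",
    "EVADING_THE_BURDEN_OF_PROOF",
    "FALSE_ANALOGY",
    "FALSE_AUTHORITY",
    "FALSE_CAUSE",
    "HASTY_GENERALISATION",
    "NO_FALLACY",
    "POST_HOC",
    "RED_HERRINGS",
    "STRAWMAN",
    "VAGUENESS",
]

# ASSUMPTION: ids are 0-based and follow the listed order.
# Check model.config.id2label before relying on specific ids.
id2label = dict(enumerate(LABELS))
label2id = {label: idx for idx, label in id2label.items()}

print(len(id2label), "labels;", "NO_FALLACY ->", label2id["NO_FALLACY"])
```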
## Training Data
The model was fine-tuned on the **climate subset** of the open-source dataset from:
> Tariq60 – *fallacy-detection* repository
> https://github.com/Tariq60/fallacy-detection
Only the **climate** portion of the dataset was used, with the standard split:
- `train/` – training examples
- `dev/` – validation examples
- `test/` – held-out evaluation set
Each example includes:
- The climate-related text segment
- A manually assigned fallacy label (or `No fallacy`)
### Preprocessing
- Texts were lower-cased and cleaned using a light `basic_clean` function:
  - Stripping extra whitespace
  - Normalising some punctuation
- Some classes were **minority labels** (few examples), so basic **class balancing** was applied via up-sampling in the training set.
- NaN or empty texts were dropped before training.
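
The preprocessing steps above could look roughly like the following. This is a minimal sketch, not the project's actual code: the body of `basic_clean`, the specific punctuation rules, and the helpers `drop_empty` and `upsample` are all assumptions, and up-sampling every class to the size of the largest one is just one possible balancing strategy.

```python
import random
import re
from collections import defaultdict


def basic_clean(text):
    """Hypothetical stand-in for the project's `basic_clean` function."""
    text = text.lower()
    # Normalise curly quotes to straight quotes (assumed punctuation rule).
    text = text.replace("\u201c", '"').replace("\u201d", '"')
    text = text.replace("\u2018", "'").replace("\u2019", "'")
    # Collapse runs of whitespace and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


def drop_empty(rows):
    """Keep only (text, label) rows whose text is a non-blank string."""
    return [(t, l) for t, l in rows if isinstance(t, str) and t.strip()]


def upsample(rows, seed=42):
    """Up-sample each class (with replacement) to the largest class size."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in rows:
        by_label[label].append((text, label))
    target = max(len(items) for items in by_label.values())
    balanced = []
    for items in by_label.values():
        balanced.extend(items)
        balanced.extend(rng.choices(items, k=target - len(items)))
    rng.shuffle(balanced)
    return balanced


rows = [
    ("  Climate  is ALWAYS changing. ", "FALSE_CAUSE"),
    ("It's cold today!", "CHERRY_PICKING"),
    ("Ice melted last year too.", "CHERRY_PICKING"),
    (None, "STRAWMAN"),
    ("   ", "NO_FALLACY"),
]
cleaned = [(basic_clean(t), l) for t, l in drop_empty(rows)]
balanced = upsample(cleaned)
print(balanced)
```

Up-sampling should be applied to the training split only, so that duplicated examples never leak into validation or test data.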
## Training Procedure
- **Base model**: `distilroberta-base`
- **Optimizer**: AdamW (via `Trainer`)
- **Learning rate**: 2e-5
- **Batch size**: 16
- **Max sequence length**: 128–256 tokens
- **Epochs**: 10
- **Weight decay**: 0.01
- **Loss function**: Cross-entropy, optionally with class weights to mitigate class imbalance
- **Validation split**: 80/20 stratified split of the training data
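
The hyperparameters above can be collected into a `TrainingArguments` object, sketched below. Several details are assumptions: the batch size is treated as per-device, the output directory name is hypothetical, and the class-weight formula (`n_samples / (n_classes * class_count)`, the common "balanced" heuristic) is only one way the optional weighting might have been done. The maximum sequence length is reported as a range, so it is left out here.

```python
from collections import Counter

# Values taken from the list above.
# ASSUMPTION: "batch size 16" means 16 examples per device.
HYPERPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "num_train_epochs": 10,
    "weight_decay": 0.01,
}


def build_training_args(output_dir="climate-fallacy-roberta"):
    """Wrap the hyperparameters in a TrainingArguments object.

    The import is done lazily so the rest of this sketch runs
    without transformers installed. `output_dir` is hypothetical.
    """
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


# Toy label distribution: the rarer class receives the larger weight.
weights = balanced_class_weights(["NO_FALLACY"] * 6 + ["STRAWMAN"] * 2)
print(weights)
```

Weights like these would typically be passed, in label-id order, to a weighted cross-entropy loss inside a custom `Trainer` loss computation.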
### Implementation
Training used the following classes from the Transformers library:
- `AutoTokenizer`
- `AutoModelForSequenceClassification`
- `TrainingArguments`
- `Trainer`
## Evaluation
Evaluation was done on the **held-out climate test set** from the dataset.
**Metrics (multi-class):**
- **Accuracy** ≈ 0.24
- **Macro F1** ≈ 0.20
- **Weighted F1** ≈ 0.24
These values are **baseline experimental results** on a relatively small and imbalanced dataset. They should be interpreted as *preliminary research numbers*, not as production-ready performance.
Different random seeds, data balancing strategies, or more aggressive hyperparameter tuning can change these numbers.
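
To make the three reported metrics concrete, the sketch below computes accuracy, macro F1, and weighted F1 in plain Python on a toy example. It is not the project's evaluation code, and the toy labels are invented for illustration. Macro F1 averages per-class F1 with equal weight per class, so poor performance on rare classes pulls it below accuracy, which matches the pattern in the numbers above.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: F1} for every label seen in y_true or y_pred."""
    pairs = list(zip(y_true, y_pred))
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def summarize(y_true, y_pred):
    """Compute accuracy, macro F1, and support-weighted F1."""
    per_class = f1_per_class(y_true, y_pred)
    support = Counter(y_true)
    n = len(y_true)
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / n,
        "macro_f1": sum(per_class.values()) / len(per_class),
        "weighted_f1": sum(f1 * support[l] for l, f1 in per_class.items()) / n,
    }


# Toy, imbalanced example (invented data).
y_true = ["NO_FALLACY", "NO_FALLACY", "NO_FALLACY", "STRAWMAN"]
y_pred = ["NO_FALLACY", "NO_FALLACY", "STRAWMAN", "NO_FALLACY"]
metrics = summarize(y_true, y_pred)
print(metrics)
```

scikit-learn's `accuracy_score` and `f1_score` (with `average="macro"` or `average="weighted"`) should give the same values and are the more usual choice in practice.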
## Intended Use
### Primary Use
- Research and experimentation on:
  - Automated detection of logical fallacies in climate discourse
  - Comparing traditional baselines (TF-IDF + SVM) vs. Transformer-based models
- Building educational tools that flag potential fallacies in climate arguments
### Suitable Scenarios
- Analyzing **short climate-related social media posts**
- Demonstration / teaching examples on:
  - Argumentation quality
  - Climate misinformation
  - Explainable NLP (combined with a small language model explainer, e.g. FLAN-T5)
## Limitations & Ethical Considerations
### Limitations
- **Small dataset**: Training data is limited in size, especially for rarer fallacy types.
- **Class imbalance**: Some fallacies occur far less frequently, which affects per-class F1 scores.
- **Modest performance**: Overall accuracy and macro F1 are relatively low. The model should be treated as an exploratory research artifact, not a production system.
- **Domain specificity**: The model is trained only on **climate** discourse; performance on other topics (e.g. politics, health) is unknown and likely poor.
### Ethical Considerations
- Predictions are **probabilistic**, not definitive judgments of truth or deception.
- The model can be **wrong or over-confident**, especially on borderline or nuanced arguments.
- It should **not** be used for automated moderation, censorship, or any high-stakes decision-making without strong human oversight.
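
One way to keep a human in the loop is to flag low-confidence or ambiguous predictions for review instead of acting on them. The sketch below assumes the per-label scores arrive as a list of `{"label": ..., "score": ...}` dicts; the function name `triage` and both thresholds are hypothetical and would need tuning on validation data. Given the model's modest accuracy, even "confident" predictions deserve scrutiny.

```python
def triage(scores, min_confidence=0.5, min_margin=0.1):
    """Pick the top label and flag the prediction for human review.

    A prediction is flagged when the top score is below `min_confidence`
    or when it leads the runner-up by less than `min_margin`.
    Both thresholds are illustrative, not tuned values.
    """
    ranked = sorted(scores, key=lambda s: s["score"], reverse=True)
    top = ranked[0]
    runner_up = ranked[1]["score"] if len(ranked) > 1 else 0.0
    confident = (
        top["score"] >= min_confidence
        and top["score"] - runner_up >= min_margin
    )
    return {
        "label": top["label"],
        "score": top["score"],
        "needs_review": not confident,
    }


# Invented scores: three near-tied labels, so the result is flagged.
ambiguous = [
    {"label": "CHERRY_PICKING", "score": 0.34},
    {"label": "HASTY_GENERALISATION", "score": 0.31},
    {"label": "NO_FALLACY", "score": 0.35},
]
# Invented scores: one clearly dominant label.
clear = [
    {"label": "FALSE_CAUSE", "score": 0.8},
    {"label": "POST_HOC", "score": 0.1},
    {"label": "NO_FALLACY", "score": 0.1},
]
print(triage(ambiguous))
print(triage(clear))
```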
## Integration with an Explanatory SLM
In the associated project, this classifier is combined with a small language model (e.g., `google/flan-t5-small`) that generates natural-language explanations of the predicted fallacy label:
- What the fallacy means in simple terms
- Why the input text might be an example

This setup is used in a Streamlit app:
1. Users enter a climate-related argument
2. The model predicts a fallacy label
3. FLAN-T5 generates a short explanation
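
The explanation step could be wired up along these lines. This is a sketch under assumptions, not the project's actual code: the prompt wording, the function names, and the `max_new_tokens` value are all hypothetical. Only the model name `google/flan-t5-small` comes from the description above.

```python
def build_explanation_prompt(text, label):
    """Build an instruction prompt for the explainer (hypothetical wording)."""
    readable = label.replace("_", " ").lower()
    return (
        f"Explain the logical fallacy '{readable}' in simple terms, "
        f"then explain why this text might be an example of it.\n"
        f"Text: {text}"
    )


def explain(text, label, model_name="google/flan-t5-small", max_new_tokens=96):
    """Generate a short explanation with a seq2seq model.

    Imports are lazy so the prompt builder can be used without
    transformers installed. Calling this downloads the model.
    """
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    inputs = tokenizer(
        build_explanation_prompt(text, label),
        return_tensors="pt",
        truncation=True,
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


prompt = build_explanation_prompt(
    "It snowed in April, so the planet cannot be warming.",
    "HASTY_GENERALISATION",
)
print(prompt)
```

In an app, the tokenizer and model would normally be loaded once and reused across requests rather than reloaded inside `explain` on every call.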
## Citation
If you use this model in academic work, you can cite it as:

> Kyeremeh, F. (2025). *Climate Logical Fallacy Classifier (DistilRoBERTa)*. Hugging Face. Model: `SteadyHands/climate-fallacy-roberta`.

Please also consider citing the original dataset author(s):

> Tariq60. *fallacy-detection* GitHub repository.
> https://github.com/Tariq60/fallacy-detection

## Acknowledgements
- **Base model**: `distilroberta-base` by Hugging Face
- **Dataset**: Climate subset from Tariq60’s fallacy-detection repository
- **Libraries**: Transformers, Datasets, scikit-learn
- **Project context**: Master’s-level NLP / Data Science coursework on Small Language Models and explainable NLP.
## How to Use
### Python Example (Logits → Label)
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "SteadyHands/climate-fallacy-roberta"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

text = "Climate has always changed in the past, so current warming can't be caused by humans."

# Tokenize a single input, truncating to at most 256 tokens.
inputs = tokenizer(
    text,
    return_tensors="pt",
    truncation=True,
    padding="max_length",
    max_length=256,
)

# Run inference without tracking gradients.
with torch.no_grad():
    outputs = model(**inputs)

logits = outputs.logits
probs = torch.softmax(logits, dim=-1)[0].tolist()
pred_id = int(torch.argmax(logits, dim=-1).item())

# config.id2label is loaded as a dict keyed by integer class id.
pred_label = model.config.id2label[pred_id]

print("Text:", text)
print("Predicted label:", pred_label)
print("Probabilities:", probs)
```

### Using the Transformers Pipeline
```python
from transformers import pipeline

clf = pipeline(
    "text-classification",
    model="SteadyHands/climate-fallacy-roberta",
    top_k=None,  # return scores for all labels; set top_k=3 for the top-3 fallacies
)

text = "Temperatures dropped this winter, so global warming must be a hoax."
outputs = clf(text)
print(outputs)
```