---
language: en
library_name: transformers
pipeline_tag: text-classification
tags:
- text-classification
- sequence-classification
- roberta
- distilroberta
- climate-change
- logical-fallacy-detection
- nlp
license: apache-2.0
model-index:
- name: climate-fallacy-roberta
  results:
  - task:
      type: text-classification
      name: Climate logical fallacy classification
    dataset:
      name: Climate subset of Tariq60/fallacy-detection
      type: custom
      split: test
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.24
    - name: Macro F1
      type: f1
      value: 0.20
    - name: Weighted F1
      type: f1
      value: 0.24
---

# Climate Logical Fallacy Classifier (DistilRoBERTa)

This model is a **DistilRoBERTa**-based text classification model fine-tuned to detect **logical fallacies in climate-related text**. It predicts one of 11 labels (10 fallacy types plus `NO_FALLACY`) for a given sentence or short paragraph.

The model was trained as part of an academic NLP project on _"Automated Detection of Logical Fallacies in Climate Change Social Media Posts using Small Language Models (SLMs)"_.

## Model Details

- **Base model**: `distilroberta-base`
- **Architecture**: DistilRoBERTa (Transformer encoder, 6 layers)
- **Task**: Multi-class text classification
- **Number of classes**: 11
- **Language**: English
- **Framework**: Transformers

### Label Set

The model is trained to predict the following labels:

1. `CHERRY_PICKING`
2. `EVADING_THE_BURDEN_OF_PROOF`
3. `FALSE_ANALOGY`
4. `FALSE_AUTHORITY`
5. `FALSE_CAUSE`
6. `HASTY_GENERALISATION`
7. `NO_FALLACY`
8. `POST_HOC`
9. `RED_HERRINGS`
10. `STRAWMAN`
11. `VAGUENESS`

The `id2label` / `label2id` mappings are stored in the model config and are consistent with the training code.
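
As an illustration of what such a mapping looks like, the sketch below builds `id2label` and `label2id` from the label list above. It assumes zero-based ids that follow the listed order; that ordering is an assumption made for this example, and the authoritative mapping is the one in `model.config.id2label`.

```python
# Labels as listed in this model card. The zero-based ordering used here is an
# ASSUMPTION for illustration; read model.config.id2label for the real mapping.
LABELS = [
    "CHERRY_PICKING",
    "EVADING_THE_BURDEN_OF_PROOF",
    "FALSE_ANALOGY",
    "FALSE_AUTHORITY",
    "FALSE_CAUSE",
    "HASTY_GENERALISATION",
    "NO_FALLACY",
    "POST_HOC",
    "RED_HERRINGS",
    "STRAWMAN",
    "VAGUENESS",
]

# Integer id -> label name, and the inverse lookup.
id2label = dict(enumerate(LABELS))
label2id = {label: idx for idx, label in id2label.items()}

print(len(id2label))
print(label2id["NO_FALLACY"])
```

When fine-tuning a fresh checkpoint, dictionaries of this shape can be passed to `AutoModelForSequenceClassification.from_pretrained` through its `num_labels`, `id2label`, and `label2id` arguments so that they are saved in the config.
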
## 📚 Training Data

The model was fine-tuned on the **climate subset** of the open-source dataset from:

> Tariq60, *fallacy-detection* repository
> https://github.com/Tariq60/fallacy-detection

Only the **climate** portion of the dataset was used, with the standard split:

- `train/`: training examples
- `dev/`: validation examples
- `test/`: held-out evaluation set

Each example includes:

- The climate-related text segment
- A manually assigned fallacy label (or `No fallacy`)

### Preprocessing

- Texts were lower-cased and cleaned using a light `basic_clean` function:
  - Stripping extra whitespace
  - Normalising some punctuation
- Some classes were **minority labels** (few examples), so basic **class balancing** was applied via up-sampling in the training set.
- NaN or empty texts were dropped before training.

## Training Procedure

- **Base model**: `distilroberta-base`
- **Optimizer**: AdamW (via `Trainer`)
- **Learning rate**: 2e-5
- **Batch size**: 16
- **Max sequence length**: 128–256 tokens
- **Epochs**: 10
- **Weight decay**: 0.01
- **Loss function**: Cross-entropy, optionally with class weights to mitigate class imbalance
- **Validation split**: 80/20 stratified split of the training data

### Implementation

Training used the following classes from the Transformers library:

- `AutoTokenizer`
- `AutoModelForSequenceClassification`
- `TrainingArguments`
- `Trainer`

## Evaluation

Evaluation was done on the **held-out climate test set** from the dataset.

**Metrics (multi-class):**

- **Accuracy** ≈ 0.24
- **Macro F1** ≈ 0.20
- **Weighted F1** ≈ 0.24

These values are **baseline experimental results** on a relatively small and imbalanced dataset. They should be interpreted as *preliminary research numbers*, not as production-ready performance. Different random seeds, data balancing strategies, or more aggressive hyperparameter tuning can change these numbers.
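
To make the three reported metrics concrete, here is a minimal dependency-free sketch of how accuracy, macro F1, and weighted F1 are defined for multi-class labels. The function name `classification_metrics` and the toy labels are hypothetical and are not taken from the project's evaluation code; in practice `sklearn.metrics.accuracy_score` and `sklearn.metrics.f1_score` (with `average="macro"` or `average="weighted"`) compute the same quantities.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy, macro F1 and weighted F1 for multi-class labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true examples per class
    f1_per_class = {}
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1_per_class[label] = 2 * tp / denom if denom else 0.0

    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    # Macro F1: every class counts equally, so rare classes matter as much as common ones.
    macro_f1 = sum(f1_per_class.values()) / len(labels)
    # Weighted F1: each class is weighted by its share of the true labels.
    weighted_f1 = sum(f1_per_class[l] * support[l] for l in labels) / len(y_true)
    return {"accuracy": accuracy, "macro_f1": macro_f1, "weighted_f1": weighted_f1}


# Toy labels made up for illustration (not real model predictions).
y_true = ["FALSE_CAUSE", "FALSE_CAUSE", "STRAWMAN", "NO_FALLACY"]
y_pred = ["FALSE_CAUSE", "STRAWMAN", "STRAWMAN", "NO_FALLACY"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Because macro F1 averages over classes without regard to their size, it tends to sit below weighted F1 when minority classes are predicted poorly, which matches the pattern in the numbers above.
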
## Intended Use

### Primary Use

- Research and experimentation on:
  - Automated detection of logical fallacies in climate discourse
  - Comparing traditional baselines (TF-IDF + SVM) vs. Transformer-based models
- Building educational tools that flag potential fallacies in climate arguments

### Suitable Scenarios

- Analyzing **short climate-related social media posts**
- Demonstration / teaching examples on:
  - Argumentation quality
  - Climate misinformation
  - Explainable NLP (combined with a small language model explainer, e.g. FLAN-T5)

## Limitations & Ethical Considerations

### Limitations

- **Small dataset**: Training data is limited in size, especially for rarer fallacy types.
- **Class imbalance**: Some fallacies occur far less frequently, which affects per-class F1 scores.
- **Modest performance**: Overall accuracy and macro F1 are relatively low. The model should be treated as an exploratory research artifact, not a production system.
- **Domain specificity**: The model is trained only on **climate** discourse; performance on other topics (e.g. politics, health) is unknown and likely poor.

### Ethical Considerations

- Predictions are **probabilistic**, not definitive judgments of truth or deception.
- The model can be **wrong or over-confident**, especially on borderline or nuanced arguments.
- It should **not** be used for automated moderation, censorship, or any high-stakes decision-making without strong human oversight.

## Integration with Explanatory SLM

In the associated project, this classifier is combined with a small language model (e.g., `google/flan-t5-small`) to generate natural-language explanations of the predicted fallacy label:

- What the fallacy means in simple terms
- Why the input text might be an example

This setup is used in a Streamlit app:

1. Users enter a climate-related argument
2. The model predicts a fallacy label
3. FLAN-T5 generates a short explanation

## Citation

If you use this model in academic work, you can cite it as:

Kyeremeh, F.
(2025). Climate Logical Fallacy Classifier (DistilRoBERTa). Hugging Face. Model: SteadyHands/climate-fallacy-roberta.

Please also consider citing the original dataset author(s):

Tariq60. fallacy-detection GitHub repository. https://github.com/Tariq60/fallacy-detection

## Acknowledgements

- **Base model**: `distilroberta-base` by Hugging Face
- **Dataset**: Climate subset from Tariq60's fallacy-detection repository
- **Libraries**: Transformers, Datasets, scikit-learn
- **Project context**: Master's-level NLP / Data Science coursework on Small Language Models and explainable NLP

## How to Use

### Python Example (Logits → Label)

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "SteadyHands/climate-fallacy-roberta"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()  # inference mode (disables dropout)

text = "Climate has always changed in the past, so current warming can't be caused by humans."

# A single input needs no padding; truncate to the maximum training length.
inputs = tokenizer(
    text,
    return_tensors="pt",
    truncation=True,
    max_length=256,
)

with torch.no_grad():
    outputs = model(**inputs)

logits = outputs.logits
probs = torch.softmax(logits, dim=-1)[0].tolist()
pred_id = int(torch.argmax(logits, dim=-1).item())

# The loaded config stores id2label with integer keys.
pred_label = model.config.id2label[pred_id]

print("Text:", text)
print("Predicted label:", pred_label)
print("Probabilities:", probs)
```

### Using the Transformers Pipeline

```python
from transformers import pipeline

clf = pipeline(
    "text-classification",
    model="SteadyHands/climate-fallacy-roberta",
    top_k=None,  # return scores for all labels; set top_k=3 to see the top-3 fallacies
)

text = "Temperatures dropped this winter, so global warming must be a hoax."
outputs = clf(text)
print(outputs)
```
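
When the pipeline returns scores for every label, a small helper can rank them. The sketch below is illustrative only: `top_k_labels` is a hypothetical helper, the scores are made up, and it assumes the output is a list of `{"label", "score"}` dicts, possibly wrapped in an outer list, since the nesting may differ between Transformers versions.

```python
def top_k_labels(outputs, k=3):
    """Return the k highest-scoring (label, score) pairs.

    Accepts either a flat list of {"label", "score"} dicts or a list that
    wraps one such list (assumed shapes; check your pipeline's actual output).
    """
    scores = outputs[0] if outputs and isinstance(outputs[0], list) else outputs
    ranked = sorted(scores, key=lambda item: item["score"], reverse=True)
    return [(item["label"], round(item["score"], 3)) for item in ranked[:k]]


# Hypothetical scores, made up for illustration (not real model output).
example_outputs = [[
    {"label": "CHERRY_PICKING", "score": 0.41},
    {"label": "HASTY_GENERALISATION", "score": 0.27},
    {"label": "NO_FALLACY", "score": 0.12},
    {"label": "FALSE_CAUSE", "score": 0.20},
]]

print(top_k_labels(example_outputs, k=2))
```

Given the modest accuracy reported for this model, showing the top few candidates with their scores is usually more informative than presenting a single label.
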