metadata
datasets:
  - dpmendez/environmental-misinformation
language:
  - en
base_model:
  - distilbert/distilbert-base-uncased

This model is a DistilBERT-based transformer fine-tuned for climate misinformation classification. It predicts the veracity of individual climate-related claims using contextualized language representations.

The model was trained on a dataset combining:

  • Climate Fever
  • Science Feedback fact-checked claims

Model Details

  • Model type: DistilBERT (distilbert-base-uncased)
  • Task: Sequence classification
  • Input: Single climate-related claim (text)
  • Output: Claim label probabilities
  • Framework: Hugging Face Transformers
  • Model weights: Stored in model.safetensors
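A minimal inference sketch based on the details above, assuming the model is loaded through the standard Transformers auto classes. The `predict_proba` helper and its `model_path` argument are illustrative names, not part of this repository; `model_path` stands for a local checkout of the model files or its Hub repository id.

```python
def predict_proba(claim, model_path):
    """Return a {label: probability} dict for a single climate-related claim.

    `model_path` is a placeholder (assumption): a local directory holding
    config.json and model.safetensors, or the model's Hub repository id.
    """
    # Imported lazily so that defining the helper does not load the heavy dependencies.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForSequenceClassification.from_pretrained(model_path)
    model.eval()

    # Tokenize one claim; truncation guards against inputs longer than the model limit.
    inputs = tokenizer(claim, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, num_labels)

    # Softmax turns the logits into the label probabilities listed under "Output".
    probs = torch.softmax(logits, dim=-1)[0].tolist()
    return {model.config.id2label[i]: p for i, p in enumerate(probs)}
```

Calling `predict_proba("Some climate-related claim.", model_path)` should then return one probability per label, keyed by the names defined in config.json.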

Labels

| Label | Description |
| --- | --- |
| LIKELY_TRUE | Claim is consistent with scientific consensus |
| LIKELY_FALSE | Claim contradicts scientific consensus |

Label mappings are defined in config.json and label_map.json.
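As a sketch of how such a mapping is typically read, the snippet below parses an `id2label` block in the usual Transformers config.json layout. The excerpt is hypothetical: the index assigned to each label here is an assumption, so check the actual config.json before relying on it. The layout of label_map.json is not described in this card and is not shown.

```python
import json

# Hypothetical excerpt of config.json; the real index order may differ.
CONFIG_EXCERPT = """
{
  "id2label": {"0": "LIKELY_FALSE", "1": "LIKELY_TRUE"},
  "label2id": {"LIKELY_FALSE": 0, "LIKELY_TRUE": 1}
}
"""


def load_id2label(config_text):
    """Parse the id2label mapping from the text of a config.json file."""
    config = json.loads(config_text)
    # JSON object keys are always strings, so convert them back to integer class indices.
    return {int(idx): label for idx, label in config["id2label"].items()}


id2label = load_id2label(CONFIG_EXCERPT)
```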

Training Procedure

  • Fine-tuned from distilbert-base-uncased
  • Cross-entropy loss
  • Class imbalance handled via training strategy (no oversampling)
  • Inference threshold tuned post-training to minimize a cost function (fewer false positives is better)
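The threshold tuning step can be illustrated with a simple sweep over candidate thresholds on held-out predictions. This is only a sketch: the cost function actually used is not specified here, so the weighted count of false positives and false negatives, the `fp_cost` and `fn_cost` values, and the `tune_threshold` name are all assumptions.

```python
def tune_threshold(probs, labels, fp_cost=2.0, fn_cost=1.0, steps=101):
    """Pick the threshold in [0, 1] that minimizes a weighted error cost.

    probs  : predicted probability of the positive class, one per example
    labels : 1 if the example is truly positive, 0 otherwise
    fp_cost, fn_cost : illustrative weights (assumption); fp_cost > fn_cost
                       makes false positives more expensive than false negatives
    """
    best_threshold, best_cost = 0.5, float("inf")
    for i in range(steps):
        t = i / (steps - 1)
        # An example is predicted positive when its probability reaches the threshold.
        fp = sum(1 for p, y in zip(probs, labels) if p >= t and y == 0)
        fn = sum(1 for p, y in zip(probs, labels) if p < t and y == 1)
        cost = fp_cost * fp + fn_cost * fn
        if cost < best_cost:  # strict comparison keeps the lowest tied threshold
            best_threshold, best_cost = t, cost
    return best_threshold, best_cost
```

Raising `fp_cost` relative to `fn_cost` generally pushes the selected threshold upward, trading some missed positives for fewer false alarms.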

The selected inference threshold is stored in threshold.json.
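A sketch of how a stored threshold might be applied at inference time. Several details are assumptions, not facts about this repository: the `{"threshold": ...}` layout and the value 0.7 are hypothetical, index 0 is taken to be LIKELY_FALSE, and the threshold is assumed to apply to the LIKELY_FALSE probability.

```python
import json
import math

# Hypothetical contents of threshold.json; the real key name and value may differ.
THRESHOLD_JSON = '{"threshold": 0.7}'


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decide(logits, threshold):
    """Return LIKELY_FALSE only when its probability reaches the threshold.

    Assumes logits[0] belongs to LIKELY_FALSE and logits[1] to LIKELY_TRUE.
    """
    p_false = softmax(logits)[0]
    return "LIKELY_FALSE" if p_false >= threshold else "LIKELY_TRUE"


threshold = json.loads(THRESHOLD_JSON)["threshold"]
```

With a threshold above 0.5, a claim is flagged as LIKELY_FALSE only when the model is fairly confident, which fits the stated preference for fewer false positives.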