datasets:
  - dpmendez/environmental-misinformation
language:
  - en
base_model:
  - distilbert/distilbert-base-uncased

This model is a DistilBERT-based transformer fine-tuned for climate misinformation classification. It predicts the veracity of individual climate-related claims using contextualized language representations.

The model was trained on a dataset combining:

CLIMATE-FEVER
Science Feedback fact-checked claims

Model Details

Model type: DistilBERT (distilbert-base-uncased)
Task: Sequence classification
Input: Single climate-related claim (text)
Output: Claim label probabilities
Framework: Hugging Face Transformers
Model weights: Stored in model.safetensors
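
As a rough illustration of the details above, the following sketch loads the checkpoint with Hugging Face Transformers and returns label probabilities for one claim. The `model_dir` argument is a placeholder for wherever the model files live (a local directory or a Hub repository id), and the function names are hypothetical. It also assumes the classification head emits one logit per label, which is the usual setup for sequence classification but is not stated in this card.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_claim(claim, model_dir):
    """Return {label: probability} for a single claim.

    `model_dir` is a placeholder: pass the local directory or Hub
    repository id that contains config.json and model.safetensors.
    """
    # Imported here so softmax() above stays usable without torch installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForSequenceClassification.from_pretrained(model_dir)
    model.eval()

    # Tokenize one claim; truncation guards against over-long inputs.
    inputs = tokenizer(claim, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    # Assumption: one logit per label, so softmax yields label probabilities.
    probabilities = softmax(logits)
    return {model.config.id2label[i]: p for i, p in enumerate(probabilities)}
```

Calling `classify_claim("Some climate-related claim.", model_dir)` should yield a dictionary keyed by the label names listed in this card.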

Labels

Label	Description
`LIKELY_TRUE`	Claim is consistent with scientific consensus
`LIKELY_FALSE`	Claim contradicts scientific consensus

Label mappings are defined in config.json and label_map.json.
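
For orientation, Transformers configs store the mapping in the `id2label` and `label2id` fields of config.json. The sketch below parses a hypothetical excerpt and looks up the most probable label. The index order shown (0 for `LIKELY_FALSE`, 1 for `LIKELY_TRUE`) is an assumption made for illustration; check the released config.json for the real order. The layout of label_map.json is not described here, so it is not shown.

```python
import json

# Hypothetical excerpt of config.json; the index order is an assumption.
CONFIG_EXCERPT = """
{
  "id2label": {"0": "LIKELY_FALSE", "1": "LIKELY_TRUE"},
  "label2id": {"LIKELY_FALSE": 0, "LIKELY_TRUE": 1}
}
"""


def read_id2label(config_text):
    """Parse id2label from config JSON text, converting keys to int."""
    config = json.loads(config_text)
    # JSON object keys are always strings, so convert them to integers.
    return {int(index): label for index, label in config["id2label"].items()}


def top_label(probabilities, id2label):
    """Return the label name at the index with the highest probability."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return id2label[best]


id2label = read_id2label(CONFIG_EXCERPT)
predicted = top_label([0.2, 0.8], id2label)
```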

Training Procedure

Fine-tuned from distilbert-base-uncased
Cross-entropy loss
Class imbalance handled via training strategy (no oversampling)
Inference threshold tuned post-training to reduce a cost function that penalizes false positives (fewer false positives is better)
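
One way post-training threshold tuning can work is a simple grid search over candidate cut-offs on held-out predictions. The sketch below is not the procedure used for this model: the cost function, the weights `fp_cost` and `fn_cost`, the grid, and the function name are all assumptions, and the card does not say which label is treated as the positive class.

```python
def tune_threshold(probs, labels, fp_cost=2.0, fn_cost=1.0):
    """Pick the grid threshold minimizing fp_cost * FP + fn_cost * FN.

    probs  -- predicted probability of the positive class, one per example
    labels -- true labels, 1 for the positive class and 0 otherwise
    The cost weights are hypothetical; fp_cost > fn_cost expresses that
    false positives are considered worse than false negatives.
    """
    best_threshold, best_cost = 0.5, float("inf")
    for step in range(1, 100):
        threshold = step / 100
        # False positive: predicted positive, actually negative.
        fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
        # False negative: predicted negative, actually positive.
        fn = sum(1 for p, y in zip(probs, labels) if p < threshold and y == 1)
        cost = fp_cost * fp + fn_cost * fn
        if cost < best_cost:  # strict comparison keeps the lowest such threshold
            best_threshold, best_cost = threshold, cost
    return best_threshold, best_cost


# Toy validation data, invented for illustration only.
toy_probs = [0.3, 0.7, 0.6, 0.9]
toy_labels = [0, 0, 1, 1]
weighted = tune_threshold(toy_probs, toy_labels)
unweighted = tune_threshold(toy_probs, toy_labels, fp_cost=1.0)
```

On this toy data the heavier false-positive penalty pushes the chosen threshold higher than the unweighted search, trading a missed positive for one fewer false alarm.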

The selected inference threshold is stored in threshold.json.
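
A minimal sketch of applying the stored threshold at inference time follows. The layout of threshold.json is not documented in this card, so the `"threshold"` key is hypothetical, as are the function names. Treating `LIKELY_FALSE` as the class the threshold applies to is also an assumption.

```python
import json


def parse_threshold(raw_json, key="threshold"):
    """Read the cut-off from the JSON text of threshold.json.

    Assumption: the file holds either a bare number or an object that
    stores the number under `key`; the key name is hypothetical.
    """
    data = json.loads(raw_json)
    return float(data[key]) if isinstance(data, dict) else float(data)


def decide(probabilities, threshold, positive_label, negative_label):
    """Return positive_label only when its probability reaches the threshold."""
    if probabilities[positive_label] >= threshold:
        return positive_label
    return negative_label


# Illustrative values only, not taken from the released files.
threshold = parse_threshold('{"threshold": 0.7}')
probabilities = {"LIKELY_FALSE": 0.65, "LIKELY_TRUE": 0.35}
label = decide(probabilities, threshold, "LIKELY_FALSE", "LIKELY_TRUE")
```

In practice the JSON text would come from reading threshold.json from the model directory, and the probabilities from the model output.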