---
license: apache-2.0
pipeline_tag: text-classification
language:
- en
base_model:
- google-bert/bert-base-uncased
tags:
- PyTorch
---

# 🤗 BERT for Fake News Detection (Fakeddit + BLIP Captions)

This model is [`bert-base-uncased`](https://huggingface.co/bert-base-uncased) fine-tuned on the **Fakeddit** dataset.
It combines post text with **image captions generated** by [`Salesforce/blip-image-captioning-base`](https://huggingface.co/Salesforce/blip-image-captioning-base), rather than using raw image features.
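
As a sketch, the captioning step can be reproduced with the `transformers` BLIP API. The dummy white image below is a placeholder so the snippet is self-contained; in practice you would pass the post's actual image:

```python
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

# Load the same captioning model used to build this model's training inputs.
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

# Placeholder image; replace with the post's image, e.g. Image.open("post.jpg").convert("RGB")
image = Image.new("RGB", (224, 224), color="white")

inputs = processor(images=image, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=30)
caption = processor.decode(out[0], skip_special_tokens=True)
print(caption)
```

The resulting caption string stands in for the raw image, so no vision backbone is needed once captions have been generated.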
|
|
|
|
|
## 🧠 Model Summary

- **Architecture**: BERT (uncased)
- **Inputs**: `[CLS] post text, BLIP image caption [SEP]`
- **Task**: Multi-class classification (6 labels)
- **Dataset**: Fakeddit (Nakamura et al., 2020)
- **Captioning Model**: `Salesforce/blip-image-captioning-base`
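
A minimal sketch of building the input sequence, assuming the post text and caption are comma-joined into a single segment as the format above suggests (the example strings are hypothetical, not from the dataset):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

post_text = "Scientists discover a new species of deep-sea fish"  # hypothetical post title
caption = "a fish swimming in dark water"                         # hypothetical BLIP caption

# Comma-join the two fields into one segment: [CLS] post text, caption [SEP]
combined = f"{post_text}, {caption}"
encoded = tokenizer(combined, truncation=True, max_length=128, return_tensors="pt")
```

The encoded sequence is then fed to the fine-tuned BERT classifier, which predicts one of the 6 Fakeddit labels.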
|
|
|
|
|
--- |
|
|
|
|
|
## 📊 Results

| Approach       | Accuracy | Macro F1-Score |
|----------------|----------|----------------|
| Text + Caption | **0.87** | **0.83**       |

➡️ Using captions instead of raw image features leads to state-of-the-art performance on Fakeddit, with simpler input and no vision backbone needed during inference.
|
|
|
|
|
--- |
|
|
|
|
|
## 📚 References

This model builds on the following works:

- **Fakeddit dataset**: [Nakamura et al. (2020)](https://arxiv.org/abs/1911.03854) – A multimodal fake news dataset
- **BLIP captioning model**: [Li et al. (2022)](https://arxiv.org/abs/2201.12086) – Vision-language pretraining with BLIP
- **BERT base model**: [Devlin et al. (2019)](https://arxiv.org/abs/1810.04805) – Pretrained transformer for text understanding