---
license: mit
language: en
tags:
- text-classification
- fake-news-detection
- transformer-ensemble
- bert
- deberta
library_name: transformers
datasets:
- custom
model_name: BertAndDeberta
---
|
# BertAndDeberta + ViT — Transformer Ensemble for Fake News Detection

This repository hosts a multimodal fake news detection system that combines **BERT**, **DeBERTa**, and **ViT** models. The *BERT* and *DeBERTa* models classify news text as either real or fake, while the *ViT* model is trained separately to distinguish AI-generated images from real ones, helping assess the authenticity of visual content.
|
## Text Models — Fake News Detection

- **Architecture**: Ensemble of BERT-base and DeBERTa-base
- **Task**: Binary text classification (`REAL` vs `FAKE`)
- **Training Framework**: PyTorch using 🤗 Transformers
- **License**: MIT
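
How the two text models' outputs are combined is not fully specified here; a common choice is to average each model's class probabilities. Below is a minimal sketch of that scheme (the averaging rule and equal weighting are assumptions, not the card's documented method):

```python
import torch

def ensemble_predict(bert_logits: torch.Tensor, deberta_logits: torch.Tensor) -> torch.Tensor:
    """Average the two models' class probabilities, then take the argmax.

    Both inputs are (batch, 2) logit tensors; label 0 = FAKE, 1 = REAL.
    """
    probs = (bert_logits.softmax(dim=-1) + deberta_logits.softmax(dim=-1)) / 2
    return probs.argmax(dim=-1)

# Toy logits standing in for the real model outputs
bert_logits = torch.tensor([[0.2, 1.5], [2.0, -1.0]])
deberta_logits = torch.tensor([[0.1, 0.9], [1.2, 0.3]])
print(ensemble_predict(bert_logits, deberta_logits))  # tensor([1, 0])
```

Averaging probabilities rather than raw logits keeps one model's larger logit scale from dominating the vote.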
|
## Vision Model — AI-Generated Image Detection

- **Architecture**: Vision Transformer (ViT-base, `vit-base-patch16-224`)
- **Task**: Binary image classification (`REAL` vs `AI-GENERATED`)
- **Training Framework**: TensorFlow/Keras or PyTorch using 🤗 Transformers
- **License**: MIT
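
For intuition, ViT-base with 16×16 patches turns a 224×224 image into 14 × 14 = 196 patch tokens before the transformer sees it. A rough sketch of that patching step in plain PyTorch (the 🤗 image processor and model handle this internally; this is illustrative only):

```python
import torch

# Dummy batch of one 224x224 RGB image: (batch, channels, height, width)
images = torch.randn(1, 3, 224, 224)

# Cut into non-overlapping 16x16 patches, as vit-base-patch16-224 does
patches = images.unfold(2, 16, 16).unfold(3, 16, 16)   # (1, 3, 14, 14, 16, 16)
patches = patches.contiguous().view(1, 3, -1, 16, 16)  # (1, 3, 196, 16, 16)
print(patches.shape[2])  # 196 patch tokens per image
```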
|
## Dataset

The dataset is a custom collection combining:

- News content (title + body)
- Labels: `0 = FAKE`, `1 = REAL`
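
One way the title and body can be combined into a single classifier input is plain string concatenation with a separator. The snippet below is a sketch of such preprocessing; the ` [SEP] ` separator is an assumption, not the card's documented format:

```python
LABELS = {0: "FAKE", 1: "REAL"}

def build_example(title: str, body: str, label: int) -> dict:
    """Join title and body into one input string, keeping the integer label."""
    return {"text": f"{title} [SEP] {body}", "label": label}

ex = build_example("Aliens land in Paris", "Witnesses report strange lights...", 0)
print(ex["text"])           # Aliens land in Paris [SEP] Witnesses report strange lights...
print(LABELS[ex["label"]])  # FAKE
```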
|
## ❗Disclaimer

- This project is for **educational and experimental purposes only**.
- It is **not suitable for real-world fact-checking** or other serious decision-making.
- The models are simple binary classifiers and do not verify factual correctness.
- The models may misclassify text with unclear or misleading context, as well as images that are abstract, artistic, or otherwise hard to distinguish from real content.
|
## Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("fauxNeuz/BertAndDeberta")
model = AutoModelForSequenceClassification.from_pretrained("fauxNeuz/BertAndDeberta")
model.eval()

text = "Government confirms policy updates in healthcare sector."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
pred = outputs.logits.argmax(dim=-1).item()
print("REAL" if pred == 1 else "FAKE")
```