---
title: FakeNewsDetector
emoji: 🐠
colorFrom: pink
colorTo: red
sdk: gradio
sdk_version: 5.35.0
app_file: app.py
pinned: false
license: mit
short_description: BERT1+BERT2
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference



# Fake News Classifier (BERT-based)

This project detects whether a news article is real or fake using a fine-tuned BERT model for binary text classification.

---

## Disclaimer

- This project is for **educational and experimental purposes only**.
- It is **not suitable for real-world fact-checking** or serious decision-making.
- The model uses a simple binary classifier and does not verify factual correctness.

---

## Project Overview

This fake news classifier was built as part of a research internship to:

- Learn how to fine-tune transformer models on classification tasks
- Practice handling class imbalance using weighted loss
- Deploy models using Hugging Face-compatible APIs

---

## How It Works

- A BERT-based model (`bert-base-uncased`) was fine-tuned on a labeled dataset of news articles.
- Input text is tokenized using `BertTokenizer`.
- A custom Trainer with class-weighted loss was used to handle class imbalance.
- Outputs are binary: **0 = FAKE**, **1 = REAL**.
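The inference pipeline above can be sketched as follows. `MODEL_DIR` and the `classify` helper are illustrative names (not from this repo); point `MODEL_DIR` at your fine-tuned checkpoint instead of the base model:

```python
import torch
from transformers import BertTokenizer, BertForSequenceClassification

LABELS = {0: "FAKE", 1: "REAL"}
MODEL_DIR = "bert-base-uncased"  # replace with the fine-tuned checkpoint path

def classify(text: str, model_dir: str = MODEL_DIR) -> str:
    """Tokenize one article and return the predicted label."""
    tokenizer = BertTokenizer.from_pretrained(model_dir)
    model = BertForSequenceClassification.from_pretrained(model_dir, num_labels=2)
    model.eval()
    # BERT accepts at most 512 tokens, so longer articles are truncated.
    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return LABELS[int(logits.argmax(dim=-1))]
```

Note that only the head of a long article survives truncation, which is a known limitation of single-pass BERT classification.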

### Training Details

- Model: `BertForSequenceClassification`
- Epochs: 4
- Batch size: 8
- Learning rate: 2e-5
- Optimizer: AdamW (via Hugging Face Trainer)
- Evaluation Metrics: Accuracy, F1-score, Precision, Recall
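A minimal sketch of the class-weighted Trainer and the metrics function described above. The class and function names here are illustrative, and the inverse-frequency weighting scheme is one common choice, not necessarily the exact one used in training:

```python
import torch
from torch import nn
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from transformers import Trainer

def compute_class_weights(labels):
    """Inverse-frequency weights: the rarer class gets a larger weight."""
    counts = torch.bincount(torch.as_tensor(labels))
    return counts.sum() / (len(counts) * counts.float())

class WeightedLossTrainer(Trainer):
    """Trainer whose loss is cross-entropy weighted per class."""

    def __init__(self, class_weights=None, **kwargs):
        super().__init__(**kwargs)
        self.class_weights = class_weights

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        loss_fct = nn.CrossEntropyLoss(
            weight=self.class_weights.to(outputs.logits.device)
        )
        loss = loss_fct(outputs.logits.view(-1, 2), labels.view(-1))
        return (loss, outputs) if return_outputs else loss

def compute_metrics(eval_pred):
    """Accuracy, precision, recall, and F1 for the Trainer's eval loop."""
    logits, labels = eval_pred
    preds = logits.argmax(axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="binary", zero_division=0
    )
    return {"accuracy": accuracy_score(labels, preds),
            "precision": precision, "recall": recall, "f1": f1}
```

`compute_metrics` is passed to the Trainer via its `compute_metrics` argument, and `class_weights` is computed once from the training labels before instantiating `WeightedLossTrainer`.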

---

## 🛠 Libraries Used

- `transformers`
- `datasets`
- `torch`
- `scikit-learn`
- `pandas`
- `nltk` (optional preprocessing)
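If the repo's `requirements.txt` is missing, a minimal one covering the libraries above might look like this (pins are illustrative, not the versions actually used):

```
transformers
datasets
torch
scikit-learn
pandas
nltk
gradio==5.35.0
```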

---

## 📦 Installation & Running

```bash
pip install -r requirements.txt
python app.py
```

Alternatively, run the training script in a notebook environment such as Google Colab or Jupyter.

---