---
license: mit
language:
- en
tags:
- email
- spam
- spamdetection
---
# 📩 Spam Detection Neural Network (PyTorch)
[![Python](https://img.shields.io/badge/python-3.10-blue.svg)](https://www.python.org/)
[![PyTorch](https://img.shields.io/badge/pytorch-2.1-red.svg)](https://pytorch.org/)
[![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)
A **simple, real-world spam detection neural network** built from scratch in **PyTorch**.
This model classifies SMS / short text messages as **Spam** or **Ham (Not Spam)**.

The project is **small, easy to understand, and perfect for learning**.
You can fork it, fine-tune it, and use it as a **starting point for your own projects**.

---
## 🧠 Model Overview
- **Framework:** PyTorch
- **Architecture:** Fully Connected Neural Network (MLP)
- **Input:** Bag-of-Words text vectors
- **Output:** Binary classification (Spam / Ham)
- **Training:** From scratch, small dataset (~5,500 messages)
> ⚠️ Note: The dataset is intentionally small to keep things simple.
> You are encouraged to **fork the repo, add more data, and fine-tune the model**.
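
To make the **Bag-of-Words** input concrete, here is a minimal standard-library sketch of how a message becomes a count vector. It only approximates the defaults of scikit-learn's `CountVectorizer` (lowercasing, tokens of two or more word characters, vocabulary in sorted order). The helper names and the toy corpus are hypothetical and are not taken from this repository.

```python
import re
from collections import Counter

# Assumption: approximates CountVectorizer defaults (lowercase, tokens of 2+ word chars).
TOKEN_PATTERN = re.compile(r"\b\w\w+\b")


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return TOKEN_PATTERN.findall(text.lower())


def build_vocabulary(messages):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for msg in messages for tok in tokenize(msg)})
    return {tok: idx for idx, tok in enumerate(tokens)}


def to_count_vector(text, vocabulary):
    """Count how often each vocabulary word occurs; unseen words are dropped."""
    counts = Counter(tokenize(text))
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector


# Hypothetical toy corpus, for illustration only.
messages = ["Win cash now", "Call me now now"]
vocab = build_vocabulary(messages)
print(vocab)  # {'call': 0, 'cash': 1, 'me': 2, 'now': 3, 'win': 4}
print(to_count_vector("win now, call now", vocab))  # [1, 0, 0, 2, 1]
```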
---
## 📂 Repository Structure
```
.
├── spam_nn.pth      # Trained PyTorch model weights
├── vectorizer.pkl   # CountVectorizer for text preprocessing
├── model.py         # Neural network architecture
├── config.json      # Model configuration
├── inference.py     # Inference / prediction script
├── README.md        # Documentation
```
---
## 🚀 Usage
### Load Model
```python
import pickle

import torch

from model import SpamNN

# Load model architecture + weights
model = SpamNN()
model.load_state_dict(torch.load("spam_nn.pth", map_location="cpu"))
model.eval()  # switch to inference mode

# Load the fitted vectorizer (only unpickle files you trust)
with open("vectorizer.pkl", "rb") as f:
    vectorizer = pickle.load(f)
```
### Predict Messages
```python
def predict(text):
    # Convert the message into a Bag-of-Words count vector
    vec = vectorizer.transform([text]).toarray()
    vec = torch.tensor(vec, dtype=torch.float32)
    with torch.no_grad():
        output = model(vec)
    # Scores above the 0.35 threshold are labeled Spam
    return "Spam" if output.item() > 0.35 else "Ham"


# Example
print(predict("Congratulations! You won $1000. Click now!"))
```
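
The `0.35` cutoff in `predict` is lower than the usual `0.5`. Assuming the model output is a spam probability between 0 and 1, a lower cutoff labels borderline messages as Spam more readily. A minimal sketch of that decision rule follows. `label_from_probability` is a hypothetical helper, and the scores are made up rather than produced by the real model.

```python
def label_from_probability(prob, threshold=0.35):
    """Return "Spam" when the score exceeds the threshold, otherwise "Ham"."""
    return "Spam" if prob > threshold else "Ham"


# Made-up scores for illustration only; they are not outputs of the real model.
for score in (0.10, 0.40, 0.90):
    at_035 = label_from_probability(score)
    at_050 = label_from_probability(score, threshold=0.5)
    print(f"score={score:.2f}  threshold 0.35 -> {at_035}  threshold 0.50 -> {at_050}")
```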
---
## 🔧 Training & Fine-Tuning
The model can be **improved and fine-tuned** by:
* Adding more data (larger SMS datasets)
* Adding bigrams to the features (`ngram_range=(1, 2)` in `CountVectorizer`)
* Adjusting the positive-class weight (`pos_weight`) in `BCEWithLogitsLoss`
* Training with more epochs
* Using embeddings or LSTM for contextual understanding
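
As a rough sketch of the class-weighting idea from the list above: `torch.nn.BCEWithLogitsLoss` accepts a `pos_weight` tensor that scales the loss on positive (spam) examples. The class counts and the toy batch below are hypothetical, and the sketch assumes the model's forward pass returns raw logits rather than probabilities.

```python
import torch
import torch.nn as nn

# Hypothetical class counts; replace them with the counts from your own training set.
n_ham, n_spam = 4800, 700

# pos_weight > 1 makes mistakes on the positive class (spam) cost more.
pos_weight = torch.tensor([n_ham / n_spam])
criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

# Toy batch: raw logits (no sigmoid applied) and float targets, where 1.0 = spam.
logits = torch.tensor([[2.0], [-1.0], [0.5]])
targets = torch.tensor([[1.0], [0.0], [1.0]])

weighted_loss = criterion(logits, targets)
unweighted_loss = nn.BCEWithLogitsLoss()(logits, targets)
print(f"pos_weight={pos_weight.item():.2f}")
print(f"weighted={weighted_loss.item():.4f}  unweighted={unweighted_loss.item():.4f}")
```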
💡 **Fork this repo and experiment freely**. Make it your own!

---
## 🌟 Support the Project
If this project is helpful:
⭐ **Give this repository a star**

🍴 **Fork it and improve it**

📢 **Share it with others learning PyTorch**
> Following and starring helps me keep releasing open-source projects!
---
## 📌 Source Code & Updates
For the **full source code, training scripts, and future updates**,
please visit the **GitHub repository** linked to this project.

---
## 📜 License
This project is **open-source** under the **MIT License** and is intended for **educational purposes**.

---
## 🤗 Hugging Face Friendly
You can also **upload this model to the Hugging Face Model Hub**.
Include `spam_nn.pth`, `vectorizer.pkl`, `config.json`, and `inference.py` to make it **ready for inference online**.