---
license: mit
language:
- en
tags:
- email
- spam
- spamdetection
---
# 📩 Spam Detection Neural Network (PyTorch)
A **simple, real-world spam detection neural network** built from scratch in **PyTorch**.
This model classifies SMS / short text messages as **Spam** or **Ham (Not Spam)**.
The project is **small, easy to understand, and perfect for learning**.
You can fork it, fine-tune it, and use it as a **starting point for your own projects**.
---
## 🧠 Model Overview
- **Framework:** PyTorch
- **Architecture:** Fully Connected Neural Network (MLP)
- **Input:** Bag-of-Words text vectors
- **Output:** Binary classification (Spam / Ham)
- **Training:** From scratch, small dataset (~5,500 messages)
> ⚠️ Note: The dataset is intentionally small to keep things simple.
> You are encouraged to **fork the repo, add more data, and fine-tune the model**.
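
To make the overview concrete, here is a minimal sketch of the kind of fully connected network described above. It is not the contents of `model.py`: the class name `TinySpamMLP`, the hidden size of 64, and the vocabulary size of 5000 are assumptions chosen purely for illustration.

```python
import torch
from torch import nn


class TinySpamMLP(nn.Module):
    """Hypothetical bag-of-words classifier; not the repository's SpamNN."""

    def __init__(self, vocab_size: int = 5000, hidden_size: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(vocab_size, hidden_size),  # word counts -> hidden features
            nn.ReLU(),
            nn.Linear(hidden_size, 1),           # single logit: spam vs. ham
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Returns raw logits of shape (batch,); apply torch.sigmoid for probabilities.
        return self.net(x).squeeze(-1)


sketch_model = TinySpamMLP()
batch = torch.zeros(2, 5000)        # two empty bag-of-words vectors
logits = sketch_model(batch)
probs = torch.sigmoid(logits)       # values strictly between 0 and 1
print(probs.shape)
```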
---
## Repository Structure
```
.
├── spam_nn.pth       # Trained PyTorch model weights
├── vectorizer.pkl    # CountVectorizer for text preprocessing
├── model.py          # Neural network architecture
├── config.json       # Model configuration
├── inference.py      # Inference / prediction script
└── README.md         # Documentation
```
---
## Usage
### Load Model
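```python
import pickle

import torch

from model import SpamNN

# Load model architecture + weights (map_location keeps this working on CPU-only machines)
model = SpamNN()
model.load_state_dict(torch.load("spam_nn.pth", map_location="cpu"))
model.eval()  # inference mode: disables dropout and batch-norm updates

# Load the fitted vectorizer (only unpickle files from a source you trust)
with open("vectorizer.pkl", "rb") as f:
    vectorizer = pickle.load(f)
```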
### Predict Messages
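```python
def predict(text):
    # Turn the message into a dense bag-of-words row vector
    vec = vectorizer.transform([text]).toarray()
    vec = torch.tensor(vec, dtype=torch.float32)
    with torch.no_grad():
        output = model(vec)
    # 0.35 is the decision threshold; this assumes the model returns a
    # probability. If it returns raw logits, apply torch.sigmoid first.
    return "Spam" if output.item() > 0.35 else "Ham"

# Example
print(predict("Congratulations! You won $1000. Click now!"))
```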
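Assuming the model outputs a probability, the `0.35` cutoff in `predict` is lower than the usual 0.5, so more messages get flagged as spam at the cost of more false alarms. The sketch below illustrates that trade-off on made-up scores; the helper `precision_recall` and the toy data are hypothetical and not part of the repository.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall for the positive (spam) class at a given threshold."""
    predicted = [s > threshold for s in scores]
    tp = sum(p and y == 1 for p, y in zip(predicted, labels))
    fp = sum(p and y == 0 for p, y in zip(predicted, labels))
    fn = sum((not p) and y == 1 for p, y in zip(predicted, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up model scores and true labels (1 = spam, 0 = ham), for illustration only
toy_scores = [0.92, 0.40, 0.38, 0.10, 0.55, 0.05]
toy_labels = [1, 1, 0, 0, 1, 0]

for threshold in (0.35, 0.5):
    p, r = precision_recall(toy_scores, toy_labels, threshold)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
```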
---
## 🔧 Training & Fine-Tuning
The model can be **improved and fine-tuned** by:
* Adding more data (larger SMS datasets)
* Adding bigrams to the vectorizer (`ngram_range=(1, 2)`)
* Adjusting the class weight via `pos_weight` in `BCEWithLogitsLoss`
* Training for more epochs
* Using embeddings or an LSTM for contextual understanding
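
Two of the suggestions above, bigram features and class weighting, can be sketched as follows. This is an illustration on a handful of made-up messages, not the repository's training script: the variable names and toy data are assumptions, and setting `pos_weight` to the ham-to-spam ratio is just one common heuristic.

```python
import torch
from sklearn.feature_extraction.text import CountVectorizer
from torch import nn

# Made-up messages for illustration (1 = spam, 0 = ham)
texts = [
    "free prize click now",
    "are we still on for lunch",
    "claim your free prize",
    "see you at the meeting",
    "call me when you land",
]
labels = torch.tensor([1.0, 0.0, 1.0, 0.0, 0.0])

# ngram_range=(1, 2) keeps single words and adds adjacent word pairs
bigram_vectorizer = CountVectorizer(ngram_range=(1, 2))
features = torch.tensor(
    bigram_vectorizer.fit_transform(texts).toarray(), dtype=torch.float32
)

# Weight the positive (spam) class by the ham-to-spam ratio (3 / 2 here)
num_spam = labels.sum()
num_ham = labels.numel() - num_spam
loss_fn = nn.BCEWithLogitsLoss(pos_weight=(num_ham / num_spam).reshape(1))

# A single linear layer stands in for the real model; it outputs raw logits
linear = nn.Linear(features.shape[1], 1)
loss = loss_fn(linear(features).squeeze(-1), labels)
print("free prize" in bigram_vectorizer.vocabulary_, loss.item())
```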
💡 **Fork this repo and experiment freely**. Make it your own!
---
## Support the Project
If this project is helpful:
⭐ **Give this repository a star**
🍴 **Fork it and improve it**
📢 **Share it with others learning PyTorch**
> Following and starring helps me keep releasing open-source projects!
---
## Source Code & Updates
For the **full source code, training scripts, and future updates**,
please visit the **GitHub repository** linked to this project.
---
## License
This project is **open-source**, released under the **MIT License**, and intended for **educational purposes**.
---
## 🤗 Hugging Face Friendly
You can also **upload this model to the Hugging Face Model Hub**.
Include `spam_nn.pth`, `vectorizer.pkl`, `config.json`, and `inference.py` to make it **ready for inference online**.
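
As a rough sketch of the upload step, the files listed above could be pushed with the `huggingface_hub` client. The function names `missing_files` and `upload_spam_model` and the `repo_id` argument are placeholders, and the upload needs a logged-in Hugging Face account with write access, so treat this as an outline rather than a tested deployment script.

```python
from pathlib import Path

# Files named in this README as needed for online inference
MODEL_FILES = ["spam_nn.pth", "vectorizer.pkl", "config.json", "inference.py"]


def missing_files(folder="."):
    """Return the required files that are not present in `folder`."""
    return [name for name in MODEL_FILES if not (Path(folder) / name).is_file()]


def upload_spam_model(repo_id, folder="."):
    """Upload the model files to `repo_id` (placeholder, e.g. "user/spam-nn")."""
    # Imported lazily so the check above works without the package installed
    from huggingface_hub import HfApi

    missing = missing_files(folder)
    if missing:
        raise FileNotFoundError(f"Missing files: {missing}")

    api = HfApi()
    api.create_repo(repo_id=repo_id, exist_ok=True)  # no-op if the repo exists
    for name in MODEL_FILES:
        api.upload_file(
            path_or_fileobj=str(Path(folder) / name),
            path_in_repo=name,
            repo_id=repo_id,
        )
```

Running `missing_files()` first is a cheap way to confirm the folder is complete before anything is sent over the network.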