---
license: mit
language:
- en
tags:
- email
- spam
- spamdetection
---

# 📩 Spam Detection Neural Network (PyTorch)

[![Python](https://img.shields.io/badge/python-3.10-blue.svg)](https://www.python.org/)
[![PyTorch](https://img.shields.io/badge/pytorch-2.1-red.svg)](https://pytorch.org/)
[![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)

A **simple, real-world spam detection neural network** built from scratch in **PyTorch**.
This model classifies SMS / short text messages as **Spam** or **Ham (Not Spam)**.

The project is **small, easy to understand, and well suited to learning**. You can fork it, fine-tune it, and use it as a **starting point for your own projects**.

---

## 🧠 Model Overview

- **Framework:** PyTorch
- **Architecture:** Fully Connected Neural Network (MLP)
- **Input:** Bag-of-Words text vectors
- **Output:** Binary classification (Spam / Ham)
- **Training:** From scratch, small dataset (~5,500 messages)

> ⚠️ Note: The dataset is intentionally small to keep things simple.
> You are encouraged to **fork the repo, add more data, and fine-tune the model**.

---

## 📂 Repository Structure

```
.
├── spam_nn.pth       # Trained PyTorch model weights
├── vectorizer.pkl    # CountVectorizer for text preprocessing
├── model.py          # Neural network architecture
├── config.json       # Model configuration
├── inference.py      # Inference / prediction script
└── README.md         # Documentation
```

---

## 🚀 Usage

### Load Model

```python
import pickle

import torch

from model import SpamNN

# Load model architecture + weights
model = SpamNN()
model.load_state_dict(torch.load("spam_nn.pth", map_location="cpu"))
model.eval()  # switch to inference mode

# Load the fitted vectorizer (only unpickle files you trust)
with open("vectorizer.pkl", "rb") as f:
    vectorizer = pickle.load(f)
```

### Predict Messages

```python
def predict(text):
    # Turn the message into a dense Bag-of-Words vector of shape (1, vocab_size)
    vec = vectorizer.transform([text]).toarray()
    vec = torch.tensor(vec, dtype=torch.float32)
    with torch.no_grad():
        output = model(vec)
    # 0.35 is the decision threshold. This assumes the model returns a
    # probability; if it returns raw logits, apply torch.sigmoid(output) first.
    return "Spam" if output.item() > 0.35 else "Ham"

# Example
print(predict("Congratulations! You won $1000. Click now!"))
```

---

## 🔧 Training & Fine-Tuning

The model can be **improved and fine-tuned** by:

* Adding more data (larger SMS datasets)
* Widening the n-gram range (`ngram_range=(1, 2)`)
* Adjusting class weights in `BCEWithLogitsLoss`
* Training for more epochs
* Using embeddings or an LSTM for contextual understanding

💡 **Fork this repo and experiment freely**. Make it your own!

---

## 🌟 Support the Project

If this project is helpful:

⭐ **Give this repository a star**

🍴 **Fork it and improve it**

📢 **Share it with others learning PyTorch**

> Following and starring helps me keep releasing open-source projects!

---

## 📌 Source Code & Updates

For the **full source code, training scripts, and future updates**, please visit the **GitHub repository** linked to this project.

---

## 📜 License

This project is **open-source** and intended for **educational purposes**. It is released under the MIT License.

---

## 🤗 Hugging Face Friendly

You can also **upload this model to the Hugging Face Model Hub**.
Include `spam_nn.pth`, `vectorizer.pkl`, `config.json`, and `inference.py` to make it **ready for inference online**.
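## 🧪 Example: What `ngram_range=(1, 2)` Adds

The fine-tuning list above suggests widening the n-gram range. The sketch below uses only the standard library to illustrate what that setting changes in a Bag-of-Words representation: with bigrams enabled, word pairs such as "click now" become features of their own. The helpers `tokenize` and `ngram_counts` are hypothetical names written for this illustration; they are not part of this repository, and they only approximate scikit-learn's default `CountVectorizer` tokenization (lowercasing, tokens of two or more word characters).

```python
import re
from collections import Counter

# Approximates CountVectorizer's default token pattern:
# tokens of two or more word characters, after lowercasing.
TOKEN_PATTERN = re.compile(r"(?u)\b\w\w+\b")


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return TOKEN_PATTERN.findall(text.lower())


def ngram_counts(text, ngram_range=(1, 1)):
    """Count every n-gram in the text for n between min_n and max_n."""
    tokens = tokenize(text)
    min_n, max_n = ngram_range
    counts = Counter()
    for n in range(min_n, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


message = "Click now to claim your free prize now"
unigrams = ngram_counts(message, ngram_range=(1, 1))
with_bigrams = ngram_counts(message, ngram_range=(1, 2))

print(unigrams["now"])                   # 2
print("click now" in unigrams)           # False
print(with_bigrams["click now"])         # 1
print(len(unigrams), len(with_bigrams))  # 7 14
```

Because bigrams enlarge the vocabulary (7 features become 14 for this single message), a vectorizer refitted with `ngram_range=(1, 2)` produces longer input vectors, so the network's input layer would need to match the new vocabulary size and the model would need to be retrained.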