| title: FYP4 Spam Detection API | |
| emoji: 📧 | |
| colorFrom: blue | |
| colorTo: red | |
| sdk: docker | |
| pinned: false | |
| license: mit | |
| # FYP4 Spam Detection API | |
| An email spam detection system built on transformer models (DeBERTa-v3 for text, ViT for images), with multimodal fusion of the two. | |
| ## Features | |
| - **Text-based Detection**: Uses Microsoft's DeBERTa-v3-base model for analyzing email text | |
| - **Image-based Detection**: Uses Google's ViT model for analyzing embedded images | |
| - **Multimodal Fusion**: Combines text and image features using cross-modal attention | |
| - **PDF Email Support**: Extracts and analyzes content from PDF email files | |
| - **RESTful API**: Easy-to-use FastAPI endpoints | |
| ## API Endpoints | |
| ### 1. Health Check | |
| ```http | |
| GET /health | |
| ``` | |
| Returns the status of the API and loaded models. | |
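A client may want to call this endpoint before sending predictions. The following is a minimal standard-library sketch, not part of the API itself: `health_url` and `fetch_health` are illustrative names, and because the response schema is not documented here, the decoded JSON is returned without assuming any fields.

```python
import json
import urllib.request


def health_url(base_url: str) -> str:
    """Build the /health URL from a base URL, tolerating a trailing slash."""
    return base_url.rstrip("/") + "/health"


def fetch_health(base_url: str, timeout: float = 10.0):
    """GET the health endpoint and return the decoded JSON body.

    No response fields are assumed; the parsed object is returned unchanged.
    urllib raises HTTPError for non-2xx responses and URLError when the
    server cannot be reached.
    """
    with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For a local run, `fetch_health("http://localhost:7860")` would query the API on its documented port.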
| ### 2. Text Prediction | |
| ```http | |
| POST /predict/text | |
| Content-Type: application/json | |
| { | |
| "text": "Your email text here" | |
| } | |
| ``` | |
| ### 3. PDF Prediction | |
| ```http | |
| POST /predict/pdf | |
| Content-Type: multipart/form-data | |
| file: <PDF file> | |
| ``` | |
| ## Usage Examples | |
| ### Python | |
| ```python | |
| import requests | |
| # Text prediction | |
| response = requests.post( | |
|     "https://YOUR-SPACE-URL/predict/text", | |
|     json={"text": "Congratulations! You've won $1,000,000!"} | |
| ) | |
| print(response.json()) | |
| # PDF prediction | |
| with open("email.pdf", "rb") as f: | |
|     response = requests.post( | |
|         "https://YOUR-SPACE-URL/predict/pdf", | |
|         files={"file": f} | |
|     ) | |
| print(response.json()) | |
| ``` | |
| ### cURL | |
| ```bash | |
| # Text prediction | |
| curl -X POST "https://YOUR-SPACE-URL/predict/text" \ | |
| -H "Content-Type: application/json" \ | |
| -d '{"text": "Your email text"}' | |
| # PDF prediction | |
| curl -X POST "https://YOUR-SPACE-URL/predict/pdf" \ | |
| -F "file=@email.pdf" | |
| ``` | |
| ### JavaScript | |
| ```javascript | |
| // Text prediction | |
| const response = await fetch('https://YOUR-SPACE-URL/predict/text', { | |
| method: 'POST', | |
| headers: { 'Content-Type': 'application/json' }, | |
| body: JSON.stringify({ text: 'Your email text' }) | |
| }); | |
| const data = await response.json(); | |
| console.log(data); | |
| // PDF prediction (pdfFile is a File or Blob, e.g. taken from a file input) | |
| const formData = new FormData(); | |
| formData.append('file', pdfFile); | |
| const pdfResponse = await fetch('https://YOUR-SPACE-URL/predict/pdf', { | |
|   method: 'POST', | |
|   body: formData | |
| }); | |
| const pdfData = await pdfResponse.json(); | |
| console.log(pdfData); | |
| ``` | |
| ## Response Format | |
| ### Text Prediction Response | |
| ```json | |
| { | |
| "prediction": "SPAM", | |
| "confidence": 95.67, | |
| "spam_probability": 95.67, | |
| "ham_probability": 4.33, | |
| "model_used": "text" | |
| } | |
| ``` | |
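On the client side, the fields above can be loaded into a small typed record. This is a sketch under assumptions: `TextPrediction` and `parse_text_prediction` are hypothetical names, and the check that the two probabilities sum to roughly 100 is inferred from the example values rather than guaranteed by the API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextPrediction:
    prediction: str          # "SPAM" in the documented example
    confidence: float        # percentage in the example (95.67)
    spam_probability: float  # percentage
    ham_probability: float   # percentage
    model_used: str


def parse_text_prediction(payload: dict) -> TextPrediction:
    """Convert a /predict/text JSON body into a TextPrediction."""
    result = TextPrediction(
        prediction=str(payload["prediction"]),
        confidence=float(payload["confidence"]),
        spam_probability=float(payload["spam_probability"]),
        ham_probability=float(payload["ham_probability"]),
        model_used=str(payload.get("model_used", "text")),
    )
    # Assumption: the two probabilities are complementary percentages.
    total = result.spam_probability + result.ham_probability
    if abs(total - 100.0) > 0.1:
        raise ValueError(f"probabilities sum to {total}, expected about 100")
    return result
```

With the `requests` example from earlier in this README, this would be called as `parse_text_prediction(response.json())`.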
| ### PDF Prediction Response | |
| ```json | |
| { | |
| "email_data": { | |
| "subject": "Email subject", | |
| "sender": "sender@example.com", | |
| "body": "Email body content...", | |
| "full_text": "Complete email text..." | |
| }, | |
| "text_result": { | |
| "prediction": "SPAM", | |
| "confidence": 94.5, | |
| "spam_probability": 94.5, | |
| "ham_probability": 5.5 | |
| }, | |
| "image_result": { | |
| "prediction": "SPAM", | |
| "confidence": 92.3, | |
| "spam_probability": 92.3, | |
| "ham_probability": 7.7 | |
| }, | |
| "fusion_result": { | |
| "prediction": "SPAM", | |
| "confidence": 96.8, | |
| "spam_probability": 96.8, | |
| "ham_probability": 3.2 | |
| }, | |
| "final_prediction": "SPAM", | |
| "final_confidence": 96.8 | |
| } | |
| ``` | |
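Because the PDF response nests one result per model, a client will often want a compact summary. The sketch below is illustrative only: `summarize_pdf_response` is a hypothetical helper, and it assumes that `image_result` or `fusion_result` may be missing or null (for example when a PDF contains no images), which this README does not state.

```python
RESULT_KEYS = ("text_result", "image_result", "fusion_result")


def summarize_pdf_response(payload: dict) -> dict:
    """Reduce a /predict/pdf JSON body to the final verdict and per-model votes."""
    per_model = {}
    for key in RESULT_KEYS:
        result = payload.get(key)
        if result:  # skip absent or null entries (assumed to be possible)
            per_model[key] = (result["prediction"], result["confidence"])

    labels = {label for label, _ in per_model.values()}
    return {
        "subject": (payload.get("email_data") or {}).get("subject"),
        "final": (payload["final_prediction"], payload["final_confidence"]),
        "per_model": per_model,
        # True when every available model returned the same label.
        "models_agree": len(labels) <= 1,
    }
```

A disagreement between models (`models_agree` is `False`) may be a useful signal for routing a message to manual review.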
| ## Model Architecture | |
| ### Text Model (DeBERTa-v3-base) | |
| - Pre-trained Microsoft DeBERTa-v3-base | |
| - Custom projection layer to 512-dimensional fusion space | |
| - Multi-layer classifier with LayerNorm and GELU activation | |
| ### Image Model (ViT-base) | |
| - Pre-trained Google ViT-base-patch16-224 | |
| - Custom projection layer to 512-dimensional fusion space | |
| - Multi-layer classifier with LayerNorm and GELU activation | |
| ### Fusion Model | |
| - Combines text and image encoders | |
| - Cross-modal attention mechanism for feature fusion | |
| - Joint classification head for final prediction | |
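The architecture described above can be sketched in PyTorch as follows. This is not the trained model's actual code: the README gives only the 512-dimensional fusion space, LayerNorm, GELU, and cross-modal attention, so the head count, pooling strategy, and classifier layer sizes below are assumptions. The pretrained encoders are replaced by their output tensors (hidden size 768 for both DeBERTa-v3-base and ViT-base) to keep the example self-contained.

```python
import torch
from torch import nn

HIDDEN = 768  # hidden size of both DeBERTa-v3-base and ViT-base
FUSION = 512  # fusion-space width stated in this README


class CrossModalFusion(nn.Module):
    """Illustrative fusion head; sizes and pooling are assumptions."""

    def __init__(self, num_heads: int = 8, num_classes: int = 2):
        super().__init__()
        # Project each encoder's token features into the shared fusion space.
        self.text_proj = nn.Linear(HIDDEN, FUSION)
        self.image_proj = nn.Linear(HIDDEN, FUSION)
        # Text tokens attend over image patches, and the reverse.
        self.text_to_image = nn.MultiheadAttention(FUSION, num_heads, batch_first=True)
        self.image_to_text = nn.MultiheadAttention(FUSION, num_heads, batch_first=True)
        # Joint classifier with LayerNorm and GELU.
        self.head = nn.Sequential(
            nn.LayerNorm(2 * FUSION),
            nn.Linear(2 * FUSION, FUSION),
            nn.GELU(),
            nn.Linear(FUSION, num_classes),
        )

    def forward(self, text_tokens: torch.Tensor, image_patches: torch.Tensor) -> torch.Tensor:
        # text_tokens: (batch, text_len, 768); image_patches: (batch, n_patches, 768)
        t = self.text_proj(text_tokens)
        v = self.image_proj(image_patches)
        t_att, _ = self.text_to_image(query=t, key=v, value=v)
        v_att, _ = self.image_to_text(query=v, key=t, value=t)
        # Mean-pool each attended sequence, then concatenate for classification.
        pooled = torch.cat([t_att.mean(dim=1), v_att.mean(dim=1)], dim=-1)
        return self.head(pooled)  # (batch, num_classes) logits
```

In a full model, `text_tokens` and `image_patches` would come from the `last_hidden_state` of the two pretrained encoders; ViT-base-patch16-224 produces 197 tokens per image (196 patches plus a class token).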
| ## Setup Instructions | |
| 1. **Prepare your trained models**: Place your `.pth` model files in the `models/` directory: | |
| - `models/text_model.pth` | |
| - `models/image_model.pth` | |
| - `models/fusion_model.pth` | |
| 2. **Deploy to Hugging Face Spaces**: | |
| - Create a new Space on Hugging Face | |
| - Select Docker as the SDK | |
| - Upload all files from this repository | |
| - The API will automatically start on port 7860 | |
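Before uploading, it can help to confirm that all three checkpoints from step 1 are in place. A minimal sketch; `missing_model_files` is a hypothetical helper and is not part of this repository.

```python
from pathlib import Path

# File names listed in step 1 of the setup instructions.
REQUIRED_MODEL_FILES = ("text_model.pth", "image_model.pth", "fusion_model.pth")


def missing_model_files(models_dir: str = "models") -> list:
    """Return the required checkpoint names that are absent from models_dir."""
    base = Path(models_dir)
    return [name for name in REQUIRED_MODEL_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files()
    if missing:
        print("Missing model files:", ", ".join(missing))
    else:
        print("All model files found.")
```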
| ## Local Development | |
| ```bash | |
| # Install dependencies | |
| pip install -r requirements.txt | |
| # Run the API | |
| python app.py | |
| ``` | |
| The API will then be available at `http://localhost:7860`. | |
| ## Requirements | |
| - Python 3.10+ | |
| - PyTorch 2.1.0+ | |
| - Transformers 4.35.2+ | |
| - FastAPI 0.104.1+ | |
| - See `requirements.txt` for complete list | |
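The minimum versions above can be checked against the current environment with the standard library. This is a rough sketch: `outdated_packages` is a hypothetical helper, the version parser keeps only the leading numeric components (so local tags such as `+cu118` are ignored), and `requirements.txt` remains the authoritative list.

```python
import re
from importlib import metadata

# Minimums taken from the list above, keyed by PyPI distribution name.
MINIMUM_VERSIONS = {"torch": "2.1.0", "transformers": "4.35.2", "fastapi": "0.104.1"}


def parse_version(text: str) -> tuple:
    """Return the leading dotted-number part of a version string as ints."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def at_least(installed: str, minimum: str) -> bool:
    """Compare two versions after zero-padding them to the same length."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def outdated_packages(minimums: dict = MINIMUM_VERSIONS) -> dict:
    """Map each package that is missing or too old to a short explanation."""
    problems = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if not at_least(installed, minimum):
            problems[name] = f"{installed} < {minimum}"
    return problems
```

An empty dictionary from `outdated_packages()` suggests the three listed libraries meet their minimums.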
| ## Model Files | |
| ⚠️ **Important**: This repository does not include the trained model weights. You need to: | |
| 1. Train the models using the training script | |
| 2. Save the model checkpoints (`.pth` files) | |
| 3. Upload them to the `models/` directory in your Hugging Face Space | |
| ## Performance | |
| The models are designed around the following goals: | |
| - **Accuracy**: High precision, so that legitimate mail is rarely flagged as spam | |
| - **Speed**: Inference runs on either CPU or GPU | |
| - **Multimodal**: Uses both text and image features | |
| - **Scalability**: Serves concurrent requests through FastAPI | |
| ## License | |
| MIT License - See LICENSE file for details | |
| ## Citation | |
| If you use this API in your research, please cite: | |
| ```bibtex | |
| @misc{fyp4_spam_detection, | |
| title={FYP4 Spam Detection: Multimodal Email Spam Classification}, | |
| author={Your Name}, | |
| year={2024}, | |
| howpublished={\url{https://huggingface.co/spaces/YOUR-USERNAME/fyp4-spam-detection}} | |
| } | |
| ``` | |
| ## Contact | |
| For questions or issues, please open an issue on GitHub or contact the author. | |
| --- | |
| Built with ❤️ using PyTorch, Transformers, and FastAPI | |