metadata
title: Phishing URL Detection API
emoji: π
colorFrom: red
colorTo: yellow
sdk: docker
pinned: false
license: mit
app_port: 7860
Phishing URL Detection API
A FastAPI-based REST API for detecting phishing URLs using machine learning. This service analyzes URL features and webpage content to classify URLs as legitimate or phishing attempts.
Features
- Real-time URL Analysis: Extracts 43 features from URLs and their webpages
- Machine Learning: Uses a stacking ensemble model for accurate predictions
- Fast API: Built with FastAPI for high performance and automatic documentation
- Docker Support: Containerized for easy deployment
- Confidence Scores: Returns prediction confidence for better decision-making
- CORS Enabled: Accessible from web browsers
Project Structure
url-phish-fastapi/
├── main.py                        # FastAPI application
├── model/
│   ├── __init__.py                # Package initialization
│   ├── model.py                   # Model loading and prediction logic
│   ├── url_feature_extractor.py   # Feature extraction from URLs
│   └── url_stacking_model.joblib  # Pre-trained ML model
├── requirements.txt               # Python dependencies
├── Dockerfile                     # Docker configuration
├── .dockerignore                  # Docker ignore patterns
└── README.md                      # This file
API Endpoints
Health Check
- GET / - Root endpoint
- GET /health - Health check endpoint
Prediction
- POST /predict - Analyze a URL for phishing detection
Request Body:
{
"url": "http://example.com"
}
Response:
{
"url": "http://example.com",
"prediction": "legitimate",
"confidence": 0.95,
"predicted_label": 0,
"phish_probability": 0.05
}
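The request and response above can be exercised from Python with a sketch like the one below. It uses only the standard library; `build_predict_request`, `parse_prediction`, and `predict` are hypothetical helper names, and the default base URL assumes the service is running locally on port 7860.

```python
import json
import urllib.request


def build_predict_request(url, base_url="http://localhost:7860"):
    """Build a POST request for /predict carrying the documented JSON body."""
    body = json.dumps({"url": url}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_prediction(raw):
    """Decode a /predict response and keep the documented fields."""
    data = json.loads(raw)
    return {
        "url": data["url"],
        "prediction": data["prediction"],
        # These may be absent if the service could not score the URL.
        "confidence": data.get("confidence"),
        "phish_probability": data.get("phish_probability"),
    }


def predict(url, base_url="http://localhost:7860", timeout=30):
    """Send the request; needs a running API. The timeout is a guess."""
    req = build_predict_request(url, base_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return parse_prediction(resp.read())
```

With the service running, `predict("http://example.com")` would return a dictionary with the fields shown in the sample response.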
Interactive Documentation
- Swagger UI: http://localhost:7860/docs
- ReDoc: http://localhost:7860/redoc
Installation & Usage
Option 1: Local Development
- Install dependencies:
pip install -r requirements.txt
- Run the application:
python main.py
- Access the API:
- API: http://localhost:7860
- Docs: http://localhost:7860/docs
Option 2: Docker (Recommended)
- Build the Docker image:
docker build -t phishing-url-api .
- Run the container:
docker run -p 7860:7860 phishing-url-api
- Access the API:
- API: http://localhost:7860
- Docs: http://localhost:7860/docs
Option 3: Docker with Custom Port
docker run -p 8000:8000 -e PORT=8000 phishing-url-api
Testing
Run the test script to verify the API is working:
python test_api.py
Or use curl:
# Health check
curl http://localhost:7860/health
# Predict URL
curl -X POST http://localhost:7860/predict \
-H "Content-Type: application/json" \
-d '{"url": "https://www.google.com"}'
Model Information
The API uses a stacking ensemble model that combines multiple base classifiers:
- Random Forest
- Gradient Boosting
- XGBoost
- LightGBM
- Logistic Regression (meta-model)
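To illustrate the stacking idea, here is a minimal sketch using scikit-learn's `StackingClassifier`. It is not the configuration used to train `url_stacking_model.joblib`: XGBoost and LightGBM are left out to keep the example to one dependency, the hyperparameters are arbitrary, and the data is synthetic (43 columns only to mirror the feature count).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; the values have nothing to do with real URLs.
X, y = make_classification(
    n_samples=300, n_features=43, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Base classifiers produce out-of-fold predictions (cv=3) that become the
# inputs of the logistic-regression meta-model.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=20, random_state=0)),
        ("gb", GradientBoostingClassifier(n_estimators=20, random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=3,
)
stack.fit(X_train, y_train)

# Column 1 of predict_proba holds the probability of class 1.
proba = stack.predict_proba(X_test)[:, 1]
labels = stack.predict(X_test)
```

The meta-model learns how much to trust each base classifier, which is the main reason to stack rather than average.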
Features Extracted (43 total)
The model analyzes various HTML elements and webpage characteristics:
- Form elements (inputs, buttons, password fields)
- Media elements (images, videos, audio)
- Structural elements (divs, tables, lists)
- Content metrics (text length, title length)
- Interactive elements (links, scripts, iframes)
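Counting features of this kind can be sketched as follows. This example uses the standard library's `html.parser` rather than BeautifulSoup to stay dependency-free, and it covers only a handful of counts; the feature names and the `extract_html_features` helper are hypothetical, not the 43 features the model actually expects.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Count tags and measure title and text content of an HTML document."""

    def __init__(self):
        super().__init__()
        self.tags = Counter()
        self.password_fields = 0
        self.text_chars = 0
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        self.tags[tag] += 1
        # Attribute values can be None (e.g. <input type>), hence the `or ""`.
        if tag == "input" and (dict(attrs).get("type") or "").lower() == "password":
            self.password_fields += 1
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        # Counts all character data, including title and inline script text.
        self.text_chars += len(data.strip())


def extract_html_features(html):
    """Return a small dictionary of element counts and content metrics."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return {
        "num_inputs": parser.tags["input"],
        "num_buttons": parser.tags["button"],
        "num_password_fields": parser.password_fields,
        "num_images": parser.tags["img"],
        "num_links": parser.tags["a"],
        "num_scripts": parser.tags["script"],
        "num_iframes": parser.tags["iframe"],
        "title_length": len(parser.title.strip()),
        "text_length": parser.text_chars,
    }
```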
Dependencies
- FastAPI: Web framework
- Uvicorn: ASGI server
- Scikit-learn: Machine learning
- Pandas/NumPy: Data processing
- BeautifulSoup4: HTML parsing
- Requests: HTTP requests
- XGBoost/LightGBM: Gradient boosting models
Error Handling
The API handles various error scenarios:
- 400 Bad Request: Invalid or empty URL
- 500 Internal Server Error: Model loading or prediction failures
- Unknown prediction: Returned when the URL is unreachable or feature extraction fails
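One way to map these scenarios to responses is sketched below, independent of FastAPI so it can be run on its own. `InvalidURLError`, `validate_url`, and `status_for` are hypothetical names, and returning status 200 for an "unknown" prediction is an assumption; the actual handlers in `main.py` may differ.

```python
from urllib.parse import urlparse


class InvalidURLError(ValueError):
    """Raised for input that should map to a 400 Bad Request."""


def validate_url(raw):
    """Return the trimmed URL, or raise InvalidURLError."""
    if not isinstance(raw, str) or not raw.strip():
        raise InvalidURLError("URL must be a non-empty string")
    url = raw.strip()
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise InvalidURLError("URL must use http(s) and include a host")
    return url


def status_for(raw, predict):
    """Map each outcome to a (status_code, payload) pair.

    `predict` is any callable that returns a dict of result fields, or
    None when the page could not be fetched or parsed.
    """
    try:
        url = validate_url(raw)
    except InvalidURLError as exc:
        return 400, {"detail": str(exc)}
    try:
        result = predict(url)
    except Exception as exc:  # model loading or prediction failure
        return 500, {"detail": f"Prediction failed: {exc}"}
    if result is None:
        # Assumption: "unknown" is a normal response, not an error.
        return 200, {"url": url, "prediction": "unknown"}
    return 200, {"url": url, **result}
```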
Performance Considerations
- Model is loaded once on startup (singleton pattern)
- Feature extraction may take 5-10 seconds for live URLs
- Unreachable URLs return an "unknown" prediction
- TLS certificate verification is disabled when fetching pages, for broader compatibility
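The load-once behaviour can be sketched with `functools.lru_cache`. Here `_load_from_disk` is a stand-in for an expensive loader such as `joblib.load`, and `get_model` is a hypothetical accessor; the real logic lives in `model/model.py` and may be structured differently.

```python
from functools import lru_cache


def _load_from_disk(path):
    """Stand-in for an expensive loader such as joblib.load(path)."""
    return {"path": path}


@lru_cache(maxsize=1)
def get_model(path="model/url_stacking_model.joblib"):
    """Load the model on first use and return the same object afterwards."""
    return _load_from_disk(path)
```

Calling `get_model()` once during application startup moves the loading cost out of the first request; every later call returns the cached object.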
Security Notes
- The API makes outbound HTTP requests to the URLs it analyzes
- SSL certificate verification is disabled during feature extraction, so the identity of fetched hosts is not authenticated
- Use appropriate network security when deploying
- Consider rate limiting for production use
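As one possible approach to rate limiting, the sketch below implements an in-memory sliding-window limiter keyed by client (for example, an IP address). `SlidingWindowLimiter` is a hypothetical class, not part of this project, and the default limit and window are arbitrary.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls per `window` seconds for each client key."""

    def __init__(self, limit=10, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested
        self._hits = defaultdict(deque)

    def allow(self, key):
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

Because the state lives in process memory, this only works for a single worker; a deployment with several workers or containers would need a shared store instead.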
Deployment
HuggingFace Spaces
This project is configured for deployment on HuggingFace Spaces using Docker SDK.
Other Platforms
The Docker container can be deployed on:
- AWS ECS/Fargate
- Google Cloud Run
- Azure Container Instances
- Kubernetes
- Any Docker-compatible platform
License
MIT, as declared in the Space metadata.
Contributing
[Add contribution guidelines here]
Support
For bugs and questions, please open an issue.