title: NetworkSecurity
emoji: 💻
colorFrom: blue
colorTo: blue
sdk: docker
pinned: false
license: mit
Network Security System: Phishing URL Detection
Table of Contents
- About The Project
- Architecture
- Features
- Tech Stack
- Dataset
- Project Structure
- Pipeline Workflow
- Screenshots
- Installation
- Usage
- Model Performance
- Experiment Tracking
- Future Enhancements
- Contributing
- License
- Contact
Live Demo
- Live Application: inderjeet-networksecurity.hf.space
- Experiment Tracking: DagsHub Experiments
About The Project
Phishing attacks are becoming increasingly sophisticated. This project implements a Network Security machine learning pipeline designed to detect phishing URLs with high accuracy.
It follows a modular MLOps architecture aimed at scalability, maintainability, and reproducibility. The system automates the flow from data ingestion to model deployment, and includes data drift detection and automated model evaluation.
Architecture
The system follows a strict modular pipeline architecture, orchestrated by a central training pipeline.
Features
- End-to-End Pipeline: Fully automated workflow from data ingestion to model deployment.
- Data Validation: Comprehensive schema checks and data drift detection using KS tests.
- Robust Preprocessing: Automated handling of missing values (KNN Imputer) and feature scaling (Robust Scaler).
- Multi-Model Training: Experiments with RandomForest, DecisionTree, GradientBoosting, and AdaBoost using GridSearchCV.
- Experiment Tracking: Integrated with MLflow and DagsHub for tracking parameters, metrics, and models.
- Fast API: High-performance REST API built with FastAPI for real-time predictions.
- Containerized: Docker support for consistent deployment across environments.
- Cloud Ready: Designed to be deployed on platforms like AWS or Hugging Face Spaces.
Tech Stack
- Languages: Python 3.8+
- Frameworks: FastAPI, Uvicorn
- ML Libraries: Scikit-learn, Pandas, NumPy
- MLOps: MLflow, DagsHub
- Database: MongoDB
- Containerization: Docker
- Frontend: HTML, CSS (Custom Design System), JavaScript
Dataset
The project utilizes a dataset containing various URL features to distinguish between legitimate and phishing URLs.
- Source: Phishing Dataset for Machine Learning (or similar Phishing URL dataset)
- Features: IP Address, URL Length, TinyURL, forwarding, etc.
- Target: Result (LEGITIMATE / PHISHING)
Project Structure
NetworkSecurity/
├── images/              # Project diagrams and screenshots
├── networksecurity/     # Main package
│   ├── components/      # Pipeline components (Ingestion, Validation, Transformation, Training)
│   ├── pipeline/        # Training and Prediction pipelines
│   ├── entity/          # Artifact and Config entities
│   ├── constants/       # Project constants
│   ├── utils/           # Utility functions
│   └── exception/       # Custom exception handling
├── data_schema/         # Schema definitions
├── Dockerfile           # Docker configuration
├── app.py               # FastAPI application entry point
├── requirements.txt     # Project dependencies
└── README.md            # Project documentation
Pipeline Workflow
1. Data Ingestion
Fetches data from MongoDB, handles fallback to local CSV, and performs train-test split.
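
The fallback-and-split behaviour described above can be sketched roughly as follows. This is an illustration, not the project's actual code: `load_with_fallback`, `split_train_test`, the injected `fetch_from_db` callable (standing in for the MongoDB query), and the 80/20 split ratio are all assumptions.

```python
from typing import Callable, Tuple

import pandas as pd
from sklearn.model_selection import train_test_split


def load_with_fallback(fetch_from_db: Callable[[], pd.DataFrame], csv_path: str) -> pd.DataFrame:
    """Try the database first; read the local CSV if the fetch fails or returns no rows."""
    try:
        df = fetch_from_db()
        if not df.empty:
            return df
    except Exception as exc:  # broad on purpose: any database failure triggers the fallback
        print(f"Database fetch failed ({exc}); falling back to {csv_path}")
    return pd.read_csv(csv_path)


def split_train_test(
    df: pd.DataFrame, test_size: float = 0.2, seed: int = 42
) -> Tuple[pd.DataFrame, pd.DataFrame]:
    """Split the ingested rows into train and test frames (ratio is a hypothetical default)."""
    train_df, test_df = train_test_split(df, test_size=test_size, random_state=seed)
    return train_df, test_df
```

Passing the database query in as a callable keeps the sketch independent of any particular MongoDB client and makes the fallback path easy to exercise in a test.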

2. Data Validation
Validates data against schema and checks for data drift.
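
As a rough illustration of this step, the sketch below checks for missing columns and runs a two-sample Kolmogorov-Smirnov test per feature using `scipy.stats.ks_2samp`. The function names, the report layout, and the 0.05 significance threshold are assumptions rather than the project's actual settings.

```python
from typing import Dict, List

import numpy as np
import pandas as pd
from scipy.stats import ks_2samp


def missing_columns(df: pd.DataFrame, expected: List[str]) -> List[str]:
    """Return the expected column names that are absent from the frame."""
    return [name for name in expected if name not in df.columns]


def drift_report(reference: pd.DataFrame, current: pd.DataFrame, alpha: float = 0.05) -> Dict[str, dict]:
    """Run a two-sample KS test per column; flag drift when the p-value is below alpha."""
    report = {}
    for column in reference.columns:
        result = ks_2samp(reference[column], current[column])
        report[column] = {
            "p_value": float(result.pvalue),
            "drift": bool(result.pvalue < alpha),
        }
    return report


# Synthetic stand-in data: one stable feature and one whose distribution has shifted.
rng = np.random.default_rng(0)
train = pd.DataFrame({"stable": rng.normal(0, 1, 500), "shifted": rng.normal(0, 1, 500)})
test = pd.DataFrame({"stable": train["stable"].to_numpy(), "shifted": rng.normal(3, 1, 500)})
report = drift_report(train, test)
```

A low p-value means the two samples are unlikely to come from the same distribution, which is what the drift flag records.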

3. Data Transformation
Imputes missing values and scales features for optimal model performance.
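
A minimal sketch of such a transformation, assuming scikit-learn's `KNNImputer` and `RobustScaler` are chained in a `Pipeline`. The helper name `build_preprocessor`, the neighbour count, and the toy matrix are illustrative assumptions.

```python
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import RobustScaler


def build_preprocessor(n_neighbors: int = 3) -> Pipeline:
    """Impute missing values from the nearest rows, then scale by median and IQR."""
    return Pipeline(
        steps=[
            ("imputer", KNNImputer(n_neighbors=n_neighbors)),
            ("scaler", RobustScaler()),
        ]
    )


# Toy feature matrix with one missing value (hypothetical data).
X_train = np.array(
    [
        [1.0, 10.0],
        [2.0, np.nan],
        [3.0, 30.0],
        [4.0, 40.0],
    ]
)
preprocessor = build_preprocessor(n_neighbors=2)
X_scaled = preprocessor.fit_transform(X_train)
```

Fitting the pipeline on the training split only, then calling `transform` on the test split, avoids leaking test statistics into the imputer and scaler.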

4. Model Training
Trains and tunes multiple models, selecting the best one based on F1-score/Accuracy.
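
One way to picture the selection step is shown below, assuming each candidate is tuned with `GridSearchCV` and the winner is chosen by F1 score on a held-out split. The parameter grids, the `select_best_model` helper, and the synthetic data from `make_classification` are placeholders, not the project's real search space or dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical, deliberately small grids so the sketch runs quickly.
CANDIDATES = {
    "RandomForest": (RandomForestClassifier(random_state=0), {"n_estimators": [25, 50]}),
    "DecisionTree": (DecisionTreeClassifier(random_state=0), {"max_depth": [3, 5]}),
    "GradientBoosting": (GradientBoostingClassifier(random_state=0), {"learning_rate": [0.05, 0.1]}),
    "AdaBoost": (AdaBoostClassifier(random_state=0), {"n_estimators": [25, 50]}),
}


def select_best_model(X_train, y_train, X_test, y_test):
    """Tune every candidate with cross-validation and keep the best held-out F1 score."""
    best_name, best_model, best_f1 = None, None, -1.0
    for name, (estimator, grid) in CANDIDATES.items():
        search = GridSearchCV(estimator, grid, scoring="f1", cv=3)
        search.fit(X_train, y_train)
        score = f1_score(y_test, search.predict(X_test))
        if score > best_f1:
            best_name, best_model, best_f1 = name, search.best_estimator_, score
    return best_name, best_model, best_f1


# Synthetic binary data standing in for the phishing features.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, stratify=y, random_state=0)
best_name, best_model, best_f1 = select_best_model(X_tr, y_tr, X_te, y_te)
```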

Screenshots
Prediction Results & Threat Assessment
Experiment Tracking (DagsHub/MLflow)
Installation
Prerequisites
- Python 3.8+
- MongoDB Account
- DagsHub Account (for experiment tracking)
Step-by-Step
1. Clone the Repository
   git clone https://github.com/Inder-26/NetworkSecurity.git
   cd NetworkSecurity
2. Create Virtual Environment
   python -m venv .venv
   source .venv/bin/activate  # On Windows: .venv\Scripts\activate
3. Install Dependencies
   pip install -r requirements.txt
4. Set Environment Variables
   Create a .env file with your credentials:
   MONGO_DB_URL=your_mongodb_url_here
   MLFLOW_TRACKING_URI=https://dagshub.com/your_username/project.mlflow
   MLFLOW_TRACKING_USERNAME=your_username
   MLFLOW_TRACKING_PASSWORD=your_password
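
Before starting the application it can help to confirm that these variables are actually set. The check below is a small sketch using only the standard library. It assumes the values have already been loaded into the process environment (for example by a .env loader, which is not shown here), and `missing_env_vars` is a hypothetical helper name.

```python
import os
from typing import List, Mapping, Optional

# Variable names taken from the .env example above.
REQUIRED_VARS = (
    "MONGO_DB_URL",
    "MLFLOW_TRACKING_URI",
    "MLFLOW_TRACKING_USERNAME",
    "MLFLOW_TRACKING_PASSWORD",
)


def missing_env_vars(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_env_vars()` with no arguments inspects the real environment; passing a dictionary makes the check easy to test.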
Usage
Run the Web Application
python app.py
Visit http://localhost:8000 to access the UI.
Train a New Model
To trigger the training pipeline:
http://localhost:8000/train
Or use the "Train New Model" button in the UI.
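
Training can also be triggered from a script. The sketch below assumes the /train route responds to a plain GET request, as the URL above suggests, and that the app is running locally on port 8000. The helper names and the timeout value are hypothetical.

```python
from typing import Tuple
from urllib.parse import urljoin
from urllib.request import urlopen


def train_url(base_url: str = "http://localhost:8000") -> str:
    """Build the training endpoint URL from the application's base URL."""
    return urljoin(base_url.rstrip("/") + "/", "train")


def trigger_training(base_url: str = "http://localhost:8000", timeout: float = 600.0) -> Tuple[int, str]:
    """Send a GET request to the training endpoint and return (status code, response body)."""
    with urlopen(train_url(base_url), timeout=timeout) as response:
        return response.status, response.read().decode("utf-8", errors="replace")
```

With the server running, `trigger_training()` returns the HTTP status and response text. The generous timeout reflects that a training run may take a while to respond.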
Model Performance
The system evaluates models using accuracy and F1 score.
- Best Model: [Automatically selected, typically RandomForest or GradientBoosting]
- Recall: Optimized to minimize false negatives (missing a phishing URL is dangerous).
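
To make the metrics concrete, the sketch below computes accuracy, precision, recall, and F1 from raw predictions in plain Python. It assumes phishing is encoded as the positive class `1`; the label encoding and the example vectors are illustrative only.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # missed phishing URLs
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels: 1 = phishing, 0 = legitimate.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

Recall is the fraction of actual phishing URLs that were caught, so each false negative (a phishing URL labelled legitimate) lowers it directly.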
Model Evaluation Metrics
Below are the performance visualizations for the best trained model:
Confusion Matrix
ROC Curve
Precision-Recall Curve
Experiment Tracking
All runs are logged to DagsHub. You can view parameters, metrics, and models in the MLflow UI.
Future Enhancements
- Implement Deep Learning models (LSTM/CNN) for URL text analysis.
- Add real-time browser extension.
- Deploy serverless architecture.
- Add more comprehensive unit and integration tests.
Contributing
Contributions are welcome! Please fork the repository and create a pull request.
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
License
Distributed under the MIT License. See LICENSE for more information.
Contact
Inder - GitHub Profile










