amide-models / README.md

Samarth Naik

Update /compute endpoint to run all 3 models simultaneously with packet counting

a7252f1 2 months ago


	---
	title: Amide Models
	emoji: 🔬
	colorFrom: blue
	colorTo: indigo
	sdk: docker
	app_file: app.py
	pinned: false
	---

	# Amide Models - Breach Prediction API

	A Flask-based API for network breach prediction using multiple machine learning models.

	## Overview

	This project provides a REST API for predicting network security breaches using three different machine learning models:

	- LightGBM: Gradient boosting model for breach classification
	- Autoencoder: Deep learning model for anomaly detection
	- XGB-LSTM: Hybrid XGBoost and LSTM model for sequence prediction

	## Features

	- REST API endpoints for model inference
	- Support for multiple ML models
	- Input validation and error handling
	- Health check endpoint
	- Model information endpoint
	- Docker containerized deployment

	## API Endpoints

	### GET `/health`
	Health check endpoint to verify the service is running.

	Response:
	```json
	{
	  "status": "healthy"
	}
	```
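
A minimal client-side sketch of this check, using only the standard library. The function name `check_health`, the default base URL, and the injectable `opener` parameter are illustrative assumptions, not part of the project.

```python
import json
import urllib.request


def check_health(base_url="http://localhost:5000", opener=urllib.request.urlopen):
    """Return True if GET /health answers with {"status": "healthy"}.

    `opener` is a hypothetical hook so the call can be replaced in tests.
    """
    url = base_url.rstrip("/") + "/health"
    with opener(url, timeout=5) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return payload.get("status") == "healthy"
```

Note that `urllib.request.urlopen` raises an exception if the service is unreachable, so callers may want to wrap the call in `try`/`except`.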

	### GET `/models`
	Returns available models and their configuration.

	Response:
	```json
	{
	  "available_models": {
	    "lightGBM": {
	      "file": "lightGBM.py",
	      "available": true,
	      "interface": "hardcoded"
	    },
	    "autoencoder": {
	      "file": "autoencoder.py",
	      "available": true,
	      "interface": "hardcoded"
	    },
	    "XGB_lstm": {
	      "file": "XGB_lstm.py",
	      "available": true,
	      "interface": "argparse"
	    }
	  },
	  "required_columns": ["timestamp", "src_ip", "dst_ip", "src_port", "dst_port"]
	}
	```

	### POST `/compute`
	Runs breach prediction on the supplied network logs using all three models simultaneously. The `file` field holds a list of log records, one object per packet.

	Request:
	```json
	{
	  "file": [
	    {
	      "timestamp": "2024-01-01T10:00:00",
	      "src_ip": "192.168.1.100",
	      "dst_ip": "10.0.0.1",
	      "src_port": 12345,
	      "dst_port": 80,
	      "packet_size": 1500,
	      "seq": 1000,
	      "ack": 2000,
	      "tcp_flags": 2,
	      "window": 65535
	    },
	    {
	      "timestamp": "2024-01-01T10:00:01",
	      "src_ip": "192.168.1.101",
	      "dst_ip": "10.0.0.2",
	      "src_port": 12346,
	      "dst_port": 443,
	      "packet_size": 1500,
	      "seq": 1001,
	      "ack": 2001,
	      "tcp_flags": 2,
	      "window": 65535
	    }
	  ]
	}
	```
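
A request like the one above could be sent from Python roughly as follows. This is a sketch under assumptions: the helper names `build_compute_request` and `post_compute` are hypothetical, and the `Content-Type: application/json` header is assumed because the body is JSON.

```python
import json
import urllib.request


def build_compute_request(records, base_url="http://localhost:5000"):
    """Build (but do not send) a POST /compute request for a list of log records."""
    body = json.dumps({"file": records}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/compute",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


def post_compute(records, base_url="http://localhost:5000", timeout=60):
    """Send the request and return the decoded JSON response."""
    request = build_compute_request(records, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect without a running server. `urlopen` raises `urllib.error.HTTPError` on non-2xx responses, so error bodies need to be read from the exception.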

	Response:
	```json
	{
	  "success": true,
	  "packets": {
	    "total": 2,
	    "unique_flows": 2
	  },
	  "models": {
	    "lightGBM": {
	      "success": true,
	      "output": "Model execution output",
	      "predictions": [
	        {
	          "timestamp": "2024-01-01T10:00:00",
	          "src_ip": "192.168.1.100",
	          "breach_probability": 0.95,
	          "breach_predicted": 1
	        }
	      ],
	      "error": null
	    },
	    "autoencoder": {
	      "success": true,
	      "output": "Model execution output",
	      "predictions": [
	        {
	          "timestamp": "2024-01-01T10:00:00",
	          "anomaly_score": 0.87,
	          "is_anomaly": true
	        }
	      ],
	      "error": null
	    },
	    "XGB_lstm": {
	      "success": true,
	      "output": "Model execution output",
	      "predictions": [
	        {
	          "timestamp": "2024-01-01T10:00:00",
	          "breach_risk": 0.92,
	          "prediction": 1
	        }
	      ],
	      "error": null
	    }
	  }
	}
	```

	Response Format:
	- `success`: Overall status; `true` only if all models succeeded
	- `packets.total`: Total number of packets in the request
	- `packets.unique_flows`: Number of unique network flows (src_ip:src_port → dst_ip:dst_port)
	- `models`: Object keyed by model name, holding each model's results
	- Each model entry includes: `success` (bool), `output` (stdout), `predictions` (array), `error` (stderr)
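
To make the top-level fields concrete, here is a sketch of how `packets` and the overall `success` flag could be derived from the definitions above. The function names are hypothetical and this is not necessarily how `app.py` computes them; treating an empty `models` object as a failure is an assumption.

```python
def count_packets(records):
    """Count packets and distinct (src_ip, src_port, dst_ip, dst_port) flows."""
    flows = {
        (r["src_ip"], r["src_port"], r["dst_ip"], r["dst_port"])
        for r in records
    }
    return {"total": len(records), "unique_flows": len(flows)}


def overall_success(models):
    """True only when at least one model ran and every model reports success."""
    # Assumption: an empty result set counts as failure.
    return bool(models) and all(
        result.get("success") is True for result in models.values()
    )
```

Two packets that share the same source and destination address and port pair count as one flow, so `unique_flows` can be smaller than `total`.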

	## Required Input Columns

	- `timestamp`: Timestamp of the network flow
	- `src_ip`: Source IP address
	- `dst_ip`: Destination IP address
	- `src_port`: Source port
	- `dst_port`: Destination port

	Additional columns such as `packet_size`, `seq`, `ack`, `tcp_flags`, and `window` (all present in the example request) are optional but recommended for better predictions.
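
Clients may want to check records for the required columns before calling the API. The helper below is an illustrative sketch: `find_missing_columns` is a hypothetical name, and the server's own validation may behave differently.

```python
REQUIRED_COLUMNS = ("timestamp", "src_ip", "dst_ip", "src_port", "dst_port")


def find_missing_columns(records, required=REQUIRED_COLUMNS):
    """Map record index -> list of required columns absent from that record."""
    problems = {}
    for index, record in enumerate(records):
        missing = [column for column in required if column not in record]
        if missing:
            problems[index] = missing
    return problems
```

An empty result means every record carries all five required columns.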

	## Installation

	### Local Setup

	```bash
	pip install -r requirements.txt
	python app.py
	```

	The API will be available at `http://localhost:5000`.

	### Docker Setup

	```bash
	docker build -t amide-models .
	docker run -p 5000:5000 amide-models
	```

	## Requirements

	- Python 3.8+
	- Flask
	- Flask-CORS
	- pandas
	- numpy
	- scikit-learn
	- tensorflow
	- lightgbm
	- xgboost

	See `requirements.txt` for all dependencies.

	## Models

	### LightGBM
	Gradient boosting classifier optimized for network breach detection. It uses a 60% (0.6) probability threshold for breach classification.
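
The thresholding step can be sketched as below. Whether the model treats a probability of exactly 0.6 as a breach is not stated here, so the inclusive comparison is an assumption, as is the helper name `classify_breach`.

```python
BREACH_THRESHOLD = 0.6  # the 60% threshold described above


def classify_breach(probability, threshold=BREACH_THRESHOLD):
    """Return 1 when the probability reaches the threshold, else 0."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError(f"probability must be in [0, 1], got {probability}")
    # Assumption: the boundary value itself counts as a breach.
    return 1 if probability >= threshold else 0
```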

	### Autoencoder
	Deep learning model using autoencoders to detect anomalous network patterns.

	### XGB-LSTM
	Hybrid model combining XGBoost with LSTM for temporal sequence analysis of network flows.

	## Output

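	Each model's predictions include:
	- A score between 0 and 1 (`breach_probability`, `anomaly_score`, or `breach_risk` in the example response)
	- A binary decision (`breach_predicted`, `is_anomaly`, or `prediction` in the example response)
	- Confidence metrics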
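
Because the example `/compute` response uses different field names per model, a consumer may find it convenient to flatten the predictions into one shape. The sketch below assumes the field names shown in that example; `PREDICTION_FIELDS` and `normalize_predictions` are hypothetical names.

```python
# (score field, decision field) per model, taken from the example response.
PREDICTION_FIELDS = {
    "lightGBM": ("breach_probability", "breach_predicted"),
    "autoencoder": ("anomaly_score", "is_anomaly"),
    "XGB_lstm": ("breach_risk", "prediction"),
}


def normalize_predictions(models):
    """Flatten per-model predictions into rows with a common score/flag shape."""
    rows = []
    for name, result in models.items():
        fields = PREDICTION_FIELDS.get(name)
        if fields is None:
            continue  # unknown model: skip rather than guess its field names
        score_key, flag_key = fields
        # `predictions` may be missing or null when a model failed.
        for prediction in result.get("predictions") or []:
            rows.append({
                "model": name,
                "timestamp": prediction.get("timestamp"),
                "score": float(prediction[score_key]),
                "flagged": bool(prediction[flag_key]),
            })
    return rows
```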

	## Development

	To modify or add new models:

	1. Create a new Python file with your model
	2. Update `MODEL_CONFIGS` in `app.py` with the new model configuration
	3. Ensure your model accepts input through one of the two supported interfaces:
	   - Hardcoded filename: reads from `network_logs.csv`
	   - Argparse: accepts a `--logfile` command-line argument
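
A new model script could cover both input styles by accepting `--logfile` and defaulting to `network_logs.csv`. The skeleton below is a hypothetical starting point, not one of the existing model files: the function names are illustrative and the inference step is left as a placeholder.

```python
import argparse
import csv


def parse_args(argv=None):
    """Parse command-line arguments; --logfile defaults to the hardcoded name."""
    parser = argparse.ArgumentParser(description="Hypothetical new breach model")
    parser.add_argument(
        "--logfile",
        default="network_logs.csv",
        help="CSV file containing the network logs",
    )
    return parser.parse_args(argv)


def load_rows(path):
    """Read the CSV log file into a list of dictionaries, one per packet."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


def main(argv=None):
    args = parse_args(argv)
    rows = load_rows(args.logfile)
    # Placeholder: real feature extraction and model inference would go here.
    print(f"Loaded {len(rows)} rows from {args.logfile}")
    return rows

# When saving this as a script, call main() from an
# `if __name__ == "__main__":` guard.
```

How the script's output is turned into the `predictions` array is decided by `app.py` and is not covered by this sketch.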

	## License

	MIT

	## Contact

	For issues and questions, please check the repository documentation.