Samarth Naik
---
title: Amide Models
emoji: 🔬
colorFrom: blue
colorTo: indigo
sdk: docker
app_file: app.py
pinned: false
---
# Amide Models - Breach Prediction API
A Flask-based API for network breach prediction using multiple machine learning models.
## Overview
This project provides a REST API for predicting network security breaches using three different machine learning models:
- **LightGBM**: Gradient boosting model for breach classification
- **Autoencoder**: Deep learning model for anomaly detection
- **XGB-LSTM**: Hybrid XGBoost and LSTM model for sequence prediction
## Features
- REST API endpoints for model inference
- Support for multiple ML models
- Input validation and error handling
- Health check endpoint
- Model information endpoint
- Docker containerized deployment
## API Endpoints
### GET `/health`
Health check endpoint to verify the service is running.
**Response:**
```json
{
"status": "healthy"
}
```
### GET `/models`
Returns available models and their configuration.
**Response:**
```json
{
"available_models": {
"lightGBM": {
"file": "lightGBM.py",
"available": true,
"interface": "hardcoded"
},
"autoencoder": {
"file": "autoencoder.py",
"available": true,
"interface": "hardcoded"
},
"XGB_lstm": {
"file": "XGB_lstm.py",
"available": true,
"interface": "argparse"
}
},
"required_columns": ["timestamp", "src_ip", "dst_ip", "src_port", "dst_port"]
}
```
### POST `/compute`
Run breach prediction using **all 3 models simultaneously** on network logs.
**Request:**
```json
{
"file": [
{
"timestamp": "2024-01-01T10:00:00",
"src_ip": "192.168.1.100",
"dst_ip": "10.0.0.1",
"src_port": 12345,
"dst_port": 80,
"packet_size": 1500,
"seq": 1000,
"ack": 2000,
"tcp_flags": 2,
"window": 65535
},
{
"timestamp": "2024-01-01T10:00:01",
"src_ip": "192.168.1.101",
"dst_ip": "10.0.0.2",
"src_port": 12346,
"dst_port": 443,
"packet_size": 1500,
"seq": 1001,
"ack": 2001,
"tcp_flags": 2,
"window": 65535
}
]
}
```
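As a rough illustration, a request like the one above could be sent from Python using only the standard library. The base URL assumes the local setup on port 5000, and `build_compute_request` and `post_compute` are hypothetical helper names, not part of this project.

```python
import json
import urllib.request

# Assumption: the service runs locally on the default port.
BASE_URL = "http://localhost:5000"


def build_compute_request(records, base_url=BASE_URL):
    """Build (but do not send) a POST request for the /compute endpoint."""
    body = json.dumps({"file": records}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/compute",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_compute(records, base_url=BASE_URL, timeout=60):
    """Send the records to /compute and return the decoded JSON response."""
    request = build_compute_request(records, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# One record with the five required columns plus optional extras.
records = [
    {
        "timestamp": "2024-01-01T10:00:00",
        "src_ip": "192.168.1.100",
        "dst_ip": "10.0.0.1",
        "src_port": 12345,
        "dst_port": 80,
        "packet_size": 1500,
    }
]
request = build_compute_request(records)
# With the service running: result = post_compute(records)
```

Building the request separately from sending it makes the payload easy to inspect before any network call is made.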
**Response:**
```json
{
"success": true,
"packets": {
"total": 2,
"unique_flows": 2
},
"models": {
"lightGBM": {
"success": true,
"output": "Model execution output",
"predictions": [
{
"timestamp": "2024-01-01T10:00:00",
"src_ip": "192.168.1.100",
"breach_probability": 0.95,
"breach_predicted": 1
}
],
"error": null
},
"autoencoder": {
"success": true,
"output": "Model execution output",
"predictions": [
{
"timestamp": "2024-01-01T10:00:00",
"anomaly_score": 0.87,
"is_anomaly": true
}
],
"error": null
},
"XGB_lstm": {
"success": true,
"output": "Model execution output",
"predictions": [
{
"timestamp": "2024-01-01T10:00:00",
"breach_risk": 0.92,
"prediction": 1
}
],
"error": null
}
}
}
```
**Response Format:**
- `success`: Overall success status (all models succeeded)
- `packets.total`: Total number of packets in the request
- `packets.unique_flows`: Number of unique network flows (src_ip:src_port → dst_ip:dst_port)
- `models`: Dictionary of per-model results, keyed by model name (`lightGBM`, `autoencoder`, `XGB_lstm`)
- Each model entry includes: `success` (bool), `output` (stdout), `predictions` (array), `error` (stderr, or `null` if there was none)
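The fields above can be sketched in a few lines of Python. This is an illustration of the described semantics, not the code in `app.py`: the helper names are hypothetical, and it assumes a flow is identified by the tuple of source IP, source port, destination IP, and destination port.

```python
def count_packets(records):
    """Return the total packet count and the number of unique flows."""
    # A set keeps one entry per distinct (src_ip, src_port, dst_ip, dst_port).
    flows = {
        (r["src_ip"], r["src_port"], r["dst_ip"], r["dst_port"])
        for r in records
    }
    return {"total": len(records), "unique_flows": len(flows)}


def overall_success(models):
    """True only if every per-model result reports success."""
    return all(result.get("success") is True for result in models.values())


def failed_models(models):
    """Names of the models whose result did not report success."""
    return [name for name, result in models.items() if not result.get("success")]
```

A client could call `failed_models(response["models"])` to decide which entries need their `error` field inspected.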
## Required Input Columns
- `timestamp`: Timestamp of the network flow
- `src_ip`: Source IP address
- `dst_ip`: Destination IP address
- `src_port`: Source port
- `dst_port`: Destination port
Additional columns such as `packet_size`, `seq`, and `ack` are optional but recommended for better predictions.
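A client may want to check records for the required columns before sending them. The sketch below is one way to do that; `find_missing_columns` is a hypothetical helper, and the server's own validation may behave differently.

```python
REQUIRED_COLUMNS = ["timestamp", "src_ip", "dst_ip", "src_port", "dst_port"]


def find_missing_columns(records):
    """Map each incomplete record's index to the required columns it lacks."""
    problems = {}
    for index, record in enumerate(records):
        missing = [column for column in REQUIRED_COLUMNS if column not in record]
        if missing:
            problems[index] = missing
    return problems
```

An empty dictionary means every record carries all five required columns.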
## Installation
### Local Setup
```bash
pip install -r requirements.txt
python app.py
```
The API will be available at `http://localhost:5000`.
### Docker Setup
```bash
docker build -t amide-models .
docker run -p 5000:5000 amide-models
```
## Requirements
- Python 3.8+
- Flask
- Flask-CORS
- pandas
- numpy
- scikit-learn
- tensorflow
- lightgbm
- xgboost
See `requirements.txt` for all dependencies.
## Models
### LightGBM
Gradient boosting classifier for network breach detection. It uses a 60% (0.6) probability threshold for breach classification.
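The thresholding step can be illustrated as follows. This is a minimal sketch, not the code in `lightGBM.py`: `classify` is a hypothetical helper, and treating a probability of exactly 0.6 as a breach is an assumption.

```python
BREACH_THRESHOLD = 0.6  # the 60% threshold described above


def classify(probabilities, threshold=BREACH_THRESHOLD):
    """Turn breach probabilities into prediction records.

    Assumption: the boundary is inclusive (probability >= threshold).
    """
    return [
        {
            "breach_probability": p,
            "breach_predicted": 1 if p >= threshold else 0,
        }
        for p in probabilities
    ]
```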
### Autoencoder
Deep learning model using autoencoders to detect anomalous network patterns.
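Autoencoder-based anomaly detection commonly scores a record by how poorly the network reconstructs it. The sketch below shows that general idea in plain Python. Whether `autoencoder.py` uses mean squared error, and how it picks a threshold, are assumptions here, and both helper names are hypothetical.

```python
def reconstruction_error(original, reconstructed):
    """Mean squared error between a feature vector and its reconstruction."""
    if len(original) != len(reconstructed):
        raise ValueError("vectors must have the same length")
    squared = [(a - b) ** 2 for a, b in zip(original, reconstructed)]
    return sum(squared) / len(squared)


def flag_anomalies(errors, threshold):
    """Mark records whose reconstruction error exceeds the threshold."""
    return [error > threshold for error in errors]
```

In practice the reconstructions would come from the trained autoencoder, and the threshold would be chosen from the error distribution on normal traffic.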
### XGB-LSTM
Hybrid model combining XGBoost with LSTM for temporal sequence analysis of network flows.
## Output
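Each model returns per-record predictions. Field names differ by model, but each prediction includes:
- A score between 0 and 1 (`breach_probability`, `anomaly_score`, or `breach_risk`)
- A binary decision (`breach_predicted`, `is_anomaly`, or `prediction`)
- Confidence metrics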
## Development
To modify or add new models:
1. Create a new Python file with your model
2. Update `MODEL_CONFIGS` in `app.py` with the new model configuration
3. Ensure your model accepts input through one of the two supported interfaces:
   - Hardcoded filename (`hardcoded`): the script reads from `network_logs.csv`
   - Argparse (`argparse`): the script accepts a `--logfile` command-line argument
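The two interfaces above can be sketched as follows. The shape of `MODEL_CONFIGS` is an assumption modeled on the fields returned by `GET /models`, so the real dictionary in `app.py` may differ. `build_command` and `parse_args` are hypothetical helpers.

```python
import argparse
import sys

# Assumed shape, mirroring the "file" and "interface" fields of GET /models.
MODEL_CONFIGS = {
    "lightGBM": {"file": "lightGBM.py", "interface": "hardcoded"},
    "autoencoder": {"file": "autoencoder.py", "interface": "hardcoded"},
    "XGB_lstm": {"file": "XGB_lstm.py", "interface": "argparse"},
}


def build_command(model_name, logfile="network_logs.csv"):
    """Build the command line that would launch a model script."""
    config = MODEL_CONFIGS[model_name]
    command = [sys.executable, config["file"]]
    if config["interface"] == "argparse":
        # Argparse models receive the log path as an argument.
        command += ["--logfile", logfile]
    elif config["interface"] != "hardcoded":
        raise ValueError(f"unknown interface: {config['interface']}")
    # Hardcoded models take no arguments and read network_logs.csv themselves.
    return command


def parse_args(argv=None):
    """Argument parsing a new argparse-style model script might use."""
    parser = argparse.ArgumentParser(description="Run a breach prediction model")
    parser.add_argument("--logfile", default="network_logs.csv")
    return parser.parse_args(argv)
```

A new model script using the argparse interface would call `parse_args()` at startup and load the CSV from `args.logfile`.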
## License
MIT
## Contact
For issues and questions, please check the repository documentation.