---
title: Amide Models
emoji: 🔬
colorFrom: blue
colorTo: indigo
sdk: docker
app_file: app.py
pinned: false
---

# Amide Models - Breach Prediction API

A Flask-based API for network breach prediction using multiple machine learning models.

## Overview

This project provides a REST API for predicting network security breaches using three different machine learning models:

- **LightGBM**: Gradient boosting model for breach classification
- **Autoencoder**: Deep learning model for anomaly detection
- **XGB-LSTM**: Hybrid XGBoost and LSTM model for sequence prediction

## Features

- REST API endpoints for model inference
- Support for multiple ML models
- Input validation and error handling
- Health check endpoint
- Model information endpoint
- Docker containerized deployment

## API Endpoints

### GET `/health`
Health check endpoint to verify the service is running.

**Response:**
```json
{
  "status": "healthy"
}
```

### GET `/models`
Returns available models and their configuration.

**Response:**
```json
{
  "available_models": {
    "lightGBM": {
      "file": "lightGBM.py",
      "available": true,
      "interface": "hardcoded"
    },
    "autoencoder": {
      "file": "autoencoder.py",
      "available": true,
      "interface": "hardcoded"
    },
    "XGB_lstm": {
      "file": "XGB_lstm.py",
      "available": true,
      "interface": "argparse"
    }
  },
  "required_columns": ["timestamp", "src_ip", "dst_ip", "src_port", "dst_port"]
}
```

### POST `/compute`
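Runs breach prediction on the submitted network logs using **all 3 models simultaneously**. The request body carries the logs as a JSON array of packet records under the `file` key.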

**Request:**
```json
{
  "file": [
    {
      "timestamp": "2024-01-01T10:00:00",
      "src_ip": "192.168.1.100",
      "dst_ip": "10.0.0.1",
      "src_port": 12345,
      "dst_port": 80,
      "packet_size": 1500,
      "seq": 1000,
      "ack": 2000,
      "tcp_flags": 2,
      "window": 65535
    },
    {
      "timestamp": "2024-01-01T10:00:01",
      "src_ip": "192.168.1.101",
      "dst_ip": "10.0.0.2",
      "src_port": 12346,
      "dst_port": 443,
      "packet_size": 1500,
      "seq": 1001,
      "ack": 2001,
      "tcp_flags": 2,
      "window": 65535
    }
  ]
}
```
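A request like the one above could be sent from Python as in the following minimal sketch. It uses only the standard library; the base URL assumes the local default from this README, and the helper names `build_compute_request` and `run_compute` are illustrative, not part of the project.

```python
import json
import urllib.request

# Assumption: the service runs locally on the port documented in this README.
BASE_URL = "http://localhost:5000"


def build_compute_request(rows, base_url=BASE_URL):
    """Build (but do not send) a POST /compute request for a list of packet dicts."""
    body = json.dumps({"file": rows}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/compute",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_compute(rows, base_url=BASE_URL, timeout=60):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_compute_request(rows, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (requires a running server):
# result = run_compute([{"timestamp": "2024-01-01T10:00:00",
#                        "src_ip": "192.168.1.100", "dst_ip": "10.0.0.1",
#                        "src_port": 12345, "dst_port": 80}])
```

Building the request separately from sending it makes the payload easy to inspect before any model is run.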

**Response:**
```json
{
  "success": true,
  "packets": {
    "total": 2,
    "unique_flows": 2
  },
  "models": {
    "lightGBM": {
      "success": true,
      "output": "Model execution output",
      "predictions": [
        {
          "timestamp": "2024-01-01T10:00:00",
          "src_ip": "192.168.1.100",
          "breach_probability": 0.95,
          "breach_predicted": 1
        }
      ],
      "error": null
    },
    "autoencoder": {
      "success": true,
      "output": "Model execution output",
      "predictions": [
        {
          "timestamp": "2024-01-01T10:00:00",
          "anomaly_score": 0.87,
          "is_anomaly": true
        }
      ],
      "error": null
    },
    "XGB_lstm": {
      "success": true,
      "output": "Model execution output",
      "predictions": [
        {
          "timestamp": "2024-01-01T10:00:00",
          "breach_risk": 0.92,
          "prediction": 1
        }
      ],
      "error": null
    }
  }
}
```

**Response Format:**
- `success`: Overall status; `true` only if all models succeeded
- `packets.total`: Total number of packets in the request
- `packets.unique_flows`: Number of unique network flows (`src_ip:src_port` → `dst_ip:dst_port`)
- `models`: Results from each model, keyed by model name
  - Each entry includes: `success` (bool), `output` (stdout), `predictions` (array), `error` (stderr, or `null` on success)
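To make the top-level fields concrete, here is a rough sketch of how they could be derived from the request rows and the per-model results. It assumes flows are directional and identified by the four-tuple described above; the function names are hypothetical and the actual logic in `app.py` may differ.

```python
def count_unique_flows(rows):
    """Count distinct (src_ip, src_port, dst_ip, dst_port) tuples."""
    flows = {
        (row["src_ip"], row["src_port"], row["dst_ip"], row["dst_port"])
        for row in rows
    }
    return len(flows)


def summarize(rows, model_results):
    """Assemble the top-level fields of a /compute-style response."""
    return {
        # Overall success requires every model to have succeeded.
        "success": all(result["success"] for result in model_results.values()),
        "packets": {
            "total": len(rows),
            "unique_flows": count_unique_flows(rows),
        },
        "models": model_results,
    }
```

With the two packets from the example request, this yields `total` 2 and `unique_flows` 2, since the packets differ in source and destination.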

## Required Input Columns

- `timestamp`: Timestamp of the network flow
- `src_ip`: Source IP address
- `dst_ip`: Destination IP address
- `src_port`: Source port
- `dst_port`: Destination port

Additional columns such as `packet_size`, `seq`, `ack`, `tcp_flags`, and `window` (all shown in the `/compute` example) are recommended for better predictions.
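Clients may want to check the required columns before sending a request. The sketch below is one way to do that on the client side; `find_missing_columns` is a hypothetical helper, and the server's own validation may behave differently.

```python
# Column names as listed in this README and in the /models response.
REQUIRED_COLUMNS = ["timestamp", "src_ip", "dst_ip", "src_port", "dst_port"]


def find_missing_columns(rows, required=REQUIRED_COLUMNS):
    """Return {row_index: [missing column names]} for rows lacking required keys."""
    problems = {}
    for index, row in enumerate(rows):
        missing = [name for name in required if name not in row]
        if missing:
            problems[index] = missing
    return problems
```

An empty result means every row carries all five required columns.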

## Installation

### Local Setup

```bash
pip install -r requirements.txt
python app.py
```

The API will be available at `http://localhost:5000`.

### Docker Setup

```bash
docker build -t amide-models .
docker run -p 5000:5000 amide-models
```

## Requirements

- Python 3.8+
- Flask
- Flask-CORS
- pandas
- numpy
- scikit-learn
- tensorflow
- lightgbm
- xgboost

See `requirements.txt` for all dependencies.

## Models

### LightGBM
Gradient boosting classifier optimized for network breach detection. It uses a 60% probability threshold (0.6) for breach classification.
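As an illustration of the threshold, the following sketch turns probabilities into binary labels. Whether the model treats a probability of exactly 0.6 as a breach is not stated here, so the inclusive comparison is an assumption, and `classify` is a hypothetical name.

```python
BREACH_THRESHOLD = 0.6  # the 60% threshold described above


def classify(probabilities, threshold=BREACH_THRESHOLD):
    """Map breach probabilities to 0/1 labels.

    Assumption: the comparison is inclusive (>=); the model may use a strict one.
    """
    return [1 if probability >= threshold else 0 for probability in probabilities]
```

For example, a probability of 0.95 maps to 1 and a probability of 0.3 maps to 0.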

### Autoencoder
Deep learning model using autoencoders to detect anomalous network patterns.

### XGB-LSTM
Hybrid model combining XGBoost with LSTM for temporal sequence analysis of network flows.

## Output

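All models generate predictions with:
- A score between 0 and 1 (`breach_probability` for LightGBM, `anomaly_score` for the autoencoder, and `breach_risk` for XGB-LSTM in the example response)
- A binary decision (`breach_predicted`, `is_anomaly`, and `prediction`, respectively)
- Confidence metrics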
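Because the example `/compute` response uses different field names per model, a client may find it convenient to flatten the predictions into one shape. The sketch below does this using the field names from that example; `normalize_predictions` and the tuple layout are illustrative choices, not part of the API.

```python
# (score field, decision field) per model, taken from the example /compute response.
SCORE_AND_LABEL_FIELDS = {
    "lightGBM": ("breach_probability", "breach_predicted"),
    "autoencoder": ("anomaly_score", "is_anomaly"),
    "XGB_lstm": ("breach_risk", "prediction"),
}


def normalize_predictions(response):
    """Flatten per-model predictions into (model, timestamp, score, flagged) tuples."""
    flattened = []
    for model_name, result in response["models"].items():
        # Skip failed models and models this sketch does not know about.
        if not result.get("success") or model_name not in SCORE_AND_LABEL_FIELDS:
            continue
        score_field, label_field = SCORE_AND_LABEL_FIELDS[model_name]
        for prediction in result.get("predictions") or []:
            flattened.append((
                model_name,
                prediction.get("timestamp"),
                float(prediction[score_field]),
                bool(prediction[label_field]),
            ))
    return flattened
```

Converting the decision field with `bool` covers both the 0/1 integers and the `true`/`false` values seen in the example.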

## Development

To modify or add new models:

1. Create a new Python file with your model
2. Update `MODEL_CONFIGS` in `app.py` with the new model configuration
3. Ensure your model accepts input through one of the two supported interfaces:
   - `hardcoded`: reads from a fixed filename, `network_logs.csv`
   - `argparse`: accepts a `--logfile` command-line argument
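The two input interfaces could be handled along the lines of the sketch below. The shape of `MODEL_CONFIGS` here is an assumption that mirrors the `file` and `interface` fields of the `/models` response; the real structure in `app.py` may differ, and `build_command` and `my_model` are hypothetical names.

```python
import sys

# Hypothetical shape, mirroring the fields shown in the /models response.
MODEL_CONFIGS = {
    "lightGBM": {"file": "lightGBM.py", "interface": "hardcoded"},
    "autoencoder": {"file": "autoencoder.py", "interface": "hardcoded"},
    "XGB_lstm": {"file": "XGB_lstm.py", "interface": "argparse"},
}

# Registering a new model (hypothetical entry).
MODEL_CONFIGS["my_model"] = {"file": "my_model.py", "interface": "argparse"}


def build_command(model_name, logfile):
    """Return the argv list for running a model script on a log file."""
    config = MODEL_CONFIGS[model_name]
    command = [sys.executable, config["file"]]
    if config["interface"] == "argparse":
        # The script takes the input path on the command line.
        command += ["--logfile", logfile]
    elif config["interface"] != "hardcoded":
        raise ValueError(f"Unknown interface: {config['interface']}")
    # For "hardcoded" scripts the caller must first place the data at
    # network_logs.csv, since the script takes no path argument.
    return command
```

The resulting list can be passed to `subprocess.run`, whose captured stdout and stderr would correspond to the `output` and `error` fields of the response.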

## License

MIT

## Contact

For issues and questions, please check the repository documentation.