metadata
title: AutoML MLOps Pipeline
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: docker
app_port: 8000
pinned: false
license: mit

🤖 AutoML MLOps Pipeline

Production-ready end-to-end AutoML pipeline with MLflow tracking, comprehensive monitoring, and automated orchestration.


🚀 Features

  • 🤖 AutoML: AutoGluon, FLAML, PyCaret integration
  • 📊 MLflow Tracking: DagsHub integration with comprehensive metrics
  • 🔍 Monitoring: Drift detection, prediction logging, performance tracking
  • 📈 Observability: Prometheus metrics & Grafana dashboards
  • 🔄 Orchestration: Airflow DAGs for automated scheduling
  • 🐳 Docker: Complete containerization with docker-compose
  • ⚡ FastAPI: RESTful API with 11+ endpoints
  • 🎯 CI/CD: GitHub Actions for automated testing and deployment

📋 Pipeline Stages

  1. Data Ingestion - Load and validate dataset
  2. Data Validation - Schema validation and quality checks
  3. Data Transformation - Feature engineering and preprocessing
  4. AutoML Training - Multi-framework model training
  5. Model Evaluation - Comprehensive metrics and validation
  6. Model Comparison - Best model selection
  7. Model Pusher - Production model deployment
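The seven stages above run in order, each one consuming the output of the previous one. A minimal sketch of that control flow, assuming each stage can be modelled as a function over a shared context dictionary. The names `Stage`, `make_placeholder_stage`, `STAGE_NAMES`, and `run_pipeline` are hypothetical and are not taken from `src/mlpipeline/`; the placeholder stages only record that they ran.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical stage signature: each stage takes the shared context dict
# and returns an updated copy. The real components live in src/mlpipeline/.
Stage = Callable[[Dict[str, Any]], Dict[str, Any]]


def make_placeholder_stage(name: str) -> Stage:
    """Build a stand-in stage that only records that it ran."""
    def stage(context: Dict[str, Any]) -> Dict[str, Any]:
        completed = context.get("completed", []) + [name]
        return {**context, "completed": completed}
    return stage


# Stage names mirror the numbered list above, in the same order.
STAGE_NAMES: List[str] = [
    "data_ingestion",
    "data_validation",
    "data_transformation",
    "automl_training",
    "model_evaluation",
    "model_comparison",
    "model_pusher",
]


def run_pipeline(stages: List[Tuple[str, Stage]]) -> Dict[str, Any]:
    """Run stages in order, passing each stage's output to the next."""
    context: Dict[str, Any] = {}
    for name, stage in stages:
        try:
            context = stage(context)
        except Exception as exc:
            # Stop at the first failure so later stages never see bad input.
            raise RuntimeError(f"stage '{name}' failed") from exc
    return context


if __name__ == "__main__":
    stages = [(n, make_placeholder_stage(n)) for n in STAGE_NAMES]
    print(run_pipeline(stages)["completed"])
```

Failing fast at the first broken stage is one possible design choice; it keeps a bad validation result from reaching training or deployment.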

πŸ› οΈ Tech Stack

  • ML Frameworks: AutoGluon, FLAML, PyCaret
  • API: FastAPI, Uvicorn
  • Tracking: MLflow, DagsHub
  • Monitoring: Prometheus, Grafana, Evidently AI
  • Orchestration: Apache Airflow
  • Containerization: Docker, Docker Compose
  • CI/CD: GitHub Actions

📦 Quick Start

Local Development

# Clone repository
git clone https://github.com/Abeshith/AutoML-MLOps-PipeLine.git
cd AutoML-MLOps-PipeLine

# Create virtual environment
python -m venv automlenv
source automlenv/bin/activate  # On Windows: automlenv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Set environment variables
cp .env.example .env
# Edit .env with your credentials

# Run training pipeline
python scripts/train.py

# Start API server
python scripts/serve.py --reload

Docker Deployment

# Start all services
docker-compose up -d

# Access services
# API: http://localhost:8000/docs
# Prometheus: http://localhost:9090
# Grafana: http://localhost:3000 (admin/admin)

🌐 API Endpoints

Prediction

POST /predict
{
  "age": 45,
  "sex": 1,
  "cp": 2,
  "trestbps": 130,
  "chol": 250,
  "fbs": 0,
  "restecg": 1,
  "thalach": 150,
  "exang": 0,
  "oldpeak": 2.5,
  "slope": 2,
  "ca": 0,
  "thal": 2
}
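The request above can also be sent from Python. This is a minimal sketch using only the standard library, assuming the API is reachable at http://localhost:8000 as in the Docker setup. `BASE_URL`, `SAMPLE_PATIENT`, `build_predict_request`, and `predict` are illustrative names, not part of the project, and the response schema is not documented here, so the decoded JSON is returned unchanged.

```python
import json
import urllib.request

# Assumed base URL, matching the API port used elsewhere in this README.
BASE_URL = "http://localhost:8000"

# Same payload as the example request above.
SAMPLE_PATIENT = {
    "age": 45, "sex": 1, "cp": 2, "trestbps": 130, "chol": 250,
    "fbs": 0, "restecg": 1, "thalach": 150, "exang": 0,
    "oldpeak": 2.5, "slope": 2, "ca": 0, "thal": 2,
}


def build_predict_request(features: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a JSON POST request for /predict without sending it."""
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(features: dict, base_url: str = BASE_URL, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON response body as-is."""
    request = build_predict_request(features, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `predict(SAMPLE_PATIENT)` returns whatever JSON the endpoint responds with.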

Training

POST /train
GET /train/status

Monitoring

GET /monitoring/metrics          # Prometheus metrics
GET /monitoring/health/drift     # Drift detection status
GET /monitoring/performance/summary
GET /monitoring/reports/daily

📊 Model Performance

  • Validation Accuracy: 88.84%
  • Test Accuracy: 88.68%
  • ROC-AUC: 95.48%
  • Best Model: WeightedEnsemble_L3

🔧 Utility Scripts

# Train model
python scripts/train.py

# Evaluate model
python scripts/evaluate.py --model-path <path>

# Start API server
python scripts/serve.py --host 0.0.0.0 --port 8000 --reload

# Initialize the Airflow database
python scripts/init_db.py

🔄 Airflow Orchestration

# Set AIRFLOW_HOME
export AIRFLOW_HOME=$(pwd)/airflow

# Initialize database
python scripts/init_db.py

# Start services
airflow scheduler  # Terminal 1
airflow webserver  # Terminal 2

# Access UI: http://localhost:8080

📈 Monitoring Stack

  • Drift Detection: KS test for numerical features
  • Prediction Logging: JSONL format with threading
  • Performance Tracking: Batch-level metrics
  • Report Generation: Daily/weekly JSON reports
  • Prometheus Metrics: Request count, latency, accuracy, drift status
  • Grafana Dashboards: 5-panel visualization
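To illustrate the KS-based drift check listed above, here is a minimal pure-Python sketch of the two-sample Kolmogorov-Smirnov test for a single numerical feature. It is not the project's implementation: `ks_statistic` and `drift_detected` are hypothetical names, and the decision rule uses the standard large-sample critical value rather than an exact p-value.

```python
import math
from typing import Sequence


def ks_statistic(reference: Sequence[float], current: Sequence[float]) -> float:
    """Two-sample KS statistic: the largest gap between the empirical CDFs."""
    ref, cur = sorted(reference), sorted(current)
    n, m = len(ref), len(cur)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")
    i = j = 0
    d_max = 0.0
    while i < n and j < m:
        x = min(ref[i], cur[j])
        # Advance past every value equal to x in both samples (handles ties).
        while i < n and ref[i] == x:
            i += 1
        while j < m and cur[j] == x:
            j += 1
        d_max = max(d_max, abs(i / n - j / m))
    return d_max


def drift_detected(reference: Sequence[float], current: Sequence[float],
                   alpha: float = 0.05) -> bool:
    """Flag drift when the statistic exceeds the asymptotic critical value.

    Critical value: c(alpha) * sqrt((n + m) / (n * m)),
    with c(alpha) = sqrt(-0.5 * ln(alpha / 2)).
    """
    n, m = len(reference), len(current)
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    critical = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, current) > critical


if __name__ == "__main__":
    baseline = [float(x) for x in range(100)]
    shifted = [x + 50.0 for x in baseline]
    print(drift_detected(baseline, baseline), drift_detected(baseline, shifted))
```

The asymptotic threshold is only an approximation for small samples; a library routine such as SciPy's `ks_2samp` would be the more usual choice when SciPy is available.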

🐳 Docker Services

  • FastAPI App (8000): Main ML API
  • Prometheus (9090): Metrics collection
  • Grafana (3000): Visualization dashboards

πŸ” Environment Variables

MLFLOW_TRACKING_URI=your_dagshub_uri
DAGSHUB_TOKEN=your_token
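A small sketch of checking these two variables at startup, so that a missing credential fails early instead of partway through a training run. It assumes the variables are already present in the process environment (for example, exported by the shell or loaded from `.env` by some other means); `REQUIRED_VARS` and `load_tracking_settings` are hypothetical names.

```python
import os
from typing import Dict, Mapping

# The two variables listed above.
REQUIRED_VARS = ("MLFLOW_TRACKING_URI", "DAGSHUB_TOKEN")


def load_tracking_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required settings, or raise if any is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


if __name__ == "__main__":
    # Dummy values for demonstration only; never hard-code real tokens.
    demo_env = {
        "MLFLOW_TRACKING_URI": "https://example.invalid/mlflow",
        "DAGSHUB_TOKEN": "dummy-token",
    }
    print(sorted(load_tracking_settings(demo_env)))
```

Passing the environment as a parameter keeps the check easy to test without touching the real process environment.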

📚 Documentation

🧪 CI/CD Pipeline

Automated Workflows

  • CI: Lint with flake8, format check with black
  • Docker Build: Build and push to GitHub Container Registry
  • HuggingFace Deploy: Auto-deploy to Spaces on push

Container Images

docker pull ghcr.io/abeshith/automl-mlops-pipeline:latest

📊 Project Structure

AutoML-MLOps-PipeLine/
├── src/mlpipeline/          # Core pipeline components
├── app/                      # FastAPI application
├── config/                   # Configuration files
├── scripts/                  # Utility scripts
├── airflow/                  # Airflow DAGs
├── monitoring/               # Monitoring components
├── observability/            # Prometheus/Grafana configs
├── notebooks/                # Jupyter notebooks
├── Dockerfile                # Container definition
├── docker-compose.yaml       # Multi-service orchestration
└── requirements.txt          # Python dependencies

⭐ Star this repo if you find it helpful!