---
title: AutoML MLOps Pipeline
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: docker
app_port: 8000
pinned: false
license: mit
---
# AutoML MLOps Pipeline
Production-ready end-to-end AutoML pipeline with MLflow tracking, comprehensive monitoring, and automated orchestration.
[CI workflow](https://github.com/Abeshith/AutoML-MLOps-PipeLine/actions/workflows/ci.yaml) | [Docker build workflow](https://github.com/Abeshith/AutoML-MLOps-PipeLine/actions/workflows/docker-build.yaml)
## Features
- **AutoML**: AutoGluon, FLAML, PyCaret integration
- **MLflow Tracking**: DagsHub integration with comprehensive metrics
- **Monitoring**: Drift detection, prediction logging, performance tracking
- **Observability**: Prometheus metrics & Grafana dashboards
- **Orchestration**: Airflow DAGs for automated scheduling
- **Docker**: Complete containerization with docker-compose
- **FastAPI**: RESTful API with 11+ endpoints
- **CI/CD**: GitHub Actions for automated testing and deployment
## Pipeline Stages
1. **Data Ingestion** - Load and validate dataset
2. **Data Validation** - Schema validation and quality checks
3. **Data Transformation** - Feature engineering and preprocessing
4. **AutoML Training** - Multi-framework model training
5. **Model Evaluation** - Comprehensive metrics and validation
6. **Model Comparison** - Best model selection
7. **Model Pusher** - Production model deployment
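
The stages above run in sequence, each one consuming what the previous stage produced. A minimal sketch of that flow is shown below; `run_pipeline`, `make_placeholder`, and the stage names are hypothetical and only illustrate the ordering, not the actual code in `src/mlpipeline/`.

```python
from typing import Any, Callable, Dict, List, Tuple

Context = Dict[str, Any]
Stage = Tuple[str, Callable[[Context], Context]]

# Hypothetical identifiers mirroring the seven stages listed above.
STAGE_NAMES = [
    "data_ingestion",
    "data_validation",
    "data_transformation",
    "automl_training",
    "model_evaluation",
    "model_comparison",
    "model_pusher",
]


def run_pipeline(stages: List[Stage], context: Context) -> Context:
    """Run stages in order; each stage receives and returns the shared context."""
    for name, stage in stages:
        context = stage(context)
        # Record the stage only after it finishes, so a failure is easy to locate.
        context.setdefault("completed", []).append(name)
    return context


def make_placeholder(name: str) -> Callable[[Context], Context]:
    """Build a stand-in stage that only marks itself as done."""
    def stage(context: Context) -> Context:
        context[name] = "done"
        return context
    return stage


stages = [(name, make_placeholder(name)) for name in STAGE_NAMES]
result = run_pipeline(stages, {})
print(result["completed"])
```

If a stage raises, the exception propagates and later stages never run, which is usually the desired behavior for a training pipeline.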
## Tech Stack
- **ML Frameworks**: AutoGluon, FLAML, PyCaret
- **API**: FastAPI, Uvicorn
- **Tracking**: MLflow, DagsHub
- **Monitoring**: Prometheus, Grafana, Evidently AI
- **Orchestration**: Apache Airflow
- **Containerization**: Docker, Docker Compose
- **CI/CD**: GitHub Actions
## Quick Start
### Local Development
```bash
# Clone repository
git clone https://github.com/Abeshith/AutoML-MLOps-PipeLine.git
cd AutoML-MLOps-PipeLine
# Create virtual environment
python -m venv automlenv
source automlenv/bin/activate # On Windows: automlenv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Set environment variables
cp .env.example .env
# Edit .env with your credentials
# Run training pipeline
python scripts/train.py
# Start API server
python scripts/serve.py --reload
```
### Docker Deployment
```bash
# Start all services
docker-compose up -d
# Access services
# API: http://localhost:8000/docs
# Prometheus: http://localhost:9090
# Grafana: http://localhost:3000 (admin/admin)
```
## API Endpoints
### Prediction
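```bash
# POST /predict with a JSON body of patient features
curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{
    "age": 45,
    "sex": 1,
    "cp": 2,
    "trestbps": 130,
    "chol": 250,
    "fbs": 0,
    "restecg": 1,
    "thalach": 150,
    "exang": 0,
    "oldpeak": 2.5,
    "slope": 2,
    "ca": 0,
    "thal": 2
  }'
```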
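
The same request can be sent from Python. The sketch below uses only the standard library and assumes the API is running locally on port 8000; `build_predict_request` and `predict` are hypothetical helper names, and the shape of the response is not documented here, so it is returned as parsed JSON without further interpretation.

```python
import json
import urllib.request
from typing import Any, Dict

# Field names taken from the example request body above.
FEATURE_NAMES = [
    "age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
    "thalach", "exang", "oldpeak", "slope", "ca", "thal",
]


def build_predict_request(
    features: Dict[str, Any], base_url: str = "http://localhost:8000"
) -> urllib.request.Request:
    """Validate the feature dict and wrap it in a POST request to /predict."""
    missing = [name for name in FEATURE_NAMES if name not in features]
    if missing:
        raise ValueError(f"missing features: {missing}")
    body = json.dumps({name: features[name] for name in FEATURE_NAMES})
    return urllib.request.Request(
        f"{base_url}/predict",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(features: Dict[str, Any], timeout: float = 10.0) -> Any:
    """Send the request; requires the API server to be running."""
    request = build_predict_request(features)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


sample = {
    "age": 45, "sex": 1, "cp": 2, "trestbps": 130, "chol": 250, "fbs": 0,
    "restecg": 1, "thalach": 150, "exang": 0, "oldpeak": 2.5, "slope": 2,
    "ca": 0, "thal": 2,
}
request = build_predict_request(sample)  # built locally, no network call yet
```

Call `predict(sample)` once the server is up. Building the request separately makes the payload easy to inspect or unit-test without a live API.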
### Training
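```bash
# POST /train: start a training run
curl -X POST http://localhost:8000/train
# GET /train/status: check training status
curl http://localhost:8000/train/status
```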
### Monitoring
```bash
curl http://localhost:8000/monitoring/metrics        # Prometheus metrics
curl http://localhost:8000/monitoring/health/drift   # Drift detection status
curl http://localhost:8000/monitoring/performance/summary
curl http://localhost:8000/monitoring/reports/daily
```
## Model Performance
- **Validation Accuracy**: 88.84%
- **Test Accuracy**: 88.68%
- **ROC-AUC**: 95.48%
- **Best Model**: WeightedEnsemble_L3
## Utility Scripts
```bash
# Train model
python scripts/train.py
# Evaluate model
python scripts/evaluate.py --model-path <path>
# Start API server
python scripts/serve.py --host 0.0.0.0 --port 8000 --reload
# Initialize Airflow
python scripts/init_db.py
```
## Airflow Orchestration
```bash
# Set AIRFLOW_HOME
export AIRFLOW_HOME=$(pwd)/airflow
# Initialize database
python scripts/init_db.py
# Start services
airflow scheduler # Terminal 1
airflow webserver # Terminal 2
# Access UI: http://localhost:8080
```
## Monitoring Stack
- **Drift Detection**: KS test for numerical features
- **Prediction Logging**: JSONL format with threading
- **Performance Tracking**: Batch-level metrics
- **Report Generation**: Daily/weekly JSON reports
- **Prometheus Metrics**: Request count, latency, accuracy, drift status
- **Grafana Dashboards**: 5-panel visualization
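
To illustrate the KS-based drift check listed above, here is a minimal pure-Python sketch of a two-sample Kolmogorov-Smirnov test for one numerical feature. It is not the project's implementation (which may rely on a library routine such as `scipy.stats.ks_2samp`); the function names, the 0.05 significance level, and the use of the asymptotic critical value are all assumptions.

```python
import math
from bisect import bisect_right
from typing import Sequence


def ks_statistic(reference: Sequence[float], current: Sequence[float]) -> float:
    """Largest gap between the empirical CDFs of the two samples."""
    ref, cur = sorted(reference), sorted(current)
    if not ref or not cur:
        raise ValueError("both samples must be non-empty")
    n, m = len(ref), len(cur)
    d = 0.0
    # The ECDF difference only changes at observed values, so check those.
    for x in set(ref) | set(cur):
        f_ref = bisect_right(ref, x) / n
        f_cur = bisect_right(cur, x) / m
        d = max(d, abs(f_ref - f_cur))
    return d


def ks_critical_value(n: int, m: int, alpha: float = 0.05) -> float:
    """Asymptotic critical value c(alpha) * sqrt((n + m) / (n * m))."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    return c_alpha * math.sqrt((n + m) / (n * m))


def has_drift(
    reference: Sequence[float], current: Sequence[float], alpha: float = 0.05
) -> bool:
    """Flag drift when the KS statistic exceeds the critical value."""
    d = ks_statistic(reference, current)
    return d > ks_critical_value(len(reference), len(current), alpha)


# Illustrative data only: a reference window and a copy shifted by 50.
reference = [float(x) for x in range(100)]
shifted = [x + 50.0 for x in reference]
print(has_drift(reference, reference), has_drift(reference, shifted))
```

The asymptotic critical value is a reasonable approximation for windows of a few dozen samples or more; for very small batches an exact p-value is the safer choice.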
## Docker Services
- **FastAPI App** (8000): Main ML API
- **Prometheus** (9090): Metrics collection
- **Grafana** (3000): Visualization dashboards
## Environment Variables
```env
MLFLOW_TRACKING_URI=your_dagshub_uri
DAGSHUB_TOKEN=your_token
```
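
A quick startup check can catch missing credentials before a training run begins. The sketch below only verifies that the two variables above are set; `load_tracking_settings` is a hypothetical helper, and how the project actually passes these values to MLflow or DagsHub is not shown here.

```python
import os
from typing import Dict, Mapping

# Variable names taken from the .env example above.
REQUIRED_VARS = ("MLFLOW_TRACKING_URI", "DAGSHUB_TOKEN")


def load_tracking_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required settings, or fail with a list of what is missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


# Placeholder values for illustration; real values come from your .env file.
example_env = {
    "MLFLOW_TRACKING_URI": "your_dagshub_uri",
    "DAGSHUB_TOKEN": "your_token",
}
settings = load_tracking_settings(example_env)
print(sorted(settings))
```

Calling `load_tracking_settings()` with no argument reads from the real process environment.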
## Documentation
- [Docker Setup](DOCKER.md)
- [Scripts Usage](scripts/README.md)
- [CI/CD Workflows](.github/workflows/README.md)
- [Airflow Guide](airflow/README.md)
## CI/CD Pipeline
### Automated Workflows
- **CI**: Lint with flake8, format check with black
- **Docker Build**: Build and push to GitHub Container Registry
- **HuggingFace Deploy**: Auto-deploy to Spaces on push
### Container Images
```bash
docker pull ghcr.io/abeshith/automl-mlops-pipeline:latest
```
## Project Structure
```
AutoML-MLOps-PipeLine/
├── src/mlpipeline/       # Core pipeline components
├── app/                  # FastAPI application
├── config/               # Configuration files
├── scripts/              # Utility scripts
├── airflow/              # Airflow DAGs
├── monitoring/           # Monitoring components
├── observability/        # Prometheus/Grafana configs
├── notebooks/            # Jupyter notebooks
├── Dockerfile            # Container definition
├── docker-compose.yaml   # Multi-service orchestration
└── requirements.txt      # Python dependencies
```
Star this repo if you find it helpful!