Abeshith committed on
Commit
f413108
·
1 Parent(s): edcfcaa

Readme Updated

Files changed (2)
  1. HF_SPACE_README.md +0 -43
  2. README.md +244 -0
HF_SPACE_README.md DELETED
@@ -1,43 +0,0 @@
- ---
- title: AutoML MLOps Pipeline
- emoji: 🤖
- colorFrom: blue
- colorTo: green
- sdk: docker
- app_port: 8000
- pinned: false
- license: mit
- ---
-
- # AutoML MLOps Pipeline
-
- Production-ready AutoML pipeline with MLflow tracking, monitoring, and orchestration.
-
- ## Features
-
- - 🤖 **AutoML**: AutoGluon, FLAML, PyCaret
- - 📊 **MLflow Tracking**: DagsHub integration
- - 🔍 **Monitoring**: Drift detection, performance tracking
- - 📈 **Observability**: Prometheus & Grafana
- - 🔄 **Orchestration**: Airflow scheduling
- - 🐳 **Docker**: Containerized deployment
-
- ## API Endpoints
-
- Once deployed, access:
- - **API Documentation**: `/docs`
- - **Health Check**: `/health`
- - **Predictions**: `/predict`
- - **Monitoring**: `/monitoring/metrics`
-
- ## Quick Test
-
- ```bash
- curl -X POST https://abeshith-automl-mlops-pipeline.hf.space/predict \
- -H "Content-Type: application/json" \
- -d '{"age": 45, "sex": 1, "cp": 2, "trestbps": 130, "chol": 250, "fbs": 0, "restecg": 1, "thalach": 150, "exang": 0, "oldpeak": 2.5, "slope": 2, "ca": 0, "thal": 2}'
- ```
-
- ## Repository
-
- Full source code: [GitHub](https://github.com/Abeshith/AutoML-MLOps-PipeLine)
README.md CHANGED
@@ -0,0 +1,244 @@
+ ---
+ title: AutoML MLOps Pipeline
+ emoji: 🤖
+ colorFrom: blue
+ colorTo: green
+ sdk: docker
+ app_port: 8000
+ pinned: false
+ license: mit
+ ---
+
+ # 🤖 AutoML MLOps Pipeline
+
+ Production-ready end-to-end AutoML pipeline with MLflow tracking, comprehensive monitoring, and automated orchestration.
+
+ [![CI Pipeline](https://github.com/Abeshith/AutoML-MLOps-PipeLine/actions/workflows/ci.yaml/badge.svg)](https://github.com/Abeshith/AutoML-MLOps-PipeLine/actions/workflows/ci.yaml)
+ [![Docker Build](https://github.com/Abeshith/AutoML-MLOps-PipeLine/actions/workflows/docker-build.yaml/badge.svg)](https://github.com/Abeshith/AutoML-MLOps-PipeLine/actions/workflows/docker-build.yaml)
+
+ ## 🚀 Features
+
+ - **🤖 AutoML**: AutoGluon, FLAML, PyCaret integration
+ - **📊 MLflow Tracking**: DagsHub integration with comprehensive metrics
+ - **🔍 Monitoring**: Drift detection, prediction logging, performance tracking
+ - **📈 Observability**: Prometheus metrics & Grafana dashboards
+ - **🔄 Orchestration**: Airflow DAGs for automated scheduling
+ - **🐳 Docker**: Complete containerization with docker-compose
+ - **⚡ FastAPI**: RESTful API with 11+ endpoints
+ - **🎯 CI/CD**: GitHub Actions for automated testing and deployment
+
+ ## 📋 Pipeline Stages
+
+ 1. **Data Ingestion** - Load and validate dataset
+ 2. **Data Validation** - Schema validation and quality checks
+ 3. **Data Transformation** - Feature engineering and preprocessing
+ 4. **AutoML Training** - Multi-framework model training
+ 5. **Model Evaluation** - Comprehensive metrics and validation
+ 6. **Model Comparison** - Best model selection
+ 7. **Model Pusher** - Production model deployment
+
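Editor's sketch: the seven stages listed above run in order, each one building on the output of the previous stage. The snippet below illustrates that sequencing only. The `Stage` class, `run_pipeline`, and the `passthrough` placeholder are hypothetical names chosen for this example and are not taken from the repository.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Stage:
    """Hypothetical wrapper: a named step that transforms a context dict."""
    name: str
    run: Callable[[Dict[str, Any]], Dict[str, Any]]


def run_pipeline(stages: List[Stage], context: Dict[str, Any]) -> Dict[str, Any]:
    """Run the stages in order and record which ones completed."""
    completed: List[str] = []
    for stage in stages:
        context = stage.run(context)  # each stage receives the previous output
        completed.append(stage.name)
    return {**context, "completed": completed}


# Stage names copied from the README list; the behavior is a placeholder.
STAGE_NAMES = [
    "Data Ingestion",
    "Data Validation",
    "Data Transformation",
    "AutoML Training",
    "Model Evaluation",
    "Model Comparison",
    "Model Pusher",
]


def passthrough(context: Dict[str, Any]) -> Dict[str, Any]:
    """Placeholder stage body that returns its input unchanged."""
    return context


PIPELINE = [Stage(name, passthrough) for name in STAGE_NAMES]
```

Calling `run_pipeline(PIPELINE, {"source": "example"})` returns the context with a `completed` list naming all seven stages in order. A real stage would replace `passthrough` with its own logic.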
40
+ ## πŸ› οΈ Tech Stack
41
+
42
+ - **ML Frameworks**: AutoGluon, FLAML, PyCaret
43
+ - **API**: FastAPI, Uvicorn
44
+ - **Tracking**: MLflow, DagsHub
45
+ - **Monitoring**: Prometheus, Grafana, Evidently AI
46
+ - **Orchestration**: Apache Airflow
47
+ - **Containerization**: Docker, Docker Compose
48
+ - **CI/CD**: GitHub Actions
49
+
50
+ ## πŸ“¦ Quick Start
51
+
52
+ ### Local Development
53
+
54
+ ```bash
55
+ # Clone repository
56
+ git clone https://github.com/Abeshith/AutoML-MLOps-PipeLine.git
57
+ cd AutoML-MLOps-PipeLine
58
+
59
+ # Create virtual environment
60
+ python -m venv automlenv
61
+ source automlenv/bin/activate # On Windows: automlenv\Scripts\activate
62
+
63
+ # Install dependencies
64
+ pip install -r requirements.txt
65
+
66
+ # Set environment variables
67
+ cp .env.example .env
68
+ # Edit .env with your credentials
69
+
70
+ # Run training pipeline
71
+ python scripts/train.py
72
+
73
+ # Start API server
74
+ python scripts/serve.py --reload
75
+ ```
76
+
77
+ ### Docker Deployment
78
+
79
+ ```bash
80
+ # Start all services
81
+ docker-compose up -d
82
+
83
+ # Access services
84
+ # API: http://localhost:8000/docs
85
+ # Prometheus: http://localhost:9090
86
+ # Grafana: http://localhost:3000 (admin/admin)
87
+ ```
88
+
89
+ ## 🌐 API Endpoints
90
+
91
+ ### Prediction
92
+ ```bash
93
+ POST /predict
94
+ {
95
+ "age": 45,
96
+ "sex": 1,
97
+ "cp": 2,
98
+ "trestbps": 130,
99
+ "chol": 250,
100
+ "fbs": 0,
101
+ "restecg": 1,
102
+ "thalach": 150,
103
+ "exang": 0,
104
+ "oldpeak": 2.5,
105
+ "slope": 2,
106
+ "ca": 0,
107
+ "thal": 2
108
+ }
109
+ ```
110
+
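Editor's sketch: the prediction request above can be built from Python using only the standard library. The 13 feature names and the example values come from the README. The base URL assumes the local Docker port 8000 listed under Docker Deployment. The helper names `build_predict_request` and `send` are hypothetical, and the response schema is not documented in this diff, so the sketch returns the decoded JSON without interpreting it.

```python
import json
import urllib.request
from typing import Any, Dict

# Assumption: API reachable on the local Docker port from the README.
BASE_URL = "http://localhost:8000"

# Feature names as they appear in the README request body.
FEATURES = (
    "age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
    "thalach", "exang", "oldpeak", "slope", "ca", "thal",
)

EXAMPLE = {
    "age": 45, "sex": 1, "cp": 2, "trestbps": 130, "chol": 250,
    "fbs": 0, "restecg": 1, "thalach": 150, "exang": 0,
    "oldpeak": 2.5, "slope": 2, "ca": 0, "thal": 2,
}


def build_predict_request(record: Dict[str, Any],
                          base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST /predict request, rejecting records with missing features."""
    missing = [name for name in FEATURES if name not in record]
    if missing:
        raise ValueError(f"missing features: {missing}")
    body = json.dumps({name: record[name] for name in FEATURES}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> Any:
    """Send the request and decode the JSON response (schema not assumed)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())
```

With the API running, `send(build_predict_request(EXAMPLE))` issues the call. Checking for missing fields on the client side means a malformed record fails before any network traffic.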
+ ### Training
+ ```bash
+ POST /train
+ GET /train/status
+ ```
+
+ ### Monitoring
+ ```bash
+ GET /monitoring/metrics # Prometheus metrics
+ GET /monitoring/health/drift # Drift detection status
+ GET /monitoring/performance/summary
+ GET /monitoring/reports/daily
+ ```
+
+ ## 📊 Model Performance
+
+ - **Validation Accuracy**: 88.84%
+ - **Test Accuracy**: 88.68%
+ - **ROC-AUC**: 95.48%
+ - **Best Model**: WeightedEnsemble_L3
+
+ ## 🔧 Utility Scripts
+
+ ```bash
+ # Train model
+ python scripts/train.py
+
+ # Evaluate model
+ python scripts/evaluate.py --model-path <path>
+
+ # Start API server
+ python scripts/serve.py --host 0.0.0.0 --port 8000 --reload
+
+ # Initialize Airflow
+ python scripts/init_db.py
+ ```
+
+ ## 🔄 Airflow Orchestration
+
+ ```bash
+ # Set AIRFLOW_HOME
+ export AIRFLOW_HOME=$(pwd)/airflow
+
+ # Initialize database
+ python scripts/init_db.py
+
+ # Start services
+ airflow scheduler # Terminal 1
+ airflow webserver # Terminal 2
+
+ # Access UI: http://localhost:8080
+ ```
+
+ ## 📈 Monitoring Stack
+
+ - **Drift Detection**: KS test for numerical features
+ - **Prediction Logging**: JSONL format with threading
+ - **Performance Tracking**: Batch-level metrics
+ - **Report Generation**: Daily/weekly JSON reports
+ - **Prometheus Metrics**: Request count, latency, accuracy, drift status
+ - **Grafana Dashboards**: 5-panel visualization
+
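Editor's sketch: the drift detection item above names the KS (Kolmogorov-Smirnov) test for numerical features. The snippet below shows the idea for a single feature: compute the largest gap between the empirical CDFs of a reference sample and a current sample, then compare it with the standard large-sample critical value. This is a minimal standard-library version written for illustration. The function names are hypothetical, and the repository may rely on a library such as SciPy or Evidently instead.

```python
import math
from bisect import bisect_right
from typing import Sequence


def ks_statistic(reference: Sequence[float], current: Sequence[float]) -> float:
    """Two-sample KS statistic: the largest gap between the empirical CDFs."""
    if len(reference) == 0 or len(current) == 0:
        raise ValueError("both samples must be non-empty")
    ref = sorted(reference)
    cur = sorted(current)
    gap = 0.0
    # The maximum gap between two step functions occurs at an observed value.
    for value in ref + cur:
        ref_cdf = bisect_right(ref, value) / len(ref)
        cur_cdf = bisect_right(cur, value) / len(cur)
        gap = max(gap, abs(ref_cdf - cur_cdf))
    return gap


def drift_detected(reference: Sequence[float], current: Sequence[float],
                   alpha: float = 0.05) -> bool:
    """Flag drift when the statistic exceeds the asymptotic critical value.

    Uses c(alpha) * sqrt((n + m) / (n * m)) with
    c(alpha) = sqrt(-0.5 * ln(alpha / 2)), a large-sample approximation.
    """
    n, m = len(reference), len(current)
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    critical = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, current) > critical
```

For identical samples the statistic is 0 and no drift is flagged. For two samples with no overlap the statistic is 1. The approximation is unreliable for very small samples, where an exact test is the better choice.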
+ ## 🐳 Docker Services
+
+ - **FastAPI App** (8000): Main ML API
+ - **Prometheus** (9090): Metrics collection
+ - **Grafana** (3000): Visualization dashboards
+
+ ## 🔐 Environment Variables
+
+ ```env
+ MLFLOW_TRACKING_URI=your_dagshub_uri
+ DAGSHUB_TOKEN=your_token
+ ```
+
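Editor's sketch: a service that depends on the two variables above can check them once at startup and fail with a clear message when either is absent. The variable names come from the README. The helper name `load_tracking_config` and the choice to raise `RuntimeError` are assumptions made for this example, not the repository's code.

```python
import os
from typing import Dict, Mapping

# Variable names taken from the README's Environment Variables section.
REQUIRED_VARS = ("MLFLOW_TRACKING_URI", "DAGSHUB_TOKEN")


def load_tracking_config(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required settings, or raise if any is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Passing a plain dict instead of `os.environ` makes the check easy to exercise in tests without touching the real environment.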
+ ## 📚 Documentation
+
+ - [Docker Setup](DOCKER.md)
+ - [Scripts Usage](scripts/README.md)
+ - [CI/CD Workflows](.github/workflows/README.md)
+ - [Airflow Guide](airflow/README.md)
+
+ ## 🧪 CI/CD Pipeline
+
+ ### Automated Workflows
+ - **CI**: Lint with flake8, format check with black
+ - **Docker Build**: Build and push to GitHub Container Registry
+ - **HuggingFace Deploy**: Auto-deploy to Spaces on push
+
+ ### Container Images
+ ```bash
+ docker pull ghcr.io/abeshith/automl-mlops-pipeline:latest
+ ```
+
+ ## 📊 Project Structure
+
+ ```
+ AutoML-MLOps-PipeLine/
+ ├── src/mlpipeline/ # Core pipeline components
+ ├── app/ # FastAPI application
+ ├── config/ # Configuration files
+ ├── scripts/ # Utility scripts
+ ├── airflow/ # Airflow DAGs
+ ├── monitoring/ # Monitoring components
+ ├── observability/ # Prometheus/Grafana configs
+ ├── notebooks/ # Jupyter notebooks
+ ├── Dockerfile # Container definition
+ ├── docker-compose.yaml # Multi-service orchestration
+ └── requirements.txt # Python dependencies
+ ```
+
+ ## 🤝 Contributing
+
+ Contributions welcome! Please open an issue or submit a PR.
+
+ ## 📄 License
+
+ MIT License - see [LICENSE](LICENSE) file
+
+ ## 🔗 Links
+
+ - **GitHub**: https://github.com/Abeshith/AutoML-MLOps-PipeLine
+ - **HuggingFace Space**: https://huggingface.co/spaces/Abeshith/AutoML_MLOps_PipeLine
+ - **MLflow Tracking**: https://dagshub.com/abheshith7/AutoML-MLOps-PipeLine.mlflow
+
+ ## 👤 Author
+
+ **Abeshith**
+ - GitHub: [@Abeshith](https://github.com/Abeshith)
+ - HuggingFace: [@Abeshith](https://huggingface.co/Abeshith)
+
+ ---
+
+ ⭐ Star this repo if you find it helpful!