---
title: Fraud Detection MLOps API
emoji: 🚨
colorFrom: blue
colorTo: green
sdk: docker
app_port: 8000
---
# Fraud Detection MLOps Pipeline

Production-style end-to-end fraud detection system with training, experiment tracking, API serving, containerization, CI/CD, and runtime monitoring.
## Highlights

- End-to-end ML lifecycle: data validation -> preprocessing -> training -> threshold tuning -> API inference.
- Imbalanced classification handling with recall-first model ranking.
- MLflow experiment tracking and artifact logging.
- FastAPI inference service with single/batch prediction endpoints.
- Dockerized deployment with health checks and non-root runtime.
- CI/CD with automated tests, coverage gates, image build, and HF deployment sync.
- Runtime observability via request IDs, structured logs, and `/metrics`.
## Live Deployment

- Hugging Face Space: https://thasvithu-fraud-detection-mlops-api.hf.space
- API Docs: https://thasvithu-fraud-detection-mlops-api.hf.space/docs
## Architecture

```mermaid
flowchart LR
    A[Raw Data<br/>data/raw/creditcard.csv] --> B[Data Validation<br/>src/data_ingestion.py]
    B --> C[Preprocessing<br/>src/preprocessing.py]
    C --> D[Model Training<br/>src/train.py]
    D --> E[Evaluation + Threshold Tuning<br/>src/evaluate.py]
    E --> F[Artifacts<br/>models/*.pkl<br/>artifacts/*.json]
    F --> G[Inference Service<br/>api/service.py]
    G --> H[FastAPI App<br/>api/app.py]
    H --> I["/predict"]
    H --> J["/predict/batch"]
    H --> K["/health"]
    H --> L["/metrics"]
```
## ML Training Workflow

```mermaid
flowchart TD
    T1[Load Config<br/>configs/train.yaml] --> T2[Validate Dataset]
    T2 --> T3[Split + Scale + Imbalance Handling]
    T3 --> T4[Train Candidate Models]
    T4 --> T5[Compute Metrics]
    T5 --> T6[Log Runs to MLflow]
    T6 --> T7[Rank by recall -> precision -> roc_auc]
    T7 --> T8[Select Best Model]
    T8 --> T9[Threshold Sweep + Selection]
    T9 --> T10[Save model + preprocessor + reports]
```
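The ranking and threshold-sweep steps above can be sketched in a few lines. This is an illustrative toy, not the actual `src/train.py` / `src/evaluate.py` code: the metric values are made up, and picking the F1-maximizing threshold is one plausible selection rule, not necessarily the one the pipeline uses.

```python
# Toy sketch: recall-first model ranking plus a simple threshold sweep.
# Metric values are illustrative; the real pipeline computes them from
# held-out predictions.

def rank_models(results):
    """Rank candidates by recall, then precision, then ROC-AUC."""
    return sorted(
        results,
        key=lambda r: (r["recall"], r["precision"], r["roc_auc"]),
        reverse=True,
    )

def sweep_threshold(y_true, y_prob, steps=99):
    """Pick the decision threshold maximizing F1 over an even grid."""
    best_t, best_f1 = 0.5, -1.0
    for i in range(1, steps + 1):
        t = i / (steps + 1)
        y_pred = [int(p >= t) for p in y_prob]
        tp = sum(1 for yt, yp in zip(y_true, y_pred) if yt == 1 and yp == 1)
        fp = sum(1 for yt, yp in zip(y_true, y_pred) if yt == 0 and yp == 1)
        fn = sum(1 for yt, yp in zip(y_true, y_pred) if yt == 1 and yp == 0)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1

candidates = [
    {"name": "logreg", "recall": 0.88, "precision": 0.72, "roc_auc": 0.96},
    {"name": "xgboost", "recall": 0.91, "precision": 0.70, "roc_auc": 0.98},
]
print(rank_models(candidates)[0]["name"])  # xgboost wins on recall
```

Ranking by a `(recall, precision, roc_auc)` tuple encodes the recall-first priority directly: precision and ROC-AUC only break ties.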
## Inference Request Flow

```mermaid
sequenceDiagram
    autonumber
    participant Client
    participant API as FastAPI
    participant Svc as InferenceService
    participant Art as Artifacts
    Client->>API: POST /predict (transaction payload)
    API->>Svc: load_inference_service() [cached]
    Svc->>Art: model.pkl + preprocessor.pkl + threshold reports
    Svc-->>API: prediction + probability + risk level
    API-->>Client: JSON response (+ request headers)
```
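The `[cached]` step in the diagram implies artifacts are loaded from disk once and reused across requests. A minimal sketch of that pattern, using dummy stand-ins for the pickled model and preprocessor (the real `api/service.py` may differ in detail):

```python
from functools import lru_cache

class _DummyModel:
    """Stand-in for the pickled classifier; returns a fixed fraud score."""
    def predict_proba_one(self, row):
        return 0.87

class _Identity:
    """Stand-in for the pickled preprocessor."""
    def transform(self, row):
        return row

class InferenceService:
    def __init__(self, model, preprocessor, threshold):
        self.model = model
        self.preprocessor = preprocessor
        self.threshold = threshold

    def predict(self, row):
        prob = self.model.predict_proba_one(self.preprocessor.transform(row))
        return {
            "probability": prob,
            "is_fraud": prob >= self.threshold,
            "risk_level": "high" if prob >= self.threshold else "low",
        }

@lru_cache(maxsize=1)
def load_inference_service():
    # The real loader would joblib.load() models/*.pkl and read the tuned
    # threshold from artifacts/*.json; dummies keep this snippet runnable.
    return InferenceService(_DummyModel(), _Identity(), threshold=0.5)

svc = load_inference_service()
assert svc is load_inference_service()  # same cached instance every call
print(svc.predict({"Amount": 149.62})["risk_level"])  # high
```

With `lru_cache(maxsize=1)` the expensive deserialization cost is paid only on the first request after startup.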
## CI/CD and Deployment Workflows

```mermaid
flowchart LR
    P[Push / PR] --> C1[ci.yml]
    C1 --> C2[Test + Coverage Gate]
    C2 --> C3[Build Docker Image]
    C3 --> C4[Optional Deploy Webhook]
    M[Push main] --> H1[deploy-hf-space.yml]
    H1 --> H2[Snapshot Sync to HF Space]
    S[Schedule Mon/Wed/Fri] --> K1[keepalive-hf-space.yml]
    K1 --> K2[Ping /health and /metrics]
```
## Project Structure

```
fraud-detection-mlops-pipeline/
├── api/
│   ├── app.py
│   ├── schemas.py
│   └── service.py
├── src/
│   ├── data_ingestion.py
│   ├── preprocessing.py
│   ├── train.py
│   ├── evaluate.py
│   ├── predict.py
│   └── register_model.py
├── configs/
│   ├── train.yaml
│   └── logging.yaml
├── data/
│   ├── raw/
│   └── processed/
├── models/
├── artifacts/
├── tests/
├── .github/workflows/
│   ├── ci.yml
│   ├── deploy-hf-space.yml
│   └── keepalive-hf-space.yml
├── Dockerfile
├── docker-compose.yml
├── requirements.txt
└── pytest.ini
```
## Tech Stack
- Python 3.11
- Pandas, NumPy, scikit-learn, imbalanced-learn, XGBoost
- MLflow
- FastAPI + Pydantic
- Docker + Docker Compose
- GitHub Actions
- Hugging Face Spaces (Docker SDK)
## API Endpoints

- `GET /health`: Service and model readiness
- `GET /metrics`: Runtime operational counters
- `POST /predict`: Single transaction prediction
- `POST /predict/batch`: Batch transaction predictions
- `GET /docs`: Swagger UI
### Example: Single Prediction

```bash
BASE="https://thasvithu-fraud-detection-mlops-api.hf.space"
curl -X POST "$BASE/predict" \
  -H "Content-Type: application/json" \
  -d '{
    "Time": 0,
    "Amount": 149.62,
    "V1": -1.359807, "V2": -0.072781, "V3": 2.536347, "V4": 1.378155,
    "V5": -0.338321, "V6": 0.462388, "V7": 0.239599, "V8": 0.098698,
    "V9": 0.363787, "V10": 0.090794, "V11": -0.551600, "V12": -0.617801,
    "V13": -0.991390, "V14": -0.311169, "V15": 1.468177, "V16": -0.470401,
    "V17": 0.207971, "V18": 0.025791, "V19": 0.403993, "V20": 0.251412,
    "V21": -0.018307, "V22": 0.277838, "V23": -0.110474, "V24": 0.066928,
    "V25": 0.128539, "V26": -0.189115, "V27": 0.133558, "V28": -0.021053
  }'
```
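The same 30-field payload (`Time`, `Amount`, `V1`..`V28`) can be assembled programmatically in Python. The zero values below are placeholders, not a realistic transaction, and the HTTP call is left commented out so the snippet runs offline:

```python
import json

# Build the payload /predict expects: Time, Amount, and the 28
# PCA-transformed features V1..V28 (zeros here are placeholders).
payload = {"Time": 0, "Amount": 149.62}
payload.update({f"V{i}": 0.0 for i in range(1, 29)})

body = json.dumps(payload)
print(len(payload))  # 30 fields

# import requests
# resp = requests.post(
#     "https://thasvithu-fraud-detection-mlops-api.hf.space/predict",
#     data=body,
#     headers={"Content-Type": "application/json"},
# )
# print(resp.json())
```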
## Local Setup

### Prerequisites

- Python 3.11+
- uv
- Docker (optional, for container run)
### Install

```bash
uv pip install -r requirements.txt
```

### Train

```bash
uv run python -m src.train
```

### Test

```bash
uv run pytest
```

### Run API

```bash
uv run uvicorn api.app:app --reload --host 0.0.0.0 --port 8000
```
## Docker Usage

### Build

```bash
docker build -t fraud-detection-api:latest .
```

### Run

```bash
docker run --rm -p 8000:8000 fraud-detection-api:latest
```

### Compose

```bash
docker compose up --build
```
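A Compose service wiring the `/health` endpoint into a container health check could look like the fragment below. This is an illustrative sketch, not the repository's actual `docker-compose.yml`:

```yaml
services:
  api:
    build: .
    ports:
      - "8000:8000"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 5s
      retries: 3
```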
## Quality Gates

- Test coverage enforced via `pytest.ini`
- Minimum coverage: >= 80% across `src` + `api`
- Current status: passing (see GitHub Actions)
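A `pytest.ini` enforcing such a gate typically relies on the pytest-cov plugin; an illustrative fragment (the repository's actual file may differ):

```ini
[pytest]
addopts = --cov=src --cov=api --cov-fail-under=80
testpaths = tests
```

With `--cov-fail-under=80`, pytest exits non-zero when combined coverage drops below 80%, which is what lets CI treat coverage as a hard gate.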
## Monitoring and Operations

Runtime metrics exposed by `/metrics`:

- `total_requests`
- `error_count`
- `error_rate`
- `total_predictions`
- `fraud_predictions`
- `fraud_prediction_rate`
- `avg_latency_ms`

Request-level observability:

- `X-Request-ID` and `X-Process-Time-Ms` response headers
- Structured JSON logs for request and prediction events
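The counters above can be maintained with a small in-process registry updated from middleware. A toy sketch of that bookkeeping; the actual implementation in `api/app.py` may differ:

```python
class Metrics:
    """In-process counters backing a /metrics-style endpoint."""

    def __init__(self):
        self.total_requests = 0
        self.error_count = 0
        self.total_predictions = 0
        self.fraud_predictions = 0
        self._latency_sum_ms = 0.0

    def record(self, latency_ms, error=False, prediction=None):
        """Call once per request; prediction is 1 (fraud), 0, or None."""
        self.total_requests += 1
        self._latency_sum_ms += latency_ms
        if error:
            self.error_count += 1
        if prediction is not None:
            self.total_predictions += 1
            if prediction == 1:
                self.fraud_predictions += 1

    def snapshot(self):
        reqs = self.total_requests or 1      # avoid division by zero
        preds = self.total_predictions or 1
        return {
            "total_requests": self.total_requests,
            "error_count": self.error_count,
            "error_rate": self.error_count / reqs,
            "total_predictions": self.total_predictions,
            "fraud_predictions": self.fraud_predictions,
            "fraud_prediction_rate": self.fraud_predictions / preds,
            "avg_latency_ms": self._latency_sum_ms / reqs,
        }

m = Metrics()
m.record(12.5, prediction=1)
m.record(7.5, prediction=0)
print(m.snapshot()["fraud_prediction_rate"])  # 0.5
```

In a real FastAPI app a single `Metrics` instance would be updated from an HTTP middleware (which also stamps `X-Request-ID` and `X-Process-Time-Ms`) and serialized by the `/metrics` handler.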
## GitHub Actions Workflows

- `ci.yml`: test + coverage + image build (+ optional webhook deploy)
- `deploy-hf-space.yml`: sync `main` to Hugging Face Space
- `keepalive-hf-space.yml`: scheduled pings to reduce Space inactivity sleep
## Required GitHub Secrets

For Hugging Face deploy:

- `HF_TOKEN`
- `HF_SPACE_REPO` (format: `username/space-name`)

For HF keepalive:

- `HF_SPACE_URL`

Optional webhook deploy:

- `DEPLOY_WEBHOOK_URL`
## Milestone Status
All planned phases (0-9) are complete:
- Foundation
- Data validation
- Preprocessing
- Training + MLflow tracking
- Evaluation + threshold tuning
- FastAPI inference service
- Testing + quality gates
- Containerization
- CI/CD automation
- Monitoring and operations
## License

MIT (see LICENSE)