PayShield-ML / docs /DEVELOPMENT.md

Sibi Krishnamoorthy

prod

8a08300 25 days ago

🛠️ Development & MLOps Guide

This guide provides detailed instructions on setting up the environment, running experiments, and developing the PayShield-ML system.

🏗️ 1. Environment Setup

The project uses uv for Python package and project management.

Prerequisites

uv installed
Docker & Docker Compose
Redis (can be run via Docker)

Installation

# Sync dependencies and create virtual environment
uv sync

# Activate the environment
source .venv/bin/activate

📊 2. MLflow Tracking

We use MLflow to track hyperparameters, metrics, and model artifacts.

Start MLflow Server

Run this in a separate terminal to view the UI:

uv run mlflow ui --host 0.0.0.0 --port 5000

Then access the dashboard at http://localhost:5000.
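
A training run can report to this server from Python. The sketch below is an illustration, not the project's actual logging code: `flatten_params` and `log_run` are hypothetical helpers, and it assumes the tracking API is reachable at the same address as the UI. The default experiment name matches the documented default of `--experiment_name`.

```python
from typing import Any


def flatten_params(config: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten a nested config dict into dotted keys, e.g. {"model.max_depth": 6}."""
    flat: dict[str, Any] = {}
    for key, value in config.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested sections and extend the dotted prefix.
            flat.update(flatten_params(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def log_run(
    config: dict[str, Any],
    metrics: dict[str, float],
    experiment: str = "fraud_detection",
    tracking_uri: str = "http://localhost:5000",  # assumed: same address as the UI
) -> None:
    """Log one run's parameters and metrics to MLflow (hypothetical helper)."""
    import mlflow  # imported lazily so flatten_params works without MLflow installed

    mlflow.set_tracking_uri(tracking_uri)
    mlflow.set_experiment(experiment)
    with mlflow.start_run():
        mlflow.log_params(flatten_params(config))
        mlflow.log_metrics(metrics)
```

Calling `log_run({"model": {"max_depth": 6}}, {"recall": 0.85})` with made-up values should produce a run under the `fraud_detection` experiment, with `model.max_depth` listed as a parameter.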

🚂 3. Model Training Pipeline

The training script handles data ingestion, feature engineering, cross-validation, and MLflow logging.

Basic Training

uv run python src/models/train.py --data_path data/fraud_sample.csv

Advanced Training with Custom Params

uv run python src/models/train.py \
    --data_path data/fraud_sample.csv \
    --experiment_name fraud_v2 \
    --min_recall 0.85 \
    --output_dir models/

| Argument | Description | Default |
|---|---|---|
| `--data_path` | Path to CSV/Parquet training data | (Required) |
| `--params_path` | Path to model config YAML | `configs/model_config.yaml` |
| `--experiment_name` | MLflow experiment grouping | `fraud_detection` |
| `--min_recall` | Target recall for threshold optimization | `0.80` |
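
To illustrate what optimizing a threshold against a target recall can mean, the sketch below picks the highest score threshold whose recall still meets the target. This is one common approach and not necessarily what `train.py` does. `pick_threshold` is a hypothetical name, and the labels and scores in the usage note are toy values.

```python
from typing import Sequence


def pick_threshold(
    y_true: Sequence[int],
    scores: Sequence[float],
    min_recall: float = 0.80,
) -> float:
    """Return the highest threshold t where flagging score >= t gives recall >= min_recall."""
    if not 0 < min_recall <= 1:
        raise ValueError("min_recall must be in (0, 1]")
    positives = sum(y_true)
    if positives == 0:
        raise ValueError("need at least one positive label to compute recall")

    # Walk candidate thresholds from strictest to most lenient.
    for t in sorted(set(scores), reverse=True):
        true_pos = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= t)
        if true_pos / positives >= min_recall:
            return t

    # The lowest score always yields recall 1.0, so the loop returns before this.
    raise AssertionError("unreachable")
```

With labels `[0, 0, 1, 1, 1, 0, 1, 1]` and scores `[0.1, 0.4, 0.35, 0.8, 0.9, 0.2, 0.6, 0.7]`, a target recall of 0.80 selects a threshold of 0.6, which catches four of the five positives. A higher threshold generally means fewer false alarms, which is why the search starts from the top.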

🔌 4. Running Services Locally

Run the services directly on the host for rapid iteration without rebuilding Docker images.

Start Redis (Required)

docker run -d --name payshield-redis -p 6379:6379 redis:7-alpine

Start FastAPI Backend

# Running with hot-reload
uv run uvicorn src.api.main:app --host 0.0.0.0 --port 8000 --reload

Swagger Docs: http://localhost:8000/docs

Start Streamlit Dashboard

export API_URL="http://localhost:8000/v1/predict"
uv run streamlit run src/frontend/app.py --server.port 8501

Dashboard: http://localhost:8501
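
With the API running, the prediction endpoint can also be called from Python. The sketch below uses only the standard library and reads the same `API_URL` variable the dashboard uses. It assumes the endpoint accepts a JSON `POST`. `build_request` and `predict` are hypothetical helpers, and the real request schema should be taken from the Swagger docs.

```python
import json
import os
import urllib.request

# Falls back to the URL exported for the dashboard in this guide.
API_URL = os.environ.get("API_URL", "http://localhost:8000/v1/predict")


def build_request(payload: dict, url: str = API_URL) -> urllib.request.Request:
    """Build a JSON POST request for the prediction endpoint (POST is an assumption)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(payload: dict, timeout: float = 5.0) -> dict:
    """Send the payload to the API and decode the JSON response."""
    with urllib.request.urlopen(build_request(payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A call such as `predict({...})` needs a payload that matches the API's request model. Any field names used for a quick smoke test are placeholders until checked against `http://localhost:8000/docs`.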

🐳 5. Full-Stack Development (Docker Compose)

Docker Compose is the easiest way to replicate a production-like environment.

Build and Launch

docker-compose up --build

Useful Commands

# Run in background
docker-compose up -d

# Check all service logs
docker-compose logs -f

# Stop and remove containers
docker-compose down

Service Map

| Service | Port | Endpoint |
|---|---|---|
| API | 8000 | `http://localhost:8000` |
| Dashboard | 8501 | `http://localhost:8501` |
| Redis | 6379 | `localhost:6379` |
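
After the stack starts, a quick way to confirm that each service is listening is to try a TCP connection to its port. The sketch below takes its ports from the service map above and assumes everything is published on `localhost`. `is_port_open` and `check_services` are hypothetical helper names.

```python
import socket

# Ports from the service map; the host is assumed to be localhost.
SERVICES = {"API": 8000, "Dashboard": 8501, "Redis": 6379}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


def check_services(host: str = "localhost") -> dict[str, bool]:
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in SERVICES.items()}
```

An open port only shows that something is listening. It does not confirm that the service behind it is healthy, so `docker-compose logs -f` remains the place to look when a service misbehaves.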

🧪 6. Testing

We use pytest for unit and integration tests.

# Run all tests
uv run pytest

# Run with coverage report
uv run pytest --cov=src

Note: The API needs the models/fraud_model.pkl and models/threshold.json artifacts to be present before it starts. Both are generated by the training pipeline.
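
A small preflight check can catch missing artifacts before the API is launched. The sketch below is a suggestion and not part of the repository. `missing_artifacts` is a hypothetical helper, and the file names are the two artifacts named in the note.

```python
from pathlib import Path

# Artifact file names from the note above.
REQUIRED_ARTIFACTS = ("fraud_model.pkl", "threshold.json")


def missing_artifacts(model_dir: str = "models") -> list[str]:
    """Return the required artifact paths that do not exist under model_dir."""
    base = Path(model_dir)
    return [
        str(base / name)
        for name in REQUIRED_ARTIFACTS
        if not (base / name).is_file()
    ]
```

If the returned list is non-empty, running the training pipeline should produce the files.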