# 🛠️ Development & MLOps Guide

This guide provides detailed instructions on setting up the environment, running experiments, and developing the PayShield-ML system.

---
## 🏗️ 1. Environment Setup

The project uses `uv` for lightning-fast Python package and project management.

### Prerequisites

- [uv](https://github.com/astral-sh/uv) installed
- Docker & Docker Compose
- Redis (can be run via Docker)

### Installation

```bash
# Sync dependencies and create virtual environment
uv sync

# Activate the environment
source .venv/bin/activate
```

---
## 📊 2. MLflow Tracking

We use MLflow to track hyperparameters, metrics, and model artifacts.

### Start MLflow Server

Run this in a separate terminal to view the UI:

```bash
uv run mlflow ui --host 0.0.0.0 --port 5000
```

Then access the dashboard at [http://localhost:5000](http://localhost:5000).

---
## 🚂 3. Model Training Pipeline

The training script handles data ingestion, feature engineering, cross-validation, and MLflow logging.

### Basic Training

```bash
uv run python src/models/train.py --data_path data/fraud_sample.csv
```

### Advanced Training with Custom Params

```bash
uv run python src/models/train.py \
  --data_path data/fraud_sample.csv \
  --experiment_name fraud_v2 \
  --min_recall 0.85 \
  --output_dir models/
```

| Argument | Description | Default |
| :--- | :--- | :--- |
| `--data_path` | Path to CSV/Parquet training data | (Required) |
| `--params_path` | Path to model config YAML | `configs/model_config.yaml` |
| `--experiment_name` | MLflow experiment grouping | `fraud_detection` |
| `--min_recall` | Target recall for threshold optimization | `0.80` |
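
The `--min_recall` argument implies a threshold search over validation scores. How `train.py` does this is not shown here, so the sketch below is only one common approach: assuming binary labels and predicted fraud probabilities, it takes the highest threshold whose recall still meets the target, which tends to keep false positives low. The function name `pick_threshold` and the sample data are hypothetical.

```python
from typing import Sequence


def pick_threshold(
    y_true: Sequence[int], scores: Sequence[float], min_recall: float = 0.80
) -> float:
    """Return the highest score threshold whose recall is >= min_recall.

    A transaction counts as flagged when score >= threshold.
    """
    positives = sum(y_true)
    if positives == 0:
        raise ValueError("need at least one positive label to compute recall")

    # Candidate thresholds are the observed scores, tried from highest to lowest.
    for threshold in sorted(set(scores), reverse=True):
        true_positives = sum(
            1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold
        )
        if true_positives / positives >= min_recall:
            return threshold

    # Only reached when min_recall > 1; fall back to flagging everything.
    return min(scores)


# Hypothetical validation labels and model scores (not real project data).
labels = [0, 0, 1, 0, 1, 1, 0, 1, 1, 0]
scores = [0.05, 0.20, 0.35, 0.40, 0.55, 0.65, 0.70, 0.80, 0.90, 0.10]
print(pick_threshold(labels, scores, min_recall=0.80))  # prints 0.55
```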

---

## 🔌 4. Running Services Locally

Run the services directly on the host for rapid iteration without rebuilding Docker images.

### Start Redis (Required)

```bash
docker run -d --name payshield-redis -p 6379:6379 redis:7-alpine
```

### Start FastAPI Backend

```bash
# Running with hot-reload
uv run uvicorn src.api.main:app --host 0.0.0.0 --port 8000 --reload
```

- **Swagger Docs:** [http://localhost:8000/docs](http://localhost:8000/docs)

### Start Streamlit Dashboard

```bash
export API_URL="http://localhost:8000/v1/predict"
uv run streamlit run src/frontend/app.py --server.port 8501
```

- **Dashboard:** [http://localhost:8501](http://localhost:8501)
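
The dashboard reads the prediction endpoint from `API_URL`, and a script can follow the same convention. The sketch below builds (but does not send) a JSON POST request using only the standard library. The helper name `build_predict_request` and the payload fields are hypothetical, since the request schema is not documented in this guide; the Swagger docs at `/docs` should list the real fields.

```python
import json
import os
import urllib.request
from typing import Optional

# Same value as the export above; override it via the API_URL environment variable.
DEFAULT_API_URL = "http://localhost:8000/v1/predict"


def build_predict_request(
    payload: dict, api_url: Optional[str] = None
) -> urllib.request.Request:
    """Build a JSON POST request for the prediction endpoint."""
    url = api_url or os.environ.get("API_URL", DEFAULT_API_URL)
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical transaction fields, for illustration only.
request = build_predict_request({"amount": 129.99, "merchant_id": "m_123"})
print(request.get_method(), request.full_url)

# With the API running, the request could be sent like this:
# with urllib.request.urlopen(request, timeout=5) as response:
#     print(json.load(response))
```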

---

## 🐳 5. Full-Stack Development (Docker Compose)

The easiest way to replicate a production-like environment.

### Build and Launch

```bash
docker-compose up --build
```

### Useful Commands

```bash
# Run in background
docker-compose up -d

# Check all service logs
docker-compose logs -f

# Stop and remove containers
docker-compose down
```

### Service Map

| Service | Port | Endpoint |
| :--- | :--- | :--- |
| **API** | 8000 | `http://localhost:8000` |
| **Dashboard** | 8501 | `http://localhost:8501` |
| **Redis** | 6379 | `localhost:6379` |
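
To confirm that each service in the map is listening, a plain TCP connect is usually enough. The sketch below uses only the standard library. The helper name `is_port_open` is hypothetical and the ports are taken from the table above. A successful connect only shows that something accepts connections on that port, not that the service is healthy.

```python
import socket

# Ports from the service map.
SERVICES = {"API": 8000, "Dashboard": 8501, "Redis": 6379}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


for name, port in SERVICES.items():
    status = "up" if is_port_open("localhost", port) else "down"
    print(f"{name:<10} localhost:{port:<5} {status}")
```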

---

## 🧪 6. Testing

We use `pytest` for unit and integration tests.

```bash
# Run all tests
uv run pytest

# Run with coverage report
uv run pytest --cov=src
```

---
**Note:** Ensure you have the `models/fraud_model.pkl` and `models/threshold.json` artifacts present before starting the API. These are generated by the training pipeline.
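
A small preflight check can catch missing artifacts before the API starts. This is a sketch, not part of the PayShield-ML codebase: the function name `missing_artifacts` is hypothetical, and the two paths come from the note above, assumed to be relative to the project root.

```python
from pathlib import Path
from typing import List

# Artifact paths named in the note above, assumed relative to the project root.
REQUIRED_ARTIFACTS = ("models/fraud_model.pkl", "models/threshold.json")


def missing_artifacts(project_root: str = ".") -> List[str]:
    """Return the required artifact paths that do not exist under project_root."""
    root = Path(project_root)
    return [path for path in REQUIRED_ARTIFACTS if not (root / path).is_file()]


missing = missing_artifacts()
if missing:
    print("Missing artifacts, run the training pipeline first:", ", ".join(missing))
else:
    print("All model artifacts found.")
```

Running a check like this before `uvicorn` turns a load-time failure into an explicit message naming the missing file.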