PayShield-ML / .agent /rules /architecture.md

Sibi Krishnamoorthy

prod

8a08300 4 months ago

trigger: always_on

architecture.md documents the high-level system design, data flow, and infrastructure decisions for this repository. It is intended as a reference for technical interviews and system design reviews.

🏗️ System Architecture: Real-Time Fraud Detection

This document outlines the end-to-end architecture of the Fraud Detection System, designed for low-latency inference (<50ms), high throughput, and explainability.

The system loosely follows the Lambda Architecture pattern, splitting workflows into an Offline Training Layer (batch processing) and an Online Serving Layer (real-time inference).

1. High-Level Diagram

graph TD
    User[Payment Gateway] -->|POST /predict| API[FastAPI Inference Service]
    
    subgraph "Online Serving Layer"
        API -->|1. Get Velocity| Redis[(Redis Feature Store)]
        API -->|2. Compute Features| Preproc[Scikit-Learn Pipeline]
        Preproc -->|3. Inference| Model[XGBoost Classifier]
        Model -->|4. Log Prediction| Logger[Async Logger]
    end
    
    subgraph "Offline Training Layer"
        DW[(Data Warehouse)] -->|Batch Load| Trainer[Training Pipeline]
        Trainer -->|Track Experiments| MLflow[MLflow Server]
        Trainer -->|Save Artifact| Registry[Model Registry]
        Registry -->|Load .pkl| API
    end
    
    subgraph "Explainability & Monitoring"
        Logger -->|Drift Analysis| Dashboard[Streamlit / Grafana]
        API -->|Request Waterfall| SHAP[SHAP Explainer]
    end

2. Component Breakdown

🟢 Online Serving Layer (Real-Time)

The critical path for live transactions. Designed for sub-50ms latency.

- FastAPI Service: Asynchronous Python web server. Handles request validation (Pydantic) and orchestrates the inference flow.
- Redis Feature Store:
  - Role: Solves the "Stateful Feature" problem: features such as transaction velocity depend on a user's earlier transactions, which are not in the request payload.
  - Implementation: Uses Redis Sorted Sets (ZSET) to maintain a rolling window of transaction counts for the last 24 hours.
  - Latency: <2ms lookups via pipelining.
- Inference Engine:
  - Model: XGBoost Classifier (serialized via Joblib).
  - Pipeline: ColumnTransformer (WOEEncoder + RobustScaler) ensures raw input is transformed exactly as it was during training.
  - Shadow Mode: A configuration toggle that lets the model run silently alongside legacy rules so its decisions can be compared against theirs.
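The rolling-window idea behind the Redis Feature Store can be sketched as follows. This is a minimal illustration, not the project's code: `InMemoryZSet` is a hypothetical stand-in that mimics only the `ZADD`, `ZREMRANGEBYSCORE`, and `ZCARD` commands so the example runs without a server, and `record_and_count` is an invented helper name. Counting the current transaction in the velocity is also an assumption.

```python
import time

WINDOW_SECONDS = 24 * 60 * 60  # 24-hour rolling window


class InMemoryZSet:
    """Hypothetical stand-in for a Redis client; implements only the
    sorted-set commands used below so the sketch runs without a server."""

    def __init__(self):
        self._sets = {}  # key -> {member: score}

    def zadd(self, key, mapping):
        self._sets.setdefault(key, {}).update(mapping)

    def zremrangebyscore(self, key, min_score, max_score):
        zset = self._sets.get(key, {})
        stale = [m for m, s in zset.items() if min_score <= s <= max_score]
        for member in stale:
            del zset[member]
        return len(stale)

    def zcard(self, key):
        return len(self._sets.get(key, {}))


def record_and_count(client, user_id, tx_id, now=None):
    """Add a transaction, drop entries older than the window, return the count."""
    now = time.time() if now is None else now
    key = f"user:{user_id}:tx_history"
    client.zadd(key, {tx_id: now})                         # score = timestamp
    client.zremrangebyscore(key, 0, now - WINDOW_SECONDS)  # trim stale entries
    return client.zcard(key)                               # velocity


store = InMemoryZSet()
t0 = 1_700_000_000
record_and_count(store, "u8312", "tx1", now=t0)
record_and_count(store, "u8312", "tx2", now=t0 + 3600)
velocity = record_and_count(store, "u8312", "tx3", now=t0 + WINDOW_SECONDS + 60)
print(velocity)  # 2: tx1 has aged out of the window
```

Against a real Redis instance, the same three commands could be queued on one pipeline to save round trips, which is presumably what the latency note refers to.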

🔵 Offline Training Layer (Batch)

Where models are built, validated, and versioned.

- Data Ingestion: Loads historical transaction logs (CSV/Parquet).
- Feature Engineering:
  - Calculates Haversine distances.
  - Generates cyclical time encodings (hour_sin, hour_cos).
  - Computes historical aggregates for the Feature Store.
  - See the notebook write-up at notebook/credit-card-fraud-detection.md for more information.
- Experiment Tracking (MLflow):
  - Logs hyperparameters (max_depth, learning_rate).
  - Logs metrics: Precision, Recall, PR-AUC.
  - Stores the model artifact and the optimized decision threshold (e.g., 0.895).
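The two feature-engineering transforms named above (Haversine distance and cyclical hour encoding) can be sketched with the standard library. The function names are illustrative, and which pair of coordinates the project measures between is not stated here, so the example simply takes two arbitrary points.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def encode_hour(hour):
    """Map an hour in [0, 24) onto the unit circle as (hour_sin, hour_cos)."""
    angle = 2 * math.pi * hour / 24
    return math.sin(angle), math.cos(angle)


def circle_gap(h1, h2):
    """Euclidean distance between two encoded hours."""
    (s1, c1), (s2, c2) = encode_hour(h1), encode_hour(h2)
    return math.hypot(s1 - s2, c1 - c2)


one_degree_at_equator = haversine_km(0.0, 0.0, 0.0, 1.0)  # roughly 111.19 km
# 23:00 and 00:00 are as close on the circle as 00:00 and 01:00.
wraps_correctly = math.isclose(circle_gap(23, 0), circle_gap(0, 1))
```

The cyclical encoding is used because a raw hour value would place 23:00 and 00:00 at opposite ends of the scale, even though they are one hour apart.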

🟣 Explainability Layer (Compliance)

Ensures decisions are transparent and auditable.

- SHAP (SHapley Additive exPlanations):
  - Global: Summary plots to understand model drivers.
  - Local: Waterfall plots generated on-demand for specific high-risk blocks.
  - Optimization: Uses the TreeExplainer with the "JSON Serialization Patch" to handle XGBoost 2.0+ metadata compatibility.
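To show what a local waterfall plot displays, the sketch below computes exact Shapley values for a toy scoring function by enumerating feature coalitions. This is not TreeExplainer and not the project's model: `toy_score`, the instance, and the baseline are all hypothetical. The point is the additivity property: the per-feature contributions sum to the gap between the prediction and the baseline score.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values; features outside a coalition take baseline values."""
    names = list(instance)
    n = len(names)

    def value(coalition):
        mixed = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return model(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


def toy_score(x):
    """Hypothetical risk score with an interaction term (not the real model)."""
    return 0.05 * x["tx_24h"] + 0.002 * x["amt"] + 0.001 * x["tx_24h"] * x["amt"]


instance = {"tx_24h": 5, "amt": 150.0}
baseline = {"tx_24h": 1, "amt": 50.0}
phi = shapley_values(toy_score, instance, baseline)
gap = toy_score(instance) - toy_score(baseline)
# Additivity: contributions sum to (prediction - baseline score).
additive = abs(sum(phi.values()) - gap) < 1e-9
```

Enumeration is exponential in the number of features, so it only works for toy cases; tree-specific explainers exist to avoid that cost on real models.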

3. Data Flow Lifecycle

Step 1: Request Ingestion

The Payment Gateway sends a raw transaction payload:

{
  "user_id": "u8312",
  "amt": 150.00,
  "lat": 40.7128,
  "long": -74.0060,
  "category": "grocery_pos",
  "job": "systems_analyst"
}
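A payload like this could be validated before it enters the pipeline. The component breakdown names Pydantic for that job; the sketch below swaps in a standard-library dataclass so it runs without dependencies. The `Transaction` class, `parse_transaction` helper, and the range checks are assumptions for illustration, not rules taken from the project.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    user_id: str
    amt: float
    lat: float
    long: float
    category: str
    job: str

    def __post_init__(self):
        # Assumed sanity checks; the real service may enforce different rules.
        if self.amt < 0:
            raise ValueError("amt must be non-negative")
        if not -90 <= self.lat <= 90:
            raise ValueError("lat must be within [-90, 90]")
        if not -180 <= self.long <= 180:
            raise ValueError("long must be within [-180, 180]")


def parse_transaction(raw_json):
    """Parse a JSON payload; unknown or missing keys raise TypeError."""
    return Transaction(**json.loads(raw_json))


payload = (
    '{"user_id": "u8312", "amt": 150.00, "lat": 40.7128, '
    '"long": -74.0060, "category": "grocery_pos", "job": "systems_analyst"}'
)
tx = parse_transaction(payload)
```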

Step 2: Real-Time Feature Enrichment

The API queries Redis to fetch dynamic context that isn't in the payload:

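Query: ZCARD user:u8312:tx_history (transactions in the last 24h, assuming entries older than 24 hours have already been trimmed from the sorted set; ZCARD itself counts every member)
Result: 5 (Velocity)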

Step 3: Preprocessing & Inference

The combined vector (Raw + Redis Features) is passed to the Pipeline:

Imputation: Handle missing values.
Encoding: category -> Weight of Evidence (Float).
Scaling: amt -> RobustScaler (centered/scaled).
Prediction: XGBoost outputs probability 0.982.
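The encoding and scaling steps can be illustrated with small standard-library versions. This is a sketch under assumptions: the Weight of Evidence formula below is one common convention (log ratio of a category's share among fraud cases to its share among legitimate cases, with additive smoothing), and the library encoder used in the pipeline may regularize differently. The scaler follows the usual median and interquartile-range definition. All sample data is made up.

```python
import math
from collections import Counter
from statistics import quantiles


def fit_woe(categories, labels, smoothing=0.5):
    """Return {category: WOE}, where WOE = ln(share among fraud / share among legit)."""
    pos = Counter(c for c, y in zip(categories, labels) if y == 1)
    neg = Counter(c for c, y in zip(categories, labels) if y == 0)
    n_pos, n_neg = sum(pos.values()), sum(neg.values())
    levels = set(categories)
    k = len(levels)
    table = {}
    for c in levels:
        p_pos = (pos[c] + smoothing) / (n_pos + smoothing * k)
        p_neg = (neg[c] + smoothing) / (n_neg + smoothing * k)
        table[c] = math.log(p_pos / p_neg)
    return table


def fit_robust_scaler(values):
    """Return (median, IQR) using linearly interpolated quartiles."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return median, q3 - q1


def robust_scale(x, median, iqr):
    """Center on the median and scale by the interquartile range."""
    return (x - median) / iqr


# Made-up training data: label 1 = fraud, 0 = legitimate.
cats = ["grocery_pos"] * 4 + ["shopping_net"] * 4
labels = [0, 0, 0, 1, 1, 1, 1, 0]
woe = fit_woe(cats, labels)

median, iqr = fit_robust_scaler([10.0, 20.0, 30.0, 40.0, 150.0])
scaled = robust_scale(150.0, median, iqr)  # (150 - 30) / 20 = 6.0
```

Both fitted objects (the WOE table and the median/IQR pair) come from training data only, which is why the serving pipeline has to load the same fitted transformer rather than refit on live traffic.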

Step 4: Decision & Action

Threshold Check: 0.982 > 0.895 → FRAUD.
Shadow Mode Check: If enabled, return APPROVE but log [SHADOW_BLOCK].
Response (Shadow Mode disabled):

{
  "status": "BLOCK",
  "risk_score": 0.982,
  "latency_ms": 35
}
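The decision logic in this step can be sketched as a single function. The threshold and status strings come from the example above; the function name `decide` and the `log_tag` field are hypothetical.

```python
THRESHOLD = 0.895  # example decision threshold from the training layer


def decide(risk_score, shadow_mode=False, threshold=THRESHOLD):
    """Turn a model probability into a response, honoring Shadow Mode."""
    is_fraud = risk_score > threshold
    if is_fraud and shadow_mode:
        # Shadow Mode: approve the transaction but record the virtual block.
        return {"status": "APPROVE", "risk_score": risk_score, "log_tag": "SHADOW_BLOCK"}
    status = "BLOCK" if is_fraud else "APPROVE"
    return {"status": status, "risk_score": risk_score, "log_tag": None}


live = decide(0.982)
shadow = decide(0.982, shadow_mode=True)
```

Keeping the score in the response either way means the shadow log and the live log can be compared on the same field.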

4. Deployment Strategy

Shadow Deployment (Dark Launch)

To mitigate risk, new model versions are deployed in "Shadow Mode" before they are allowed to block real transactions.

Deploy: Version v2 is deployed to production.
Listen: It receives 100% of traffic but has no write permissions (cannot decline cards).
Compare: We compare v2's "Virtual Blocks" against the v1 "Actual Blocks" and customer complaints.
Promote: If Precision > 95% and complaints == 0, toggle Shadow Mode OFF.
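The promotion rule can be written as a small gate. How precision is measured is not specified in this document; the sketch assumes v2's virtual blocks are checked against transactions later confirmed as fraud. The function name and inputs are illustrative.

```python
def should_promote(virtual_blocks, confirmed_fraud, complaints, min_precision=0.95):
    """Return True when v2's shadow results meet the promotion criteria.

    virtual_blocks:  set of transaction ids v2 would have blocked
    confirmed_fraud: set of transaction ids later confirmed as fraud (assumed label source)
    complaints:      number of customer complaints during the shadow period
    """
    if not virtual_blocks:
        return False  # nothing to evaluate yet
    precision = len(virtual_blocks & confirmed_fraud) / len(virtual_blocks)
    return precision > min_precision and complaints == 0


blocks = {f"tx{i}" for i in range(20)}
all_correct = should_promote(blocks, confirmed_fraud=blocks, complaints=0)
one_false_positive = should_promote(blocks, confirmed_fraud=blocks - {"tx0"}, complaints=0)
```

With 20 virtual blocks, a single false positive gives a precision of exactly 0.95, which does not pass the strict "greater than 95%" rule.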

5. Technology Stack

| Component | Technology | Rationale |
|---|---|---|
| Language | Python 3.10+ | Standard for ML ecosystem. |
| Package Manager | astral-uv | An extremely fast Python package and project manager. |
| API Framework | FastAPI | Async native, high performance, auto-documentation. |
| Model | XGBoost | Best-in-class for tabular fraud data. |
| Feature Store | Redis | Sub-millisecond latency for sliding windows. |
| Container | Docker | Reproducible environments. |
| Orchestration | Docker Compose | Local development and testing. |
| Tracking | MLflow | Experiment management and artifact versioning. |
| Frontend | Streamlit | Rapid prototyping of analyst dashboards. |