
Usage Guide

Overview

This guide explains, step by step, how to use the Credit Card Anomaly Detection System, from uploading data to reviewing and exporting detected anomalies.

Getting Started

1. Access the Application

Open your web browser and navigate to:

http://localhost:7860

You will see the main dashboard page with:

  • Navigation sidebar
  • KPI cards showing statistics
  • Anomalies table
  • Charts for data visualization

Step-by-Step Workflow

Step 1: Upload Your Data

Using the Upload Page

  1. Click "Upload Data" in the sidebar
  2. Drag and drop your file into the upload area OR click "Browse Files"
  3. Select your data file (CSV, Excel, JSON, or Parquet)
  4. Click the "Upload File" button
  5. Wait for the upload to complete
  6. Click "Go to Dashboard" to proceed

Supported File Formats

  • CSV (.csv) - Recommended
  • Excel (.xlsx, .xls)
  • JSON (.json)
  • Parquet (.parquet)

Required Data Columns

Your data file should contain at least the following columns:

  • Transaction ID - Unique identifier for each transaction
  • User ID - Identifier for the user making the transaction
  • Amount - Transaction amount (numeric)
  • Timestamp - Date and time of the transaction

Optional Columns

  • Merchant Category - Category of the merchant
  • Location - Location where the transaction occurred
  • Other custom fields - Will be preserved but not used for detection

Sample Data Format

Transaction ID,User ID,Amount,Timestamp,Merchant Category,Location
TX001,USER001,150.50,2024-01-15 10:30:00,Grocery,New York
TX002,USER001,89.99,2024-01-15 14:22:00,Restaurant,Los Angeles
TX003,USER002,1250.00,2024-01-16 09:15:00,Electronics,Chicago
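
To check a file's header against the column list before uploading, a minimal sketch using only Python's standard library might look like the following. The helper name `missing_columns` and the exact set of headers treated as required are assumptions made for illustration; they are not part of the application.

```python
import csv
import io

# Assumed required headers, taken from the column list in this guide.
REQUIRED_COLUMNS = ["Transaction ID", "User ID", "Amount", "Timestamp"]

def missing_columns(csv_text, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = [name.strip() for name in next(reader, [])]
    return [name for name in required if name not in header]

sample = (
    "Transaction ID,User ID,Amount,Timestamp,Merchant Category,Location\n"
    "TX001,USER001,150.50,2024-01-15 10:30:00,Grocery,New York\n"
)

# An empty list means every required column is present.
print(missing_columns(sample))
```

For a file on disk, the same check can be run on the text returned by `open(path, newline="").read()`.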

File Size Limit

Maximum file size: 16MB
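
A quick pre-upload check of the extension and size can save a failed upload. The sketch below is illustrative only: the function name `upload_problems` is hypothetical, and it assumes "16MB" means 16 * 1024 * 1024 bytes, which the application may define differently.

```python
import os

# Assumption: the 16MB limit is 16 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 16 * 1024 * 1024
# Extensions listed under "Supported File Formats" in this guide.
ALLOWED_EXTENSIONS = {".csv", ".xlsx", ".xls", ".json", ".parquet"}

def upload_problems(filename, size_bytes):
    """Return a list of reasons the file would likely be rejected (empty if none)."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_UPLOAD_BYTES}")
    return problems

print(upload_problems("transactions.csv", 2_000_000))
print(upload_problems("transactions.txt", 20_000_000))
```

For a real file, `os.path.getsize(path)` supplies the `size_bytes` argument.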


Step 2: Automatic Training and Prediction

After uploading your data:

  • The system automatically trains an Isolation Forest model (the default)
  • It then uses that model to flag anomalies in the uploaded data
  • A summary shows:
    • Number of rows uploaded
    • Number of columns detected
    • Number of anomalies found
    • Model type used
    • Contamination parameter

Step 3: View Results on Dashboard

KPI Cards

The dashboard shows:

  • Total Transactions - Total number of transactions in your dataset
  • Anomalies Detected - Number of suspicious transactions flagged
  • Anomaly Rate - Percentage of transactions flagged as anomalies
  • Unique Users - Number of unique users in the dataset
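
The four KPI values follow directly from the flagged dataset. As a rough sketch of the arithmetic (the record layout, the `is_anomaly` field, and the function name `compute_kpis` are all hypothetical, not the application's own code):

```python
# Hypothetical records: one dict per transaction, with a flag set by the detector.
transactions = [
    {"user_id": "USER001", "amount": 150.50, "is_anomaly": False},
    {"user_id": "USER001", "amount": 89.99, "is_anomaly": False},
    {"user_id": "USER002", "amount": 1250.00, "is_anomaly": True},
    {"user_id": "USER003", "amount": 42.00, "is_anomaly": False},
]

def compute_kpis(rows):
    """Compute the dashboard-style summary numbers for a list of flagged rows."""
    total = len(rows)
    anomalies = sum(1 for row in rows if row["is_anomaly"])
    # Anomaly rate is the flagged share of all transactions, as a percentage.
    rate = (anomalies / total * 100) if total else 0.0
    unique_users = len({row["user_id"] for row in rows})
    return {
        "total_transactions": total,
        "anomalies_detected": anomalies,
        "anomaly_rate_percent": rate,
        "unique_users": unique_users,
    }

kpis = compute_kpis(transactions)
print(kpis)
```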

Anomalies Table

The table displays all detected anomalies with:

  • Transaction ID
  • User ID
  • Amount
  • Timestamp
  • Merchant Category
  • Anomaly Score
  • Explanation of why it was flagged

Filtering Anomalies

  • Filter by User ID - Type a user ID to see their anomalies
  • Filter by Category - Select a merchant category
  • Clear filters using the reset button
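
The same filters can be applied offline to an exported list of anomalies. The sketch below assumes exact matching on both fields and uses made-up rows and field names; how the dashboard itself matches the typed User ID is not specified in this guide.

```python
# Hypothetical anomaly rows, shaped like the columns of the anomalies table.
anomalies = [
    {"transaction_id": "TX003", "user_id": "USER002", "category": "Electronics"},
    {"transaction_id": "TX017", "user_id": "USER001", "category": "Travel"},
    {"transaction_id": "TX042", "user_id": "USER002", "category": "Travel"},
]

def filter_anomalies(rows, user_id=None, category=None):
    """Keep rows matching every filter that is set; None means no filter."""
    return [
        row for row in rows
        if (user_id is None or row["user_id"] == user_id)
        and (category is None or row["category"] == category)
    ]

by_user = filter_anomalies(anomalies, user_id="USER002")
by_both = filter_anomalies(anomalies, user_id="USER002", category="Travel")
print([row["transaction_id"] for row in by_user])
print([row["transaction_id"] for row in by_both])
```

Calling `filter_anomalies(anomalies)` with no arguments corresponds to clearing the filters.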

Step 4: Manual Retraining (Optional)

If you want to try different models:

  1. On the Dashboard, locate the Model Controls section
  2. Select Model Type:
    • Isolation Forest (default) - Good for high-dimensional data
    • LOF (Local Outlier Factor) - Density-based; good for points that are sparse relative to their neighbors
  3. Adjust Contamination parameter (0.01 to 0.5)
    • Lower values = fewer anomalies detected
    • Higher values = more anomalies detected
  4. Click "Train Model"
  5. Wait for training to complete
  6. Click "Predict Anomalies" to detect anomalies with the new model
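
To see how the contamination setting changes the number of flagged transactions, here is a small sketch using scikit-learn's `IsolationForest` on synthetic amounts. It is not the application's training code: the single-feature data, the random seeds, and the helper `count_flagged` are assumptions for illustration, and the application may use different features and preprocessing.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Synthetic amounts: mostly small purchases plus a handful of large ones.
amounts = np.concatenate(
    [rng.normal(80, 25, 500), rng.normal(2000, 300, 10)]
).reshape(-1, 1)

def count_flagged(contamination):
    """Fit an Isolation Forest and count the points it labels as anomalies."""
    model = IsolationForest(contamination=contamination, random_state=42)
    labels = model.fit_predict(amounts)  # -1 marks an anomaly, 1 a normal point
    return int((labels == -1).sum())

counts = {c: count_flagged(c) for c in (0.01, 0.05, 0.2)}
print(counts)
```

Because the trees are identical across the three fits (same data, same `random_state`), only the score threshold moves, so the flagged count can only grow as contamination increases.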

Step 5: Download Anomaly Report

  1. Click the "Download Report" button above the anomalies table
  2. A CSV file will download containing:
    • All detected anomalies
    • Transaction details
    • Anomaly scores
    • Explanations

Step 6: View Analytics

  1. Click "Analytics" in the sidebar
  2. Explore various visualizations:

Transaction Distribution

  • Bar chart showing distribution of transaction amounts
  • View spending patterns across different amount ranges

Category Breakdown

  • Donut chart showing transactions by merchant category
  • See which categories have the most activity

Time Analysis

  • Line chart showing transaction patterns by time of day
  • Identify peak transaction hours
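
The hourly counts behind a chart like this can be reproduced from the Timestamp column. The following is a minimal sketch that assumes the `YYYY-MM-DD HH:MM:SS` format used in the sample data; the last timestamp in the list is made up for illustration, and `transactions_per_hour` is a hypothetical helper.

```python
from collections import Counter
from datetime import datetime

timestamps = [
    "2024-01-15 10:30:00",
    "2024-01-15 14:22:00",
    "2024-01-16 09:15:00",
    "2024-01-16 10:05:00",  # illustrative extra row, not from the sample file
]

def transactions_per_hour(values, fmt="%Y-%m-%d %H:%M:%S"):
    """Count transactions for each hour of the day (0-23), including empty hours."""
    hours = Counter(datetime.strptime(value, fmt).hour for value in values)
    return {hour: hours.get(hour, 0) for hour in range(24)}

per_hour = transactions_per_hour(timestamps)
peak_hour = max(per_hour, key=per_hour.get)
print(peak_hour, per_hour[peak_hour])
```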

User Statistics

  • Horizontal bar chart comparing transaction counts per user
  • See which users have the most activity

Score Distribution

  • Histogram showing distribution of anomaly scores
  • Check whether flagged transactions stand clearly apart or sit close to the bulk of scores

Feature Importance

  • Bar chart showing which features contribute most to anomaly detection
  • Understand what drives anomaly detection

Advanced Features

Auto-Training

The system automatically:

  • Trains an Isolation Forest model immediately after upload
  • Predicts anomalies using the trained model
  • Displays results without manual intervention

Manual Retraining

You can:

  • Change the model type (Isolation Forest or LOF)
  • Adjust contamination parameter
  • Retrain with different settings
  • Compare results

Model Comparison

  • Try both Isolation Forest and LOF
  • Compare anomaly detection results
  • Choose the model that works best for your data
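
One way to compare the two models is to measure how much their flagged sets overlap. The sketch below uses scikit-learn's `IsolationForest` and `LocalOutlierFactor` on synthetic, unscaled two-feature data; the data, the seeds, `n_neighbors=20`, and the use of Jaccard overlap as the comparison metric are all assumptions for illustration, not the application's own comparison logic.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(1)
# Synthetic features: column 0 is an amount, column 1 is an hour of day.
normal = np.column_stack([rng.normal(80, 25, 400), rng.normal(14, 3, 400)])
unusual = np.column_stack([rng.normal(1500, 200, 12), rng.normal(3, 1, 12)])
X = np.vstack([normal, unusual])

contamination = 0.05
iso_labels = IsolationForest(
    contamination=contamination, random_state=42
).fit_predict(X)
lof_labels = LocalOutlierFactor(
    n_neighbors=20, contamination=contamination
).fit_predict(X)

# Row indices each model labeled -1 (anomaly).
iso_flagged = set(np.flatnonzero(iso_labels == -1).tolist())
lof_flagged = set(np.flatnonzero(lof_labels == -1).tolist())

both = iso_flagged & lof_flagged
either = iso_flagged | lof_flagged
# Jaccard overlap: 1.0 means identical flagged sets, 0.0 means no rows in common.
jaccard = len(both) / len(either) if either else 1.0
print(len(iso_flagged), len(lof_flagged), round(jaccard, 2))
```

Rows flagged by both models are usually the strongest candidates for review; rows flagged by only one show where the models disagree.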

Tips for Best Results

Data Quality

  • Ensure your data has consistent formatting
  • Remove duplicate transactions
  • Handle missing values appropriately
  • Use consistent date/time formats
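
These checks can be scripted before upload. The sketch below removes duplicate Transaction IDs, drops rows with a missing Amount, and rewrites timestamps into one format. Dropping incomplete rows is only one possible policy, and the second date format, the sample rows, and the helper names are assumptions for illustration.

```python
from datetime import datetime

rows = [
    {"Transaction ID": "TX001", "Amount": "150.50", "Timestamp": "2024-01-15 10:30:00"},
    {"Transaction ID": "TX001", "Amount": "150.50", "Timestamp": "2024-01-15 10:30:00"},  # duplicate
    {"Transaction ID": "TX002", "Amount": "", "Timestamp": "2024-01-15 14:22:00"},  # missing amount
    {"Transaction ID": "TX003", "Amount": "1250.00", "Timestamp": "16/01/2024 09:15"},  # other format
]

# Formats this sketch knows how to read; extend the tuple for other sources.
KNOWN_FORMATS = ("%Y-%m-%d %H:%M:%S", "%d/%m/%Y %H:%M")

def parse_timestamp(text):
    """Try each known format in turn and return the first successful parse."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognized timestamp: {text!r}")

def clean(records):
    """Drop duplicates and rows without an amount; normalize the timestamp format."""
    seen = set()
    cleaned_rows = []
    for row in records:
        tx_id = row["Transaction ID"]
        if tx_id in seen or not row["Amount"].strip():
            continue
        seen.add(tx_id)
        cleaned_rows.append({
            "Transaction ID": tx_id,
            "Amount": float(row["Amount"]),
            "Timestamp": parse_timestamp(row["Timestamp"]).strftime("%Y-%m-%d %H:%M:%S"),
        })
    return cleaned_rows

cleaned = clean(rows)
print(cleaned)
```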

Parameter Tuning

  • Start with default parameters
  • Adjust contamination based on your expected anomaly rate
  • Lower contamination (0.01-0.05) flags only the most extreme transactions (stricter threshold, fewer alerts)
  • Higher contamination (0.1-0.2) also flags borderline transactions (more lenient threshold, more alerts)

Model Selection

  • Isolation Forest - Best for general-purpose anomaly detection
  • LOF - Best when anomalies lie in regions that are sparse relative to their neighbors

Common Workflows

Workflow 1: Quick Anomaly Check

  1. Upload your data
  2. Wait for auto-training to complete
  3. View anomalies in the table
  4. Download report if needed

Workflow 2: Detailed Analysis

  1. Upload your data
  2. Let auto-training complete
  3. Review KPI cards and anomalies table
  4. Navigate to Analytics page
  5. Explore all charts and visualizations
  6. Download anomaly report

Workflow 3: Model Experimentation

  1. Upload your data
  2. Note default results (Isolation Forest)
  3. Change model to LOF
  4. Retrain and predict
  5. Compare results
  6. Adjust contamination parameter
  7. Retrain and predict again
  8. Choose best configuration

Troubleshooting

Issue: No anomalies detected

Solution: Increase the contamination parameter and retrain

Issue: Too many anomalies detected

Solution: Decrease the contamination parameter and retrain

Issue: Upload failed

Solution: Ensure the file is in a supported format (CSV, Excel, JSON, or Parquet) and is under 16MB

Issue: Charts not displaying

Solution: Refresh the page and check your internet connection (Chart.js loads from a CDN)

Issue: Model training takes too long

Solution: This is normal for large datasets. Wait for completion.


Next Steps

For more technical details, refer to: