Usage Guide
Overview
This guide provides step-by-step instructions on how to use the Credit Card Anomaly Detection System.
Getting Started
1. Access the Application
Open your web browser and navigate to:
http://localhost:7860
You will see the main dashboard page with:
- Navigation sidebar
- KPI cards showing statistics
- Anomalies table
- Charts for data visualization
Step-by-Step Workflow
Step 1: Upload Your Data
Option A: Upload Page
- Click "Upload Data" in the sidebar
- Drag and drop your file into the upload area OR click "Browse Files"
- Select your data file (CSV, Excel, JSON, or Parquet)
- Click "Upload File" button
- Wait for upload completion
- Click "Go to Dashboard" to proceed
Supported File Formats
- CSV (.csv) - Recommended
- Excel (.xlsx, .xls)
- JSON (.json)
- Parquet (.parquet)
Required Data Columns
Your data file must contain at least:
- Transaction ID - Unique identifier for each transaction
- User ID - Identifier for the user making the transaction
- Amount - Transaction amount (numeric)
- Timestamp - Date and time of the transaction
Optional Columns
- Merchant Category - Category of the merchant
- Location - Location where the transaction occurred
- Other custom fields - Preserved but not used for detection
Sample Data Format
Transaction ID,User ID,Amount,Timestamp,Merchant Category,Location
TX001,USER001,150.50,2024-01-15 10:30:00,Grocery,New York
TX002,USER001,89.99,2024-01-15 14:22:00,Restaurant,Los Angeles
TX003,USER002,1250.00,2024-01-16 09:15:00,Electronics,Chicago
File Size Limit
Maximum file size: 16MB
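Before uploading, it can save a round trip to check a file locally against the rules above. The following is a minimal sketch, not the application's own validation code: the function name check_csv_upload is hypothetical, "16MB" is assumed to mean 16 x 1024 x 1024 bytes, column names are assumed to match the sample header exactly, and Merchant Category is treated as optional, following the "(optional)" note in the column list.

```python
import csv
import io
import os

# Assumptions taken from this guide, not from the application's source code.
REQUIRED_COLUMNS = ["Transaction ID", "User ID", "Amount", "Timestamp"]
ALLOWED_EXTENSIONS = {".csv", ".xlsx", ".xls", ".json", ".parquet"}
MAX_BYTES = 16 * 1024 * 1024  # assumes "16MB" means 16 MiB


def check_csv_upload(filename, content):
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if len(content) > MAX_BYTES:
        problems.append(f"file is {len(content)} bytes, limit is {MAX_BYTES}")
    if ext == ".csv":
        # Only the header row is inspected; other formats are not parsed here.
        reader = csv.reader(io.StringIO(content.decode("utf-8", errors="replace")))
        header = [name.strip() for name in next(reader, [])]
        missing = [col for col in REQUIRED_COLUMNS if col not in header]
        if missing:
            problems.append("missing columns: " + ", ".join(missing))
    return problems


sample = (
    b"Transaction ID,User ID,Amount,Timestamp,Merchant Category,Location\n"
    b"TX001,USER001,150.50,2024-01-15 10:30:00,Grocery,New York\n"
)
print(check_csv_upload("transactions.csv", sample))  # prints []
```

An empty list means none of the checks failed. It does not guarantee that the upload will succeed, since the server may apply further checks.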
Step 2: Automatic Training and Prediction
After uploading your data:
- The system automatically trains an Isolation Forest model (the default)
- The system then predicts anomalies with the trained model
- You will see a summary showing:
  - Number of rows uploaded
  - Number of columns detected
  - Number of anomalies found
  - Model type used
  - Contamination parameter
Step 3: View Results on Dashboard
KPI Cards
The dashboard shows:
- Total Transactions - Total number of transactions in your dataset
- Anomalies Detected - Number of suspicious transactions flagged
- Anomaly Rate - Percentage of transactions flagged as anomalies
- Unique Users - Number of unique users in the dataset
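The four KPI values can be reproduced from any list of transactions that carries an anomaly flag. This is a sketch of the arithmetic only: the function name kpi_summary and the field names user_id and is_anomaly are hypothetical, and the dashboard may round or format the values differently.

```python
def kpi_summary(transactions):
    """Compute the four dashboard KPIs from a list of transaction dicts."""
    total = len(transactions)
    anomalies = sum(1 for tx in transactions if tx["is_anomaly"])
    return {
        "total_transactions": total,
        "anomalies_detected": anomalies,
        # Anomaly rate is the flagged share of all transactions, as a percentage.
        "anomaly_rate_pct": round(100.0 * anomalies / total, 2) if total else 0.0,
        "unique_users": len({tx["user_id"] for tx in transactions}),
    }


# Toy input: four transactions from three users, one of them flagged.
rows = [
    {"user_id": "USER001", "is_anomaly": False},
    {"user_id": "USER001", "is_anomaly": False},
    {"user_id": "USER002", "is_anomaly": True},
    {"user_id": "USER003", "is_anomaly": False},
]
print(kpi_summary(rows))
```

With one flagged transaction out of four, the anomaly rate comes out at 25%.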
Anomalies Table
The table displays all detected anomalies with:
- Transaction ID
- User ID
- Amount
- Timestamp
- Merchant Category
- Anomaly Score
- Explanation of why it was flagged
Filtering Anomalies
- Filter by User ID - Type a user ID to see their anomalies
- Filter by Category - Select a merchant category
- Clear filters using the reset button
Step 4: Manual Retraining (Optional)
If you want to try different models:
- On the Dashboard, locate the Model Controls section
- Select Model Type:
  - Isolation Forest (default) - Good for high-dimensional data
  - LOF (Local Outlier Factor) - Good for density-based detection
- Adjust the Contamination parameter (0.01 to 0.5), the expected proportion of anomalies in the data:
  - Lower values = fewer anomalies detected
  - Higher values = more anomalies detected
- Click "Train Model"
- Wait for training to complete
- Click "Predict Anomalies" to detect anomalies with the new model
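To see why the Contamination setting changes the number of flagged transactions, it helps to treat it as the share of the dataset that gets flagged. The sketch below is a simplified, library-free illustration rather than the application's code: the function name flag_by_contamination is hypothetical, and it assumes that a higher score means "more anomalous". In libraries such as scikit-learn, contamination sets the score threshold so that roughly this share of the training data is flagged. Whether the application follows the same rule is an assumption.

```python
def flag_by_contamination(scores, contamination):
    """Return indices of the points flagged at the given contamination level.

    Assumes a higher score means "more anomalous"; the application's own
    score convention may differ.
    """
    if not 0.01 <= contamination <= 0.5:
        raise ValueError("contamination must be between 0.01 and 0.5")
    n_flag = round(contamination * len(scores))
    # Rank point indices from most to least anomalous and keep the top share.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:n_flag]


scores = [float(i) for i in range(20)]  # toy scores: point 19 is the most unusual
for level in (0.05, 0.10, 0.50):
    print(level, len(flag_by_contamination(scores, level)))
```

The model and its scores stay the same across the three runs. Only the cut-off moves, which is why raising the parameter flags more transactions.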
Step 5: Download Anomaly Report
- Click "Download Report" button above the anomalies table
- A CSV file will download containing:
  - All detected anomalies
  - Transaction details
  - Anomaly scores
  - Explanations
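Once downloaded, the report can be processed outside the application. The sketch below assumes the report's header matches the columns of the Anomalies Table; the exact column names in the real file may differ. The function name load_report is hypothetical, and the scores and explanations in the sample text are invented purely for illustration.

```python
import csv
import io


def load_report(text, user_id=None, category=None):
    """Parse report CSV text and keep rows matching the optional filters."""
    rows = list(csv.DictReader(io.StringIO(text)))
    if user_id is not None:
        rows = [r for r in rows if r.get("User ID") == user_id]
    if category is not None:
        rows = [r for r in rows if r.get("Merchant Category") == category]
    return rows


# Illustrative report content; the scores and explanations are made up.
report_text = """Transaction ID,User ID,Amount,Timestamp,Merchant Category,Anomaly Score,Explanation
TX003,USER002,1250.00,2024-01-16 09:15:00,Electronics,0.71,Unusually high amount
TX010,USER001,980.00,2024-01-17 03:05:00,Travel,0.64,Unusual hour
"""
for row in load_report(report_text, user_id="USER002"):
    print(row["Transaction ID"], row["Amount"], row["Explanation"])
```

To work with a real download, read the file with open(path, newline="", encoding="utf-8") and pass its contents to the same function.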
Step 6: View Analytics
- Click "Analytics" in the sidebar
- Explore various visualizations:
Transaction Distribution
- Bar chart showing distribution of transaction amounts
- View spending patterns across different amount ranges
Category Breakdown
- Donut chart showing transactions by merchant category
- See which categories have the most activity
Time Analysis
- Line chart showing transaction patterns by time of day
- Identify peak transaction hours
User Statistics
- Horizontal bar chart comparing transaction counts per user
- See which users have the most activity
Score Distribution
- Histogram showing distribution of anomaly scores
- Understand the overall anomaly score distribution
Feature Importance
- Bar chart showing which features contribute most to anomaly detection
- Understand what drives anomaly detection
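The Time Analysis chart described above is a count of transactions per hour of the day. A minimal sketch of that aggregation follows, assuming timestamps use the format shown in the sample data; the function name transactions_per_hour is hypothetical and the application may bucket time differently.

```python
from collections import Counter
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # format used in the sample data


def transactions_per_hour(timestamps):
    """Count transactions for each hour of the day (0-23)."""
    counts = Counter(
        datetime.strptime(ts, TIMESTAMP_FORMAT).hour for ts in timestamps
    )
    return [counts.get(hour, 0) for hour in range(24)]


# Timestamps from the sample data format section.
sample = ["2024-01-15 10:30:00", "2024-01-15 14:22:00", "2024-01-16 09:15:00"]
hourly = transactions_per_hour(sample)
# On ties, max() returns the earliest hour with the highest count.
peak_hour = max(range(24), key=lambda h: hourly[h])
print(peak_hour, hourly[peak_hour])
```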
Advanced Features
Auto-Training
The system automatically:
- Trains Isolation Forest model immediately after upload
- Predicts anomalies using the trained model
- Displays results without manual intervention
Manual Retraining
You can:
- Change the model type (Isolation Forest or LOF)
- Adjust contamination parameter
- Retrain with different settings
- Compare results
Model Comparison
- Try both Isolation Forest and LOF
- Compare anomaly detection results
- Choose the model that works best for your data
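One simple way to compare two runs is to look at which transactions both models flagged and which only one of them flagged. The sketch below works on lists of Transaction IDs, for example taken from two downloaded reports. The function name compare_flagged and the IDs used are hypothetical.

```python
def compare_flagged(ids_a, ids_b):
    """Summarize agreement between two sets of flagged Transaction IDs."""
    a, b = set(ids_a), set(ids_b)
    both, union = a & b, a | b
    return {
        "both": sorted(both),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
        # Jaccard overlap: 1.0 means identical sets, 0.0 means no agreement.
        "jaccard": len(both) / len(union) if union else 1.0,
    }


# Made-up IDs standing in for the anomalies flagged by each model.
isolation_forest_ids = ["TX003", "TX010", "TX042"]
lof_ids = ["TX003", "TX042", "TX077"]
print(compare_flagged(isolation_forest_ids, lof_ids))
```

Transactions flagged by both models are often the strongest candidates for review, while the model-specific ones show where the two methods disagree.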
Tips for Best Results
Data Quality
- Ensure your data has consistent formatting
- Remove duplicate transactions
- Handle missing values appropriately
- Use consistent date/time formats
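These checks can be applied before upload. The sketch below is one possible cleaning pass, not a required step: it drops duplicate Transaction IDs, drops rows whose Amount or Timestamp is missing or unparseable, and assumes the timestamp format from the sample data. The function name clean_rows is hypothetical, and dropping rows is only one way of handling missing values; depending on your data, another strategy may suit better.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # format used in the sample data


def clean_rows(rows):
    """Keep the first copy of each transaction that has a valid amount and timestamp."""
    seen = set()
    cleaned = []
    for row in rows:
        tx_id = (row.get("Transaction ID") or "").strip()
        if not tx_id or tx_id in seen:
            continue  # missing ID or duplicate transaction
        try:
            amount = float(row["Amount"])
            parsed = datetime.strptime(row["Timestamp"].strip(), TIMESTAMP_FORMAT)
        except (KeyError, TypeError, ValueError, AttributeError):
            continue  # missing or malformed amount or timestamp
        seen.add(tx_id)
        cleaned.append({
            **row,
            "Transaction ID": tx_id,
            "Amount": amount,
            "Timestamp": parsed.strftime(TIMESTAMP_FORMAT),
        })
    return cleaned


raw = [
    {"Transaction ID": "TX001", "Amount": "150.50", "Timestamp": "2024-01-15 10:30:00"},
    {"Transaction ID": "TX001", "Amount": "150.50", "Timestamp": "2024-01-15 10:30:00"},
    {"Transaction ID": "TX002", "Amount": "", "Timestamp": "2024-01-15 14:22:00"},
    {"Transaction ID": "TX003", "Amount": "1250.00", "Timestamp": "16/01/2024 09:15"},
]
print([row["Transaction ID"] for row in clean_rows(raw)])  # prints ['TX001']
```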
Parameter Tuning
- Start with default parameters
- Adjust contamination based on your expected anomaly rate
- Lower contamination (0.01-0.05) flags only the most extreme transactions
- Higher contamination (0.1-0.2) flags more transactions, including borderline cases
Model Selection
- Isolation Forest - Best for general-purpose anomaly detection
- LOF - Best when anomalies are isolated relative to their nearest neighbors, even if they are not extreme globally
Common Workflows
Workflow 1: Quick Anomaly Check
- Upload your data
- Wait for auto-training to complete
- View anomalies in the table
- Download report if needed
Workflow 2: Detailed Analysis
- Upload your data
- Let auto-training complete
- Review KPI cards and anomalies table
- Navigate to Analytics page
- Explore all charts and visualizations
- Download anomaly report
Workflow 3: Model Experimentation
- Upload your data
- Note default results (Isolation Forest)
- Change model to LOF
- Retrain and predict
- Compare results
- Adjust contamination parameter
- Retrain and predict again
- Choose best configuration
Troubleshooting
Issue: No anomalies detected
Solution: Increase contamination parameter and retrain
Issue: Too many anomalies detected
Solution: Decrease contamination parameter and retrain
Issue: Upload failed
Solution: Ensure the file is in a supported format (CSV, Excel, JSON, or Parquet) and is no larger than 16MB
Issue: Charts not displaying
Solution: Refresh the page and check internet connection (Chart.js loads from CDN)
Issue: Model training takes too long
Solution: This is normal for large datasets. Wait for completion.
Next Steps
For more technical details, refer to: