# Usage Guide

## Overview

This guide provides step-by-step instructions for using the Credit Card Anomaly Detection System.

## Getting Started

### 1. Access the Application

Open your web browser and navigate to:

```
http://localhost:7860
```

You will see the main dashboard page with:

- Navigation sidebar
- KPI cards showing statistics
- Anomalies table
- Charts for data visualization

---

## Step-by-Step Workflow

### Step 1: Upload Your Data

#### Option A: Upload Page

1. Click **"Upload Data"** in the sidebar
2. Drag and drop your file into the upload area, or click **"Browse Files"**
3. Select your data file (CSV, Excel, JSON, or Parquet)
4. Click the **"Upload File"** button
5. Wait for the upload to complete
6. Click **"Go to Dashboard"** to proceed

#### Supported File Formats

- **CSV** (.csv) - Recommended
- **Excel** (.xlsx, .xls)
- **JSON** (.json)
- **Parquet** (.parquet)

#### Required Data Columns

Your data file should contain at least:

- **Transaction ID** - Unique identifier for each transaction
- **User ID** - Identifier for the user making the transaction
- **Amount** - Transaction amount (numeric)
- **Timestamp** - Date and time of the transaction

#### Optional Columns

- **Merchant Category** - Category of the merchant
- **Location** - Location where the transaction occurred
- **Other custom fields** - Will be preserved but not used for detection

#### Sample Data Format

```csv
Transaction ID,User ID,Amount,Timestamp,Merchant Category,Location
TX001,USER001,150.50,2024-01-15 10:30:00,Grocery,New York
TX002,USER001,89.99,2024-01-15 14:22:00,Restaurant,Los Angeles
TX003,USER002,1250.00,2024-01-16 09:15:00,Electronics,Chicago
```

#### File Size Limit

Maximum file size: 16MB

---

### Step 2: Automatic Training and Prediction

After uploading your data:

- The system **automatically trains** using Isolation Forest (default model)
- The system **automatically predicts** anomalies
- You will see:
  - Number of rows uploaded
  - Number of columns detected
  - Number of anomalies found
  - Model type used
  - Contamination parameter

---

### Step 3: View Results on Dashboard

#### KPI Cards

The dashboard shows:

- **Total Transactions** - Total number of transactions in your dataset
- **Anomalies Detected** - Number of suspicious transactions flagged
- **Anomaly Rate** - Percentage of transactions flagged as anomalies
- **Unique Users** - Number of unique users in the dataset

#### Anomalies Table

The table displays all detected anomalies with:

- Transaction ID
- User ID
- Amount
- Timestamp
- Merchant Category
- Anomaly Score
- Explanation of why it was flagged

#### Filtering Anomalies

- **Filter by User ID** - Type a user ID to see that user's anomalies
- **Filter by Category** - Select a merchant category
- Clear filters using the reset button

---

### Step 4: Manual Retraining (Optional)

If you want to try different models:

1. On the Dashboard, locate the **Model Controls** section
2. Select **Model Type**:
   - **Isolation Forest** (default) - Good for high-dimensional data
   - **LOF (Local Outlier Factor)** - Good for density-based detection
3. Adjust the **Contamination** parameter (0.01 to 0.5)
   - Lower values = fewer anomalies detected
   - Higher values = more anomalies detected
4. Click **"Train Model"**
5. Wait for training to complete
6. Click **"Predict Anomalies"** to detect anomalies with the new model

---

### Step 5: Download Anomaly Report

1. Click the **"Download Report"** button above the anomalies table
2. A CSV file will download containing:
   - All detected anomalies
   - Transaction details
   - Anomaly scores
   - Explanations

---

### Step 6: View Analytics

1. Click **"Analytics"** in the sidebar
2. Explore the visualizations described below

#### Transaction Distribution

- Bar chart showing the distribution of transaction amounts
- View spending patterns across different amount ranges

#### Category Breakdown

- Donut chart showing transactions by merchant category
- See which categories have the most activity

#### Time Analysis

- Line chart showing transaction patterns by time of day
- Identify peak transaction hours

#### User Statistics

- Horizontal bar chart comparing transaction counts per user
- See which users have the most activity

#### Score Distribution

- Histogram showing the distribution of anomaly scores
- See how scores are spread across the whole dataset

#### Feature Importance

- Bar chart showing which features contribute most to anomaly detection
- Understand what drives the flagged results

---

## Advanced Features

### Auto-Training

The system automatically:

- Trains an Isolation Forest model immediately after upload
- Predicts anomalies using the trained model
- Displays results without manual intervention

### Manual Retraining

You can:

- Change the model type (Isolation Forest or LOF)
- Adjust the contamination parameter
- Retrain with different settings
- Compare results

### Model Comparison

- Try both Isolation Forest and LOF
- Compare anomaly detection results
- Choose the model that works best for your data

---

## Tips for Best Results

### Data Quality

- Ensure your data has consistent formatting
- Remove duplicate transactions
- Handle missing values appropriately
- Use consistent date/time formats

### Parameter Tuning

- Start with default parameters
- Adjust contamination based on your expected anomaly rate
- Lower contamination (0.01-0.05) for very strict detection
- Higher contamination (0.1-0.2) for more lenient detection

### Model Selection

- **Isolation Forest** - Best for general-purpose anomaly detection
- **LOF** - Best when anomalies are locally sparse

---

## Common Workflows

### Workflow 1: Quick Anomaly Check

1. Upload your data
2. Wait for auto-training to complete
3. View anomalies in the table
4. Download the report if needed

### Workflow 2: Detailed Analysis

1. Upload your data
2. Let auto-training complete
3. Review the KPI cards and anomalies table
4. Navigate to the Analytics page
5. Explore all charts and visualizations
6. Download the anomaly report

### Workflow 3: Model Experimentation

1. Upload your data
2. Note the default results (Isolation Forest)
3. Change the model to LOF
4. Retrain and predict
5. Compare results
6. Adjust the contamination parameter
7. Retrain and predict again
8. Choose the best configuration

---

## Troubleshooting

### Issue: No anomalies detected

**Solution**: Increase the contamination parameter and retrain

### Issue: Too many anomalies detected

**Solution**: Decrease the contamination parameter and retrain

### Issue: Upload failed

**Solution**: Ensure the file format is supported and the file size is under 16MB

### Issue: Charts not displaying

**Solution**: Refresh the page and check your internet connection (Chart.js loads from a CDN)

### Issue: Model training takes too long

**Solution**: This is normal for large datasets. Wait for training to complete.

---

## Next Steps

For more technical details, refer to:

- [Technical Documentation](TECHNICAL_GUIDE.md)
- [API Reference](API_REFERENCE.md)
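
---

## Appendix: Pre-Upload Check (Sketch)

The column requirements, file size limit, and data-quality tips in this guide can be checked locally before a CSV file is uploaded. The following is a minimal sketch using only the Python standard library, not part of the application itself. The function name `check_csv_text` and both constants are hypothetical, the column names are taken from the sample data format above, and the 16MB limit is assumed to mean 16 * 1024 * 1024 bytes.

```python
import csv
import io

# Column names as they appear in the sample data format of this guide.
REQUIRED_COLUMNS = ["Transaction ID", "User ID", "Amount", "Timestamp"]
# Assumption: the 16MB upload limit means 16 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 16 * 1024 * 1024


def check_csv_text(text):
    """Return a list of problems found in CSV text (empty list if none)."""
    problems = []
    if len(text.encode("utf-8")) > MAX_UPLOAD_BYTES:
        problems.append("file is larger than 16MB")

    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        problems.append("missing columns: " + ", ".join(missing))
        return problems  # the row checks below need the required columns

    seen_ids = set()
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            float(row["Amount"])
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: Amount is not numeric")
        tx_id = row["Transaction ID"]
        if tx_id in seen_ids:
            problems.append(f"line {line_no}: duplicate Transaction ID {tx_id}")
        seen_ids.add(tx_id)
    return problems


if __name__ == "__main__":
    sample = (
        "Transaction ID,User ID,Amount,Timestamp,Merchant Category,Location\n"
        "TX001,USER001,150.50,2024-01-15 10:30:00,Grocery,New York\n"
    )
    print(check_csv_text(sample))  # prints []
```

An empty list means the text passed these particular checks; it does not guarantee that the upload will succeed, since the application may apply checks of its own. To check a file on disk, read it with `open(path, newline="", encoding="utf-8").read()` and pass the result to `check_csv_text`.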