| # Usage Guide | |
| ## Overview | |
| This guide provides step-by-step instructions on how to use the Credit Card Anomaly Detection System. | |
| ## Getting Started | |
| ### 1. Access the Application | |
| Open your web browser and navigate to: | |
| ``` | |
| http://localhost:7860 | |
| ``` | |
| You will see the main dashboard page with: | |
| - Navigation sidebar | |
| - KPI cards showing statistics | |
| - Anomalies table | |
| - Charts for data visualization | |
| --- | |
| ## Step-by-Step Workflow | |
| ### Step 1: Upload Your Data | |
| #### Using the Upload Page | |
| 1. Click **"Upload Data"** in the sidebar | |
| 2. Drag and drop your file into the upload area OR click **"Browse Files"** | |
| 3. Select your data file (CSV, Excel, JSON, or Parquet) | |
| 4. Click the **"Upload File"** button | |
| 5. Wait for upload completion | |
| 6. Click **"Go to Dashboard"** to proceed | |
| #### Supported File Formats | |
| - **CSV** (.csv) - Recommended | |
| - **Excel** (.xlsx, .xls) | |
| - **JSON** (.json) | |
| - **Parquet** (.parquet) | |
| #### Required Data Columns | |
| Your data file should contain at least: | |
| - **Transaction ID** - Unique identifier for each transaction | |
| - **User ID** - Identifier for the user making the transaction | |
| - **Amount** - Transaction amount (numeric) | |
| - **Timestamp** - Date and time of the transaction | |
| #### Optional Columns | |
| - **Merchant Category** - Category of the merchant | |
| - **Location** - Location where the transaction occurred | |
| - **Other custom fields** - Will be preserved but not used for detection | |
| #### Sample Data Format | |
| ```csv | |
| Transaction ID,User ID,Amount,Timestamp,Merchant Category,Location | |
| TX001,USER001,150.50,2024-01-15 10:30:00,Grocery,New York | |
| TX002,USER001,89.99,2024-01-15 14:22:00,Restaurant,Los Angeles | |
| TX003,USER002,1250.00,2024-01-16 09:15:00,Electronics,Chicago | |
| ``` | |
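Before uploading, it can help to confirm that the header row contains the columns the guide lists as required. The following is a minimal sketch using only the standard library; the function name `missing_columns` is hypothetical, and the check assumes the header names match the sample above exactly. Merchant Category is left out because the guide marks it as optional.

```python
import csv
import io

# Required column names, copied from the sample header in this guide.
REQUIRED_COLUMNS = ["Transaction ID", "User ID", "Amount", "Timestamp"]


def missing_columns(csv_text: str) -> list:
    """Return the required columns that are absent from the CSV header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # empty input yields an empty header
    present = {name.strip() for name in header}
    return [name for name in REQUIRED_COLUMNS if name not in present]


sample = (
    "Transaction ID,User ID,Amount,Timestamp,Merchant Category,Location\n"
    "TX001,USER001,150.50,2024-01-15 10:30:00,Grocery,New York\n"
)
print(missing_columns(sample))                           # []
print(missing_columns("User ID,Amount\nUSER001,10\n"))   # ['Transaction ID', 'Timestamp']
```

An empty list means the header has everything the guide asks for. How the application itself validates columns is not documented here, so treat this as a pre-check only.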
| #### File Size Limit | |
| Maximum file size: 16MB | |
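The format and size rules above can be checked locally before an upload is attempted. This sketch assumes that 16MB means 16 * 1024 * 1024 bytes and that the application decides by file extension; neither detail is confirmed by this guide, and `upload_problems` is a hypothetical helper.

```python
from pathlib import Path

# Extensions and limit as listed in this guide.
# Assumption: "16MB" is interpreted as 16 * 1024 * 1024 bytes.
ALLOWED_EXTENSIONS = {".csv", ".xlsx", ".xls", ".json", ".parquet"}
MAX_BYTES = 16 * 1024 * 1024


def upload_problems(filename: str, size_bytes: int) -> list:
    """Return the reasons a file would likely be rejected (empty if none)."""
    problems = []
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("unsupported file format")
    if size_bytes > MAX_BYTES:
        problems.append("file larger than 16MB")
    return problems


print(upload_problems("transactions.csv", 2_000_000))
# []
print(upload_problems("transactions.txt", 20_000_000))
# ['unsupported file format', 'file larger than 16MB']
```

For a file on disk, `Path(path).stat().st_size` gives the size in bytes to pass as `size_bytes`.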
| --- | |
| ### Step 2: Automatic Training and Prediction | |
| After uploading your data: | |
| - The system **automatically trains** using Isolation Forest (default model) | |
| - The system **automatically predicts** anomalies | |
| - You will see: | |
|   - Number of rows uploaded | |
|   - Number of columns detected | |
|   - Number of anomalies found | |
|   - Model type used | |
|   - Contamination parameter | |
| --- | |
| ### Step 3: View Results on Dashboard | |
| #### KPI Cards | |
| The dashboard shows: | |
| - **Total Transactions** - Total number of transactions in your dataset | |
| - **Anomalies Detected** - Number of suspicious transactions flagged | |
| - **Anomaly Rate** - Percentage of transactions flagged as anomalies | |
| - **Unique Users** - Number of unique users in the dataset | |
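The four KPI values are simple aggregates over the scored transactions. The sketch below shows one way they could be computed; the `is_anomaly` field and the function name `dashboard_kpis` are assumptions for illustration, and the rows are made-up sample data.

```python
def dashboard_kpis(rows: list) -> dict:
    """Compute the four dashboard KPIs from scored transaction rows."""
    total = len(rows)
    anomalies = sum(1 for row in rows if row["is_anomaly"])
    # Anomaly rate as a percentage; guard against an empty dataset.
    rate = round(100 * anomalies / total, 2) if total else 0.0
    return {
        "total_transactions": total,
        "anomalies_detected": anomalies,
        "anomaly_rate_pct": rate,
        "unique_users": len({row["User ID"] for row in rows}),
    }


# Illustrative rows; "is_anomaly" is a hypothetical field name.
rows = [
    {"Transaction ID": "TX001", "User ID": "USER001", "is_anomaly": False},
    {"Transaction ID": "TX002", "User ID": "USER001", "is_anomaly": False},
    {"Transaction ID": "TX003", "User ID": "USER002", "is_anomaly": True},
    {"Transaction ID": "TX004", "User ID": "USER003", "is_anomaly": False},
]
kpis = dashboard_kpis(rows)
print(kpis)
# {'total_transactions': 4, 'anomalies_detected': 1, 'anomaly_rate_pct': 25.0, 'unique_users': 3}
```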
| #### Anomalies Table | |
| The table displays all detected anomalies with: | |
| - Transaction ID | |
| - User ID | |
| - Amount | |
| - Timestamp | |
| - Merchant Category | |
| - Anomaly Score | |
| - Explanation of why it was flagged | |
| #### Filtering Anomalies | |
| - **Filter by User ID** - Type a user ID to see their anomalies | |
| - **Filter by Category** - Select a merchant category | |
| - Clear filters using the reset button | |
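The same filtering can be reproduced on an exported list of anomalies. This sketch assumes exact matching on both fields, which may differ from how the dashboard matches typed input; `filter_anomalies` is a hypothetical helper and the rows are illustrative.

```python
from typing import Optional


def filter_anomalies(anomalies: list,
                     user_id: Optional[str] = None,
                     category: Optional[str] = None) -> list:
    """Keep rows matching the given user ID and/or merchant category.

    A filter left as None is not applied, which mirrors clearing it.
    """
    result = []
    for row in anomalies:
        if user_id and row["User ID"] != user_id:
            continue
        if category and row["Merchant Category"] != category:
            continue
        result.append(row)
    return result


# Illustrative anomaly rows.
anomalies = [
    {"Transaction ID": "TX003", "User ID": "USER002", "Merchant Category": "Electronics"},
    {"Transaction ID": "TX005", "User ID": "USER001", "Merchant Category": "Electronics"},
    {"Transaction ID": "TX008", "User ID": "USER001", "Merchant Category": "Grocery"},
]
by_user = filter_anomalies(anomalies, user_id="USER001")
print([row["Transaction ID"] for row in by_user])  # ['TX005', 'TX008']
```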
| --- | |
| ### Step 4: Manual Retraining (Optional) | |
| If you want to try different models: | |
| 1. On the Dashboard, locate the **Model Controls** section | |
| 2. Select **Model Type**: | |
|    - **Isolation Forest** (default) - Good for high-dimensional data | |
|    - **LOF (Local Outlier Factor)** - Good for density-based detection | |
| 3. Adjust the **Contamination** parameter (0.01 to 0.5) | |
|    - Lower values = fewer anomalies detected | |
|    - Higher values = more anomalies detected | |
| 4. Click **"Train Model"** | |
| 5. Wait for training to complete | |
| 6. Click **"Predict Anomalies"** to detect anomalies with the new model | |
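To see why the contamination value changes the number of flagged transactions, it helps to treat it as the share of records to flag. The sketch below assumes the common convention (used, for example, by scikit-learn detectors) that contamination sets a cutoff so roughly that fraction of the data is flagged, and that a higher score means more anomalous. Both are assumptions about this application, and the scores are made up.

```python
import math


def flag_top_fraction(scores: dict, contamination: float) -> list:
    """Flag the `contamination` share of transactions with the highest scores."""
    if not 0 < contamination <= 0.5:
        raise ValueError("contamination must be in (0, 0.5]")
    n_flagged = math.ceil(contamination * len(scores))
    # Rank transaction IDs from most to least anomalous.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n_flagged]


# Illustrative scores; assumption: higher means more anomalous.
scores = {
    "TX001": 0.12, "TX002": 0.08, "TX003": 0.91, "TX004": 0.15, "TX005": 0.67,
    "TX006": 0.10, "TX007": 0.05, "TX008": 0.20, "TX009": 0.11, "TX010": 0.09,
}
print(flag_top_fraction(scores, 0.1))  # ['TX003']
print(flag_top_fraction(scores, 0.2))  # ['TX003', 'TX005']
```

Doubling the contamination from 0.1 to 0.2 doubles the number of flagged records even though the scores themselves do not change.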
| --- | |
| ### Step 5: Download Anomaly Report | |
| 1. Click the **"Download Report"** button above the anomalies table | |
| 2. A CSV file will download containing: | |
|    - All detected anomalies | |
|    - Transaction details | |
|    - Anomaly scores | |
|    - Explanations | |
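Once downloaded, the report can be processed like any other CSV file. The sketch below sorts a report by score and keeps the top entries. The column names `Anomaly Score` and `Explanation` are assumptions based on the table columns described in this guide, the actual header may differ, and the report text is invented for illustration.

```python
import csv
import io


def top_anomalies(report_csv: str, limit: int = 3) -> list:
    """Return the `limit` rows with the highest anomaly score."""
    rows = list(csv.DictReader(io.StringIO(report_csv)))
    # "Anomaly Score" is an assumed column name.
    rows.sort(key=lambda row: float(row["Anomaly Score"]), reverse=True)
    return rows[:limit]


# Illustrative report contents, not real output from the application.
report = (
    "Transaction ID,User ID,Amount,Anomaly Score,Explanation\n"
    "TX003,USER002,1250.00,0.91,Unusually high amount\n"
    "TX005,USER004,980.00,0.67,Unusual hour\n"
    "TX008,USER001,15.00,0.72,Unusual category\n"
)
top_two = top_anomalies(report, limit=2)
print([row["Transaction ID"] for row in top_two])  # ['TX003', 'TX008']
```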
| --- | |
| ### Step 6: View Analytics | |
| 1. Click **"Analytics"** in the sidebar | |
| 2. Explore various visualizations: | |
| #### Transaction Distribution | |
| - Bar chart showing distribution of transaction amounts | |
| - View spending patterns across different amount ranges | |
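A distribution chart of this kind is built by counting transactions per amount range. The sketch below shows the binning step with a fixed bin width of 100; the width and the function name `amount_histogram` are assumptions, since the guide does not state how the chart's ranges are chosen.

```python
from collections import Counter


def amount_histogram(amounts: list, bin_width: float = 100.0) -> dict:
    """Count amounts per fixed-width range, labelled 'low-high'."""
    counts = Counter(int(amount // bin_width) for amount in amounts)
    return {
        f"{int(i * bin_width)}-{int((i + 1) * bin_width)}": counts[i]
        for i in sorted(counts)
    }


# Amounts from the sample data in this guide.
histogram = amount_histogram([150.50, 89.99, 1250.00])
print(histogram)  # {'0-100': 1, '100-200': 1, '1200-1300': 1}
```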
| #### Category Breakdown | |
| - Donut chart showing transactions by merchant category | |
| - See which categories have the most activity | |
| #### Time Analysis | |
| - Line chart showing transaction patterns by time of day | |
| - Identify peak transaction hours | |
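Peak hours come from grouping transactions by the hour of their timestamp. This is a minimal sketch that assumes timestamps follow the `YYYY-MM-DD HH:MM:SS` format used in the sample data; `transactions_per_hour` is a hypothetical helper.

```python
from collections import Counter
from datetime import datetime


def transactions_per_hour(timestamps: list) -> dict:
    """Count transactions for each hour of the day (0-23)."""
    hours = Counter(
        datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").hour for ts in timestamps
    )
    return dict(sorted(hours.items()))


# Timestamps from the sample data in this guide.
per_hour = transactions_per_hour([
    "2024-01-15 10:30:00",
    "2024-01-15 14:22:00",
    "2024-01-16 09:15:00",
])
print(per_hour)  # {9: 1, 10: 1, 14: 1}
```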
| #### User Statistics | |
| - Horizontal bar chart comparing transaction counts per user | |
| - See which users have the most activity | |
| #### Score Distribution | |
| - Histogram showing distribution of anomaly scores | |
| - Understand the overall anomaly score distribution | |
| #### Feature Importance | |
| - Bar chart showing which features contribute most to anomaly detection | |
| - Understand what drives anomaly detection | |
| --- | |
| ## Advanced Features | |
| ### Auto-Training | |
| The system automatically: | |
| - Trains an Isolation Forest model immediately after upload | |
| - Predicts anomalies using the trained model | |
| - Displays results without manual intervention | |
| ### Manual Retraining | |
| You can: | |
| - Change the model type (Isolation Forest or LOF) | |
| - Adjust the contamination parameter | |
| - Retrain with different settings | |
| - Compare results | |
| ### Model Comparison | |
| - Try both Isolation Forest and LOF | |
| - Compare anomaly detection results | |
| - Choose the model that works best for your data | |
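One practical way to compare two runs is to look at which transactions both models flag and which only one of them flags. The sketch below does this with set operations on transaction IDs taken from two downloaded reports; `compare_flagged` is a hypothetical helper and the IDs are illustrative.

```python
def compare_flagged(first: set, second: set) -> dict:
    """Summarise agreement between two sets of flagged transaction IDs."""
    both = first & second
    union = first | second
    return {
        "both": sorted(both),
        "only_first": sorted(first - second),
        "only_second": sorted(second - first),
        # Jaccard overlap: 1.0 means identical sets, 0.0 means no overlap.
        "jaccard": round(len(both) / len(union), 2) if union else 1.0,
    }


# Illustrative results from an Isolation Forest run and an LOF run.
isolation_forest_ids = {"TX003", "TX005", "TX008"}
lof_ids = {"TX003", "TX008", "TX010"}
comparison = compare_flagged(isolation_forest_ids, lof_ids)
print(comparison)
# {'both': ['TX003', 'TX008'], 'only_first': ['TX005'], 'only_second': ['TX010'], 'jaccard': 0.5}
```

Transactions flagged by both models are usually the strongest candidates for review, while those flagged by only one are worth a closer look at the explanation column.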
| --- | |
| ## Tips for Best Results | |
| ### Data Quality | |
| - Ensure your data has consistent formatting | |
| - Remove duplicate transactions | |
| - Handle missing values appropriately | |
| - Use consistent date/time formats | |
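These checks can be applied to the rows before upload. The sketch below drops duplicates by Transaction ID, drops rows with missing or unparseable required values, and normalises timestamps. Dropping incomplete rows is only one way to handle missing values, and the timestamp format is assumed to match the sample data; `clean_rows` is a hypothetical helper.

```python
from datetime import datetime


def clean_rows(rows: list) -> list:
    """Remove duplicate, incomplete, or unparseable transaction rows."""
    seen = set()
    cleaned = []
    for row in rows:
        tx_id = row.get("Transaction ID", "").strip()
        amount = row.get("Amount", "").strip()
        timestamp = row.get("Timestamp", "").strip()
        if not tx_id or not amount or not timestamp:
            continue  # missing required value
        if tx_id in seen:
            continue  # duplicate transaction
        try:
            parsed_amount = float(amount)
            parsed_time = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # non-numeric amount or inconsistent date format
        seen.add(tx_id)
        cleaned.append({**row, "Amount": parsed_amount,
                        "Timestamp": parsed_time.isoformat(sep=" ")})
    return cleaned


# Illustrative rows with typical quality problems.
raw = [
    {"Transaction ID": "TX001", "Amount": "150.50", "Timestamp": "2024-01-15 10:30:00"},
    {"Transaction ID": "TX001", "Amount": "150.50", "Timestamp": "2024-01-15 10:30:00"},
    {"Transaction ID": "TX002", "Amount": "", "Timestamp": "2024-01-15 14:22:00"},
    {"Transaction ID": "TX003", "Amount": "1250.00", "Timestamp": "2024-01-16 09:15:00"},
    {"Transaction ID": "TX004", "Amount": "42.00", "Timestamp": "16/01/2024 09:15"},
]
cleaned = clean_rows(raw)
print([row["Transaction ID"] for row in cleaned])  # ['TX001', 'TX003']
```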
| ### Parameter Tuning | |
| - Start with default parameters | |
| - Adjust contamination based on your expected anomaly rate | |
| - Lower contamination (0.01-0.05) flags only the most extreme transactions | |
| - Higher contamination (0.1-0.2) flags a broader set of transactions | |
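If you have a rough idea of how many transactions are genuinely suspicious, that estimate can be turned into a starting contamination value. The sketch below divides the expected count by the dataset size and clamps the result to the 0.01 to 0.5 range the dashboard accepts; `suggest_contamination` is a hypothetical helper, not part of the application.

```python
def suggest_contamination(expected_anomalies: int, total: int) -> float:
    """Turn an expected anomaly count into a contamination value in [0.01, 0.5]."""
    if total <= 0:
        raise ValueError("total must be positive")
    rate = round(expected_anomalies / total, 3)
    # Clamp to the range offered by the Model Controls section.
    return min(0.5, max(0.01, rate))


print(suggest_contamination(30, 1000))    # 0.03
print(suggest_contamination(1, 10000))    # 0.01
print(suggest_contamination(800, 1000))   # 0.5
```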
| ### Model Selection | |
| - **Isolation Forest** - Best for general-purpose anomaly detection | |
| - **LOF** - Best when anomalies lie in regions that are sparser than their immediate neighborhood | |
| --- | |
| ## Common Workflows | |
| ### Workflow 1: Quick Anomaly Check | |
| 1. Upload your data | |
| 2. Wait for auto-training to complete | |
| 3. View anomalies in the table | |
| 4. Download report if needed | |
| ### Workflow 2: Detailed Analysis | |
| 1. Upload your data | |
| 2. Let auto-training complete | |
| 3. Review KPI cards and anomalies table | |
| 4. Navigate to Analytics page | |
| 5. Explore all charts and visualizations | |
| 6. Download anomaly report | |
| ### Workflow 3: Model Experimentation | |
| 1. Upload your data | |
| 2. Note default results (Isolation Forest) | |
| 3. Change model to LOF | |
| 4. Retrain and predict | |
| 5. Compare results | |
| 6. Adjust contamination parameter | |
| 7. Retrain and predict again | |
| 8. Choose best configuration | |
| --- | |
| ## Troubleshooting | |
| ### Issue: No anomalies detected | |
| **Solution**: Increase the contamination parameter and retrain | |
| ### Issue: Too many anomalies detected | |
| **Solution**: Decrease the contamination parameter and retrain | |
| ### Issue: Upload failed | |
| **Solution**: Ensure the file is in a supported format (CSV, Excel, JSON, or Parquet) and is under 16MB | |
| ### Issue: Charts not displaying | |
| **Solution**: Refresh the page and check your internet connection (Chart.js loads from a CDN) | |
| ### Issue: Model training takes too long | |
| **Solution**: Training time grows with dataset size, so long runs are normal for large files. Wait for completion, or upload a smaller sample to get results sooner. | |
| --- | |
| ## Next Steps | |
| For more technical details, refer to: | |
| - [Technical Documentation](TECHNICAL_GUIDE.md) | |
| - [API Reference](API_REFERENCE.md) | |