---
title: UIDAI Project S.A.T.A.R.K
emoji: 🚀
colorFrom: red
colorTo: red
sdk: docker
app_port: 8501
tags:
  - streamlit
pinned: false
short_description: Data-Driven Innovation for Aadhaar
---

# 🛡️ Project S.A.T.A.R.K: AI-Powered Fraud Detection for UIDAI

[![Streamlit App](https://static.streamlit.io/badges/streamlit_badge_black_white.svg)](https://huggingface.co/spaces/lovnishverma/UIDAI)
[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

> **Context-Aware Anomaly Detection System for Aadhaar Enrolment Centers**
>
> **Team ID:** UIDAI_4571 | **Theme:** Data-Driven Innovation for Aadhaar

---

## 🎯 Quick Links

- **📊 Live Analysis Notebook**: [Open in Google Colab](https://colab.research.google.com/drive/1YAQ4nfxltvG_cts3fmGc_zi2JQc4oPOT?usp=sharing)
- **🚀 Live Dashboard**: [Hugging Face Spaces](https://huggingface.co/spaces/lovnishverma/UIDAI)
- **📖 Project Report**: [View PDF](Final-Project-Report.pdf)
- **💻 Source Code**: Available in this repository

---

## 🎯 Overview

**Project S.A.T.A.R.K** (Statistical Anomaly Tracking & Aadhaar Risk Kit) is a fraud detection system designed to address the "Accuracy vs. Fairness" trade-off in Aadhaar vigilance.

### The Problem

India's demographic diversity makes global rules ineffective:

- ❌ **Strict Rules:** Flag legitimate activities in tribal belts (False Positives).
- ❌ **Lenient Rules:** Miss sophisticated fraud in metropolitan areas (False Negatives).

### Our Innovation: District Normalization

Instead of using a national average, S.A.T.A.R.K compares each enrolment center against its **local district baseline**.

- **Example:** In a tribal district where late enrolment is common (Avg: 40%), a center reporting 90% is flagged. In a city where 90% is normal, the same figure is marked safe.

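The district-baseline idea can be illustrated with a minimal sketch. It is not the project's implementation: the record layout, the sample ratios, and the `THRESHOLD` cut-off are hypothetical, and only the "center ratio minus district average" logic comes from this README. Note that each district average here includes the center being scored. With pandas, the same baseline could be computed with `groupby(...).transform("mean")`.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (district, center_id, late_enrolment_ratio).
centers = [
    ("tribal_district", "T1", 0.35),
    ("tribal_district", "T2", 0.40),
    ("tribal_district", "T3", 0.38),
    ("tribal_district", "T4", 0.90),
    ("metro_district", "M1", 0.88),
    ("metro_district", "M2", 0.90),
    ("metro_district", "M3", 0.92),
]


def ratio_deviation(records):
    """Return {center_id: center_ratio - district_average}."""
    by_district = defaultdict(list)
    for district, _, ratio in records:
        by_district[district].append(ratio)
    # Local baseline: the mean ratio of all centers in the same district.
    district_avg = {d: mean(ratios) for d, ratios in by_district.items()}
    return {cid: ratio - district_avg[d] for d, cid, ratio in records}


deviations = ratio_deviation(centers)

THRESHOLD = 0.25  # illustrative cut-off, not taken from the project
flagged = sorted(cid for cid, dev in deviations.items() if dev > THRESHOLD)
print(flagged)  # ['T4']: 0.90 stands out in the tribal district, not in the metro
```
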

---

## ✨ Key Features

### 🧠 The "Context-Aware" AI Engine

- **Algorithm**: Isolation Forest (Unsupervised Learning)
- **Smart Logic**: Detects anomalies relative to local geography.
- **Capabilities**: Identifies "Ghost IDs", "Sunday Surges" (Illegal Camps), and "Mass Update Operations".

### 📊 The Vigilance Dashboard

- **Geospatial Intelligence**: Interactive Heatmap of High-Risk Centers.
- **Actionable Insights**: "Priority Action List" exportable for field agents.
- **Evidence-Based**: Charts showing *why* a center was flagged (e.g., Weekend Activity vs. Weekday).

### 📥 Smart Data Ingestion

- **Automated**: Recursively fetches and merges fragmented CSV chunks.
- **Robust**: Handles massive datasets without data loss using Outer Joins.

---

## 🚀 Quick Start

### **Option 1: Run Analysis (Google Colab)**

To see the Feature Engineering and Model Training in action:

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1YAQ4nfxltvG_cts3fmGc_zi2JQc4oPOT?usp=sharing)

1. Open the Notebook.
2. Run all cells to process the raw data.
3. Download the generated `analyzed_aadhaar_data.csv`.

### **Option 2: Run Dashboard (Local)**

**Prerequisites:** Python 3.8+, pip

1. **Clone the repository**

   ```bash
   git clone https://huggingface.co/spaces/lovnishverma/UIDAI
   cd UIDAI
   ```

2. **Install dependencies**

   ```bash
   pip install -r requirements.txt
   ```

3. **Launch the App**

   ```bash
   streamlit run app.py
   ```

4. **Access the Dashboard**

   Open `http://localhost:8501` in your browser.


---

## 📁 Project Structure

```
UIDAI/
├── README.md                                   # This documentation
├── requirements.txt                            # Python dependencies
├── Dockerfile                                  # Container configuration
├── app.py                                      # Streamlit Dashboard Code
├── UIDAI_4571_(PROJECT_S_A_T_A_R_K_AI).ipynb   # Main Analysis Notebook
├── analyzed_aadhaar_data.csv                   # Processed Data for Dashboard
├── Final-Project-Report.pdf                    # Complete Project Documentation
└── assets/                                     # Images and logos
```

---

## 🧠 Technical Architecture

### The Pipeline

1. **Ingestion**: `SmartLoader` class merges fragmented CSVs.
2. **Context Engine**: Calculates `ratio_deviation` (Center vs. District).
3. **AI Model**: `IsolationForest` detects statistical outliers.
4. **Visualization**: Streamlit app renders the `RISK_SCORE` on maps.

### Core Risk Signals

| Feature | Logic | Detects |
| --- | --- | --- |
| **Ratio Deviation** | `(Center_Ratio - District_Avg)` | Ghost IDs |
| **Weekend Spike** | `Activity on Sunday / Normal Day` | Illegal Camps |
| **Mismatch Score** | `\|Bio - Demo\|` | |
| **Volume Anomaly** | `Total_Activity > 99th Percentile` | Mass Operations |

---

## 📊 Dashboard Preview

### 1. Geographic Heatmap

Instantly spot high-risk clusters across India. *(See `assets/` for screenshots)*

### 2. Priority Action List

Downloadable CSV for vigilance officers containing only the top 1% critical cases.

### 3. AI Insights Panel

"Why is this flagged?" The AI explains its decision (e.g., *"Flagged due to 500% spike in weekend activity"*).

---

## 👥 Team UIDAI_4571

- **Team Leader:** Aman Choudhary (NIELIT Ropar)
- **Team Member:** Prateek Dhar Dwivedi (NIELIT Ropar)
- **Mentor:** Lovnish Verma (Project Engineer, NIELIT Ropar)
- **Competition:** UIDAI Hackathon 2026
- **Submission Date:** January 2026

---

## 📝 License

This project is open-source under the [MIT License](LICENSE).

---

Project S.A.T.A.R.K.

Statistical Anomaly Tracking & Aadhaar Risk Kit

Built with ❤️ for a safer, inclusive Digital India.