---
title: UIDAI Project S.A.T.A.R.K
emoji: 🚀
colorFrom: red
colorTo: red
sdk: docker
app_port: 8501
tags:
- streamlit
pinned: false
short_description: Data-Driven Innovation for Aadhaar
---
# 🛡️ Project S.A.T.A.R.K: AI-Powered Fraud Detection for UIDAI
[![Streamlit App](https://static.streamlit.io/badges/streamlit_badge_black_white.svg)](https://huggingface.co/spaces/lovnishverma/UIDAI)
[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
> **Context-Aware Anomaly Detection System for Aadhaar Enrolment Centers**
>
> **Team ID:** UIDAI_4571 | **Theme:** Data-Driven Innovation for Aadhaar
---
## 🎯 Quick Links
- **📊 Live Analysis Notebook**: [Open in Google Colab](https://colab.research.google.com/drive/1YAQ4nfxltvG_cts3fmGc_zi2JQc4oPOT?usp=sharing)
- **🚀 Live Dashboard**: [Hugging Face Spaces](https://huggingface.co/spaces/lovnishverma/UIDAI)
- **📖 Project Report**: [View PDF](Final-Project-Report.pdf)
- **💻 Source Code**: Available in this repository
---
## 🎯 Overview
**Project S.A.T.A.R.K** (Statistical Anomaly Tracking & Aadhaar Risk Kit) is a fraud detection system designed to address the "Accuracy vs. Fairness" trade-off in Aadhaar vigilance.
### The Problem
India's demographic diversity makes a single set of nationwide rules ineffective:
- ❌ **Strict Rules:** Flag legitimate activities in tribal belts (False Positives).
- ❌ **Lenient Rules:** Miss sophisticated fraud in metropolitan areas (False Negatives).
### Our Innovation: District Normalization
Instead of using a national average, S.A.T.A.R.K compares each enrolment center against its **local district baseline**.
- **Example:** In a tribal district where the average late-enrolment share is 40%, a center reporting 90% is flagged. In a city where 90% is the district norm, the same figure is marked safe.
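
The comparison above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the project's code: the field names (`center_id`, `district`, `late_ratio`) and the sample values are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: one row per enrolment center.
centers = [
    {"center_id": "T1", "district": "tribal", "late_ratio": 0.40},
    {"center_id": "T2", "district": "tribal", "late_ratio": 0.35},
    {"center_id": "T3", "district": "tribal", "late_ratio": 0.90},
    {"center_id": "C1", "district": "city", "late_ratio": 0.90},
    {"center_id": "C2", "district": "city", "late_ratio": 0.88},
    {"center_id": "C3", "district": "city", "late_ratio": 0.92},
]

def district_deviation(rows, value_key="late_ratio"):
    """Return {center_id: value minus the mean of that center's district}."""
    by_district = defaultdict(list)
    for row in rows:
        by_district[row["district"]].append(row[value_key])
    baselines = {name: mean(values) for name, values in by_district.items()}
    return {
        row["center_id"]: row[value_key] - baselines[row["district"]]
        for row in rows
    }

deviations = district_deviation(centers)
# T3 sits far above its tribal-district baseline; C1 reports the same 90%
# but matches its city baseline, so its deviation is close to zero.
```

Note that the district mean here includes the center being scored. A median or a leave-one-out mean would be less sensitive to a single extreme center, at the cost of a slightly longer function.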
---
## ✨ Key Features
### 🧠 The "Context-Aware" AI Engine
- **Algorithm**: Isolation Forest (Unsupervised Learning)
- **Smart Logic**: Detects anomalies relative to local geography.
- **Capabilities**: Identifies "Ghost IDs", "Sunday Surges" (Illegal Camps), and "Mass Update Operations".
### 📊 The Vigilance Dashboard
- **Geospatial Intelligence**: Interactive Heatmap of High-Risk Centers.
- **Actionable Insights**: "Priority Action List" exportable for field agents.
- **Evidence-Based**: Charts proving *why* a center was flagged (e.g., Weekend Activity vs. Weekday).
### 📥 Smart Data Ingestion
- **Automated**: Recursively fetches and merges fragmented CSV chunks.
- **Robust**: Handles massive datasets without data loss using Outer Joins.
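
As a rough illustration of recursive loading with an outer join, the sketch below uses only the standard library. It is not the project's loader: the function name `load_and_merge`, the join key `center_id`, and the sample column names are assumptions.

```python
import csv
import tempfile
from pathlib import Path

def load_and_merge(root, key="center_id"):
    """Recursively read every CSV under `root` and outer-join rows on `key`."""
    merged = {}      # key value -> combined row
    columns = [key]  # union of all column names, in first-seen order
    for path in sorted(Path(root).rglob("*.csv")):
        with path.open(newline="") as handle:
            for row in csv.DictReader(handle):
                for name in row:
                    if name not in columns:
                        columns.append(name)
                merged.setdefault(row[key], {}).update(row)
    # Outer join: keep every key seen in any file; absent fields become None.
    return [{c: row.get(c) for c in columns} for row in merged.values()]

# Demo with two hypothetical chunks stored in nested folders.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "bio").mkdir()
    (Path(tmp) / "demo").mkdir()
    (Path(tmp) / "bio" / "part1.csv").write_text(
        "center_id,bio_updates\nA,10\nB,20\n"
    )
    (Path(tmp) / "demo" / "part1.csv").write_text(
        "center_id,demo_updates\nB,5\nC,7\n"
    )
    rows = load_and_merge(tmp)
```

Centers `A` and `C` each appear in only one chunk, yet both survive the merge with the missing column set to `None`. If several chunks carry the same key and the same column, this sketch keeps the last value read, so real data would need an explicit rule for duplicates.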
---
## 🚀 Quick Start
### **Option 1: Run Analysis (Google Colab)**
To see the Feature Engineering and Model Training in action:
[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1YAQ4nfxltvG_cts3fmGc_zi2JQc4oPOT?usp=sharing)
1. Open the Notebook.
2. Run all cells to process the raw data.
3. Download the generated `analyzed_aadhaar_data.csv`.
### **Option 2: Run Dashboard (Local)**
**Prerequisites:** Python 3.8+, pip
1. **Clone the repository**
```bash
git clone https://huggingface.co/spaces/lovnishverma/UIDAI
cd UIDAI
```
2. **Install dependencies**
```bash
pip install -r requirements.txt
```
3. **Launch the App**
```bash
streamlit run app.py
```
4. **Access the Dashboard**
Open `http://localhost:8501` in your browser.
---
## 📁 Project Structure
```
UIDAI/
├── README.md                                      # This documentation
├── requirements.txt                               # Python dependencies
├── Dockerfile                                     # Container configuration
├── app.py                                         # Streamlit Dashboard Code
├── UIDAI_4571_(PROJECT_S_A_T_A_R_K_AI).ipynb      # Main Analysis Notebook
├── analyzed_aadhaar_data.csv                      # Processed Data for Dashboard
├── Final-Project-Report.pdf                       # Complete Project Documentation
└── assets/                                        # Images and logos
```
---
## 🧠 Technical Architecture
### The Pipeline
1. **Ingestion**: `SmartLoader` class merges fragmented CSVs.
2. **Context Engine**: Calculates `ratio_deviation` (Center vs. District).
3. **AI Model**: `IsolationForest` detects statistical outliers.
4. **Visualization**: Streamlit app renders the `RISK_SCORE` on maps.
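
Step 3 can be illustrated with scikit-learn's `IsolationForest`. The sketch below is an assumption-laden example, not the notebook's code: the synthetic feature matrix, the hyperparameters, and the 0 to 100 rescaling used to produce a `RISK_SCORE`-style value are all chosen for illustration.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(0)

# Synthetic stand-in for engineered features: 50 ordinary centers clustered
# near zero deviation, plus one extreme center appended as the last row.
ordinary = rng.normal(0.0, 0.05, size=(50, 3))
extreme = np.array([[0.6, 5.0, 0.8]])
features = np.vstack([ordinary, extreme])

model = IsolationForest(n_estimators=100, random_state=0)
model.fit(features)

# score_samples returns lower values for more anomalous rows, so negate it
# to get a "higher means riskier" quantity, then rescale to 0..100.
raw = -model.score_samples(features)
risk_score = 100 * (raw - raw.min()) / (raw.max() - raw.min())

most_suspicious = int(np.argmax(risk_score))  # row index of the riskiest center
```

Because the model is unsupervised, it needs no labelled fraud cases. The rescaling only ranks centers relative to each other within one run, so scores from different runs are not directly comparable.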
### Core Risk Signals
| Feature | Logic | Detects |
| --- | --- | --- |
| **Ratio Deviation** | `(Center_Ratio - District_Avg)` | Ghost IDs |
| **Weekend Spike** | `Activity on Sunday / Normal Day` | Illegal Camps |
| **Mismatch Score** | `abs(Bio - Demo)` | |
| **Volume Anomaly** | `Total_Activity > 99th Percentile` | Mass Operations |
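
Three of these signals can be sketched with the standard library. The function names and input shapes below are hypothetical, and reading the Mismatch Score as an absolute difference between biometric and demographic update counts is an assumption.

```python
from datetime import date
from statistics import mean, quantiles

def weekend_spike(daily_counts):
    """Ratio of mean Sunday activity to mean activity on all other days."""
    sundays = [n for day, n in daily_counts.items() if day.weekday() == 6]
    others = [n for day, n in daily_counts.items() if day.weekday() != 6]
    if not sundays or not others or mean(others) == 0:
        return 0.0
    return mean(sundays) / mean(others)

def mismatch_score(bio_updates, demo_updates):
    """Absolute gap between biometric and demographic update counts."""
    return abs(bio_updates - demo_updates)

def volume_anomalies(totals):
    """Return the centers whose total exceeds the 99th percentile."""
    cutoff = quantiles(list(totals.values()), n=100)[98]
    return {center for center, total in totals.items() if total > cutoff}

# One week of activity for a hypothetical center; 4 January 2026 is a Sunday.
week = {date(2026, 1, 4): 120}
week.update({date(2026, 1, d): 20 for d in range(5, 11)})
spike = weekend_spike(week)

# 200 hypothetical centers, one of them with a very large total.
totals = {f"C{i}": 100 + i for i in range(200)}
totals["C199"] = 5000
flagged = volume_anomalies(totals)
```

Here `spike` is the Sunday count divided by the weekday average, and `flagged` contains the oversized center. In a district-aware setup each of these values would then be compared with its district baseline rather than used as an absolute threshold.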
---
## 📊 Dashboard Preview
### 1. Geographic Heatmap
Instantly spot high-risk clusters across India.
*(See `assets/` for screenshots)*
### 2. Priority Action List
Downloadable CSV for vigilance officers containing only the top 1% critical cases.
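
A minimal sketch of how such an export could be produced is shown below. The helper name `priority_action_csv`, the `top_percent` parameter, and the sample rows are assumptions; only the `RISK_SCORE` column name comes from this README.

```python
import csv
import io
import math

def priority_action_csv(rows, top_percent=1, score_key="RISK_SCORE"):
    """Return CSV text for the highest-scoring `top_percent` of rows."""
    if not rows:
        return ""
    # Always keep at least one row, even for very small inputs.
    keep = max(1, math.ceil(len(rows) * top_percent / 100))
    top = sorted(rows, key=lambda r: r[score_key], reverse=True)[:keep]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(top[0].keys()))
    writer.writeheader()
    writer.writerows(top)
    return buffer.getvalue()

# 300 hypothetical centers with made-up scores.
rows = [{"center_id": f"C{i}", "RISK_SCORE": i % 97} for i in range(300)]
csv_text = priority_action_csv(rows)
```

With 300 centers the export holds a header plus three rows. In a Streamlit app the resulting text could be passed to `st.download_button` as the `data` argument.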
### 3. AI Insights Panel
"Why is this flagged?" - The AI explains its decision (e.g., *"Flagged due to 500% spike in weekend activity"*).
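
This README does not describe how the explanation text is generated. One simple possibility, sketched below under that caveat, is to report the signal with the largest percentage increase over its district baseline. The function name `explain_flag` and the signal names are hypothetical.

```python
def explain_flag(center, baseline):
    """Describe the signal that exceeds its baseline by the largest percentage."""
    increases = {
        name: 100 * (center[name] - baseline[name]) / baseline[name]
        for name in center
        if baseline.get(name)  # skip signals with a missing or zero baseline
    }
    if not increases:
        return "No baseline available for this center"
    name, pct = max(increases.items(), key=lambda item: item[1])
    return f"Flagged due to {pct:.0f}% spike in {name.replace('_', ' ')}"

message = explain_flag(
    {"weekend_activity": 120, "total_volume": 330},
    {"weekend_activity": 20, "total_volume": 300},
)
```

For the sample values, weekend activity is 500% above its baseline while total volume is only 10% above, so the weekend signal is the one reported.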
---
## 👥 Team UIDAI_4571
**Team Leader:** Aman Choudhary (NIELIT Ropar)
**Team Member:** Prateek Dhar Dwivedi (NIELIT Ropar)
**Mentor:** Lovnish Verma (Project Engineer, NIELIT Ropar)
**Competition:** UIDAI Hackathon 2026
**Submission Date:** January 2026
---
## 📝 License
This project is open-source under the [MIT License](LICENSE).
---
<div align="center">
<strong>Project S.A.T.A.R.K.</strong>
<em>Statistical Anomaly Tracking & Aadhaar Risk Kit</em>
Built with ❤️ for a safer, inclusive Digital India.
</div>