---
title: UIDAI Project S.A.T.A.R.K
emoji: 🚀
colorFrom: red
colorTo: red
sdk: docker
app_port: 8501
tags:
- streamlit
pinned: false
short_description: Data-Driven Innovation for Aadhaar
---

# 🛡️ Project S.A.T.A.R.K: AI-Powered Fraud Detection for UIDAI

[![Streamlit App](https://static.streamlit.io/badges/streamlit_badge_black_white.svg)](https://huggingface.co/spaces/lovnishverma/UIDAI)
[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

> **Context-Aware Anomaly Detection System for Aadhaar Enrolment Centers**
>
> **Team ID:** UIDAI_4571 | **Theme:** Data-Driven Innovation for Aadhaar

---

## 🎯 Quick Links

- **📊 Live Analysis Notebook**: [Open in Google Colab](https://colab.research.google.com/drive/1YAQ4nfxltvG_cts3fmGc_zi2JQc4oPOT?usp=sharing)
- **🚀 Live Dashboard**: [Hugging Face Spaces](https://huggingface.co/spaces/lovnishverma/UIDAI)
- **📖 Project Report**: [View PDF](Final-Project-Report.pdf)
- **💻 Source Code**: Available in this repository

---

## 🎯 Overview

**Project S.A.T.A.R.K** (Statistical Anomaly Tracking & Aadhaar Risk Kit) is a fraud detection system designed to address the "Accuracy vs. Fairness" trade-off in Aadhaar vigilance.

### The Problem
India's demographic diversity makes a single nationwide rule set ineffective:
- ❌ **Strict Rules:** Flag legitimate activity in tribal belts (False Positives).
- ❌ **Lenient Rules:** Miss sophisticated fraud in metropolitan areas (False Negatives).

### Our Innovation: District Normalization
Instead of using a national average, S.A.T.A.R.K compares each enrolment center against its **local district baseline**.
- **Example:** In a tribal district where the average late-enrolment share is 40%, a center reporting 90% is flagged. In a city where 90% is the district norm, the same figure is marked safe.
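
The comparison against a district baseline can be sketched with pandas. This is a minimal illustration, not the project's actual code: the column names, the sample values, and the `DEVIATION_THRESHOLD` of 0.30 are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical sample: one row per enrolment center.
centers = pd.DataFrame({
    "district": ["Tribal-A", "Tribal-A", "Tribal-A", "Metro-B", "Metro-B", "Metro-B"],
    "center_id": ["T1", "T2", "T3", "M1", "M2", "M3"],
    "late_enrolment_ratio": [0.35, 0.40, 0.90, 0.88, 0.90, 0.92],
})

# District baseline: mean ratio of all centers in the same district.
centers["district_avg"] = (
    centers.groupby("district")["late_enrolment_ratio"].transform("mean")
)

# Deviation of each center from its own district baseline.
centers["ratio_deviation"] = centers["late_enrolment_ratio"] - centers["district_avg"]

# Illustrative threshold (an assumption, not the project's value).
DEVIATION_THRESHOLD = 0.30
centers["flagged"] = centers["ratio_deviation"] > DEVIATION_THRESHOLD

print(centers[["center_id", "district_avg", "ratio_deviation", "flagged"]])
```

In this sample, center `T3` stands out against its own district, while the Metro-B centers report similar ratios and stay close to their local baseline.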

---

## ✨ Key Features

### 🧠 The "Context-Aware" AI Engine
- **Algorithm**: Isolation Forest (Unsupervised Learning)
- **Smart Logic**: Detects anomalies relative to local geography.
- **Capabilities**: Identifies "Ghost IDs", "Sunday Surges" (Illegal Camps), and "Mass Update Operations".
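
As a rough sketch of how an Isolation Forest can score centers, the example below fits scikit-learn's `IsolationForest` on synthetic data. The two feature columns, the synthetic values, and the `contamination` setting are assumptions for illustration; the project's real features and hyperparameters may differ.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# Hypothetical feature matrix: columns are [ratio_deviation, weekend_spike].
normal = rng.normal(loc=[0.0, 1.0], scale=[0.05, 0.1], size=(200, 2))
suspicious = np.array([[0.45, 5.0], [0.50, 6.0]])  # rows 200 and 201
X = np.vstack([normal, suspicious])

# contamination is the assumed share of anomalous centers (illustrative value).
model = IsolationForest(n_estimators=100, contamination=0.01, random_state=42)
labels = model.fit_predict(X)  # -1 = anomaly, 1 = normal

# score_samples returns lower values for more anomalous rows,
# so negate it to get a "higher means riskier" score.
risk_score = -model.score_samples(X)
top2 = np.argsort(risk_score)[-2:]

print("Indices of the two riskiest rows:", sorted(top2.tolist()))
```

Because the model is unsupervised, no labelled fraud cases are needed; it only requires that anomalous centers be rare and easy to isolate in feature space.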

### 📊 The Vigilance Dashboard
- **Geospatial Intelligence**: Interactive Heatmap of High-Risk Centers.
- **Actionable Insights**: "Priority Action List" exportable for field agents.
- **Evidence-Based**: Charts proving *why* a center was flagged (e.g., Weekend Activity vs. Weekday).

### 📥 Smart Data Ingestion
- **Automated**: Recursively fetches and merges fragmented CSV chunks.
- **Robust**: Handles massive datasets without data loss using Outer Joins.
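
The ingestion idea (recursive discovery of CSV chunks, then an outer join so no center is dropped) might look like the sketch below. The helper names `load_chunks` and `merge_outer`, the file names, and the columns are hypothetical; they do not describe the project's `SmartLoader` class.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_chunks(root, pattern):
    """Recursively find CSV chunks matching `pattern` and stack them row-wise."""
    files = sorted(Path(root).rglob(pattern))
    return pd.concat((pd.read_csv(f) for f in files), ignore_index=True)


def merge_outer(left, right, keys):
    """Outer join keeps rows that appear in only one of the two datasets."""
    return left.merge(right, on=keys, how="outer")


# Demo with hypothetical file names and columns written to a temp directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "part1").mkdir()
    (root / "part2").mkdir()
    (root / "part1" / "enrolment_0.csv").write_text("center_id,enrolments\nC1,10\n")
    (root / "part2" / "enrolment_1.csv").write_text("center_id,enrolments\nC2,20\n")
    (root / "part1" / "updates_0.csv").write_text("center_id,updates\nC2,5\nC3,7\n")

    enrol = load_chunks(root, "enrolment_*.csv")
    updates = load_chunks(root, "updates_*.csv")
    merged = merge_outer(enrol, updates, keys=["center_id"])

print(merged)
```

With an inner join, `C1` and `C3` would disappear because each exists in only one dataset; the outer join keeps them and fills the missing side with `NaN`.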

---

## 🚀 Quick Start

### **Option 1: Run Analysis (Google Colab)**
To see the Feature Engineering and Model Training in action:

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1YAQ4nfxltvG_cts3fmGc_zi2JQc4oPOT?usp=sharing)

1. Open the Notebook.
2. Run all cells to process the raw data.
3. Download the generated `analyzed_aadhaar_data.csv`.

### **Option 2: Run Dashboard (Local)**

**Prerequisites:** Python 3.8+, pip

1. **Clone the repository**
   ```bash
   git clone https://huggingface.co/spaces/lovnishverma/UIDAI
   cd UIDAI
   ```

2. **Install dependencies**
   ```bash
   pip install -r requirements.txt
   ```

3. **Launch the App**
   ```bash
   streamlit run app.py
   ```

4. **Access the Dashboard**
   Open `http://localhost:8501` in your browser.

---

## 📁 Project Structure

```
UIDAI/
├── README.md                                 # This documentation
├── requirements.txt                          # Python dependencies
├── Dockerfile                                # Container configuration
├── app.py                                    # Streamlit Dashboard Code
├── UIDAI_4571_(PROJECT_S_A_T_A_R_K_AI).ipynb # Main Analysis Notebook
├── analyzed_aadhaar_data.csv                 # Processed Data for Dashboard
├── Final-Project-Report.pdf                  # Complete Project Documentation
└── assets/                                   # Images and logos
```

---

## 🧠 Technical Architecture

### The Pipeline

1. **Ingestion**: `SmartLoader` class merges fragmented CSVs.
2. **Context Engine**: Calculates `ratio_deviation` (Center vs. District).
3. **AI Model**: `IsolationForest` detects statistical outliers.
4. **Visualization**: Streamlit app renders the `RISK_SCORE` on maps.

### Core Risk Signals

| Feature | Logic | Detects |
| --- | --- | --- |
| **Ratio Deviation** | `(Center_Ratio - District_Avg)` | Ghost IDs |
| **Weekend Spike** | `Activity on Sunday / Normal Day` | Illegal Camps |
| **Mismatch Score** | `abs(Bio - Demo)` | |
| **Volume Anomaly** | `Total_Activity > 99th Percentile` | Mass Operations |
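
Two of the signals in the table, Weekend Spike and Volume Anomaly, can be sketched as follows. The daily log, its column names, and the two sample centers are assumptions; with only two centers the 99th percentile is purely illustrative.

```python
import pandas as pd

# Hypothetical daily activity log for two centers over two weeks.
dates = pd.date_range("2026-01-05", periods=14, freq="D")  # starts on a Monday
log = pd.DataFrame({
    "center_id": ["C1"] * 14 + ["C2"] * 14,
    "date": list(dates) * 2,
    "activity": [10] * 14 + [10, 10, 10, 10, 10, 10, 60] * 2,
})

log["is_sunday"] = log["date"].dt.dayofweek == 6  # Monday=0 ... Sunday=6

# Weekend Spike: mean Sunday activity divided by mean activity on other days.
sunday_mean = log[log["is_sunday"]].groupby("center_id")["activity"].mean()
other_mean = log[~log["is_sunday"]].groupby("center_id")["activity"].mean()
weekend_spike = sunday_mean / other_mean

# Volume Anomaly: total activity above the 99th percentile across centers.
total = log.groupby("center_id")["activity"].sum()
volume_anomaly = total > total.quantile(0.99)

print(weekend_spike)
print(volume_anomaly)
```

Here `C2` does six times its usual volume on Sundays, which is the kind of pattern the Weekend Spike signal is meant to surface.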

---

## 📊 Dashboard Preview

### 1. Geographic Heatmap

Instantly spot high-risk clusters across India.
*(See `assets/` for screenshots)*

### 2. Priority Action List

Downloadable CSV for vigilance officers containing only the top 1% critical cases.
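
One way to build such a list is to keep only rows at or above the 99th percentile of `RISK_SCORE`. The sketch below uses synthetic scores and a hypothetical `center_id` column; it is not taken from `app.py`.

```python
import pandas as pd

# Hypothetical scored data; RISK_SCORE is the column named in the pipeline.
scored = pd.DataFrame({
    "center_id": [f"C{i:03d}" for i in range(200)],
    "RISK_SCORE": [i / 199 for i in range(200)],
})

# Keep only the top 1% of centers by risk score, riskiest first.
cutoff = scored["RISK_SCORE"].quantile(0.99)
priority = (
    scored[scored["RISK_SCORE"] >= cutoff]
    .sort_values("RISK_SCORE", ascending=False)
)

# CSV text that a dashboard could offer as a download.
csv_text = priority.to_csv(index=False)
print(csv_text)
```

In a Streamlit app, text like `csv_text` could be passed to `st.download_button` to offer the file to field agents.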

### 3. AI Insights Panel

"Why is this flagged?" - The AI explains its decision (e.g., *"Flagged due to 500% spike in weekend activity"*).
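
A simple rule-based way to produce such messages is sketched below. The function `explain_flag`, its thresholds, and its wording are hypothetical and only illustrate the idea; the dashboard's actual explanation logic is not documented here.

```python
def explain_flag(weekend_spike, ratio_deviation):
    """Return a short human-readable reason based on which signals are elevated."""
    reasons = []
    if weekend_spike > 2.0:  # illustrative threshold
        pct = (weekend_spike - 1.0) * 100
        reasons.append(f"{pct:.0f}% spike in weekend activity")
    if ratio_deviation > 0.30:  # illustrative threshold
        points = ratio_deviation * 100
        reasons.append(f"ratio {points:.0f} percentage points above the district average")
    if not reasons:
        return "No dominant signal identified"
    return "Flagged due to " + " and ".join(reasons)


# A center with six times its normal activity on Sundays.
print(explain_flag(weekend_spike=6.0, ratio_deviation=0.0))
```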

---

## 👥 Team UIDAI_4571

**Team Leader:** Aman Choudhary (NIELIT Ropar)

**Team Member:** Prateek Dhar Dwivedi (NIELIT Ropar)

**Mentor:** Lovnish Verma (Project Engineer, NIELIT Ropar)

**Competition:** UIDAI Hackathon 2026

**Submission Date:** January 2026

---

## 📝 License

This project is open-source under the [MIT License](https://opensource.org/licenses/MIT).

---

<div align="center">
<strong>Project S.A.T.A.R.K.</strong>


<em>Statistical Anomaly Tracking & Aadhaar Risk Kit</em>


Built with ❤️ for a safer, inclusive Digital India.
</div>