---
title: AI Financial Reconciliation Engine
emoji: 🧠
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
pinned: false
---

# 🧠 AI Financial Reconciliation Engine

Automated Financial Auditing using Machine Learning and LLMs.

## 🚀 Overview
The **AI Financial Reconciliation Engine** automates the matching of internal accounting records (Books) against external tax filings (GST). By combining **fuzzy string matching**, **semantic embeddings**, and **LLM reasoning**, it identifies discrepancies, flags anomalous and potentially fraudulent transactions, and gives auditors natural-language explanations.
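
The matching idea can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the project's implementation: `difflib.SequenceMatcher` stands in for RapidFuzz, the embedding and LLM stages are left out, and `name_similarity`, `match_records`, the `(vendor_name, amount)` row format, and the amount tolerance are hypothetical.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Return a 0-100 similarity score between two vendor names."""
    # Normalise case and whitespace so cosmetic differences do not lower the score.
    a_norm = " ".join(a.lower().split())
    b_norm = " ".join(b.lower().split())
    return 100.0 * SequenceMatcher(None, a_norm, b_norm).ratio()


def match_records(books, gst, threshold=85.0, amount_tolerance=0.01):
    """Pair each Books row with the best-scoring unused GST row.

    Rows are (vendor_name, amount) tuples (an assumed format). A pair is
    accepted only if the amounts agree within `amount_tolerance` and the
    name score reaches `threshold`.
    """
    matches, unmatched, used = [], [], set()
    for b_name, b_amount in books:
        best_idx, best_score = None, 0.0
        for idx, (g_name, g_amount) in enumerate(gst):
            if idx in used or abs(b_amount - g_amount) > amount_tolerance:
                continue
            score = name_similarity(b_name, g_name)
            if score > best_score:
                best_idx, best_score = idx, score
        if best_idx is not None and best_score >= threshold:
            used.add(best_idx)
            matches.append((b_name, gst[best_idx][0], round(best_score, 1)))
        else:
            unmatched.append((b_name, b_amount))
    return matches, unmatched


# Illustrative data only.
books = [("Acme Traders Pvt Ltd", 1200.00), ("Zenith Supplies", 560.50)]
gst = [("ACME Traders Pvt. Ltd", 1200.00), ("Orbit Logistics", 560.50)]
matches, unmatched = match_records(books, gst)
```

Here the first Books row pairs with its GST counterpart despite the casing and punctuation differences, while the second stays unmatched because no GST vendor name is close enough, even though the amount agrees.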

## ✨ Features
- **Intelligent Matching**: Combines exact matching with fuzzy string matching and semantic embeddings to reconcile records despite typos or name variations.
- **Anomaly Detection**: Uses the `IsolationForest` algorithm to detect unusual transaction patterns and high-risk invoices.
- **AI Explanations**: Integrates the Mistral LLM to provide human-readable audit comments for every discrepancy.
- **Interactive Dashboard**: A Gradio interface with summary metrics, risk-sorted results, and CSV export.
- **Graph Fraud Network**: Visualizes circular trading and multi-hop tax-siphoning fraud rings using `NetworkX` and `Matplotlib`.
- **Persistent Vector Memory**: Stores vendor embedding vectors in a persistent `FAISS` index (a C++ similarity-search library) so they can be reused across runs.
- **Deployment Ready**: Containerized with **Docker** and hosted on **HuggingFace Spaces**.
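
Circular trading, as mentioned in the feature list, amounts to finding directed cycles in a graph of who invoiced whom. The sketch below shows that idea in plain Python so it runs without dependencies. It is an assumption about the approach, not the code in `fraud_graph.py`; NetworkX provides `simple_cycles` for the same purpose. The function name `find_trading_cycles` and the `(seller, buyer)` edge format are hypothetical.

```python
def find_trading_cycles(edges):
    """Return each simple directed cycle once, as a list of vendor IDs.

    `edges` is an iterable of (seller, buyer) pairs. Every cycle is reported
    starting from its smallest member (by string order), so rotations of the
    same ring are not listed twice.
    """
    graph = {}
    for seller, buyer in edges:
        graph.setdefault(seller, set()).add(buyer)
        graph.setdefault(buyer, set())

    cycles = []

    def dfs(start, node, path, on_path):
        for nxt in sorted(graph[node]):
            if nxt == start:
                # Closed a ring back to the starting vendor.
                cycles.append(path[:])
            elif nxt > start and nxt not in on_path:
                # Only walk through vendors larger than `start`, so each
                # ring is discovered from its smallest member alone.
                path.append(nxt)
                on_path.add(nxt)
                dfs(start, nxt, path, on_path)
                path.pop()
                on_path.discard(nxt)

    for start in sorted(graph):
        dfs(start, start, [start], {start})
    return cycles


# Illustrative invoices: A -> B -> C -> A is a three-party ring, D <-> E a two-party one.
invoices = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D"), ("D", "E"), ("E", "D")]
rings = find_trading_cycles(invoices)
```

Exhaustive cycle enumeration grows quickly on dense graphs, so a real detector would likely cap the ring length or prune by invoice amount first.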

## 🛠 Tech Stack
- **Languages**: Python
- **AI/ML**: Scikit-Learn, Sentence-Transformers, RapidFuzz
- **Fraud Engine**: FAISS, NetworkX, Matplotlib
- **LLM**: Mistral AI API
- **Frontend**: Gradio
- **Infrastructure**: Docker, HuggingFace Spaces

## 📂 Installation (Local)
1. Clone the repository.
2. Install dependencies: `pip install -r requirements.txt`
3. Set your `MISTRAL_API_KEY` in a `.env` file.
4. Run the app: `python main.py`

### Prerequisites

- Python 3.11+
- Virtual Environment (venv)

### Setup

1. Clone the repository
2. Create and activate virtual environment:
   
```bash
python -m venv venv
source venv/bin/activate  # Linux/macOS
venv\Scripts\activate     # Windows
```

3. Install dependencies:
   
```bash
pip install -r requirements.txt
```

4. Configure environment variables:
   - Copy `.env.example` to `.env`
   - Add your API keys
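
To make the configuration step concrete, here is a minimal sketch of reading a dotenv-style file and checking that `MISTRAL_API_KEY` is present. It assumes a plain `KEY=VALUE` file format and uses only the standard library; the project may well rely on a package such as `python-dotenv` instead, and the helper names `load_env_file` and `require_api_key` are hypothetical.

```python
import os


def load_env_file(path=".env"):
    """Read KEY=VALUE lines from a dotenv-style file into os.environ.

    Values already present in the environment are left untouched.
    Returns a dict of the pairs found in the file.
    """
    loaded = {}
    if not os.path.exists(path):
        return loaded
    with open(path, encoding="utf-8") as fh:
        for raw in fh:
            line = raw.strip()
            # Skip blanks, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            key = key.strip()
            value = value.strip().strip("'\"")
            os.environ.setdefault(key, value)
            loaded[key] = value
    return loaded


def require_api_key(name="MISTRAL_API_KEY"):
    """Return the named key from the environment or fail with a clear message."""
    key = os.environ.get(name, "")
    if not key:
        raise RuntimeError(f"{name} is not set; add it to .env or export it")
    return key
```

Calling `load_env_file()` followed by `require_api_key()` at startup surfaces a missing key immediately, before the first LLM request is made.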

## Usage

### Quick Start

```python
from utils import create_sample_data
from reconciliation import ReconciliationEngine
from anomaly import AnomalyDetector

# Create sample data
data = create_sample_data(num_records=100)
source_df = data['source']
target_df = data['target']

# Run reconciliation
engine = ReconciliationEngine(threshold=85.0)
result = engine.reconcile(source_df, target_df, 'VendorName', 'VendorName', 'Amount')

# Detect anomalies
detector = AnomalyDetector(contamination=0.05)
anomaly_result = detector.detect_anomalies(source_df)
```

### Web Interface

```bash
python main.py
```

Access the UI at `http://localhost:7860`

### Docker

```bash
docker build -t reconciliation-engine .
docker run -p 7860:7860 reconciliation-engine
```

## Project Structure

```
├── sample_data/           # Live CSV data and scenarios
├── main.py                # Main FastAPI backend serving UI
├── reconciliation.py      # Core reconciliation engine & FAISS Index
├── anomaly.py             # Anomaly detection module
├── fraud_graph.py         # NetworkX Circular Trading Detector
├── gst_api.py             # Real-time Local Registry Gateway
├── generate_real_data.py  # Script to generate 1800+ realistic rows
├── llm_explainer.py       # LLM-powered explanations
├── utils.py               # Utility functions
├── requirements.txt       # Python dependencies
├── Dockerfile             # Docker configuration
├── .env                   # Environment variables
└── README.md              # This file
```