File size: 3,482 Bytes
a4e63be
 
 
 
 
 
 
 
790e0e9
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
e75022a
790e0e9
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
e75022a
 
790e0e9
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
e75022a
790e0e9
 
e75022a
a4e63be
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
---
license: afl-3.0
title: ChatWithData
sdk: docker
emoji: πŸš€
colorFrom: yellow
colorTo: green
---
# πŸš— Data Insights App

An AI-powered data analysis platform that allows users to explore a massive car auction dataset (558,000+ records) using natural language. The app combines **Streamlit** for the interface, **SQLite** for data management, and **OpenAI's GPT-4** for intelligent analysis and dynamic chart generation.

---
[## Hugging Face](https://huggingface.co/spaces/niddijoris/ChatWithData)

## οΏ½ Application Gallery

| Dashboard Overview | AI Chat & Analytics |
| :---: | :---: |
| ![Dashboard Overview](/screenshots/1.png) | ![AI Chat & Analytics](/screenshots/2.png) |
| **Real-time Statistics & Insights** | **Intelligent Querying & Data Analysis** |

| Dynamic Chart Generation | Safety & Logs |
| :---: | :---: |
| ![Chart Generation](/screenshots/3.png) | ![Console Logs](/screenshots/4.png) |
| **AI-driven Visualizations** | **Security Guardrails & Activity Monitoring** |

---

## 🌟 Key Features

### πŸ€– Intelligent AI Agent
- **Natural Language Querying**: Ask questions like "What is the average price of a BMW?" or "Compare prices between California and Florida".
- **Dynamic Chart Generation**: Ask for visualizations (bar, line, pie, scatter) and the AI will generate them instantly.
- **Context-Aware Support**: If the agent can't help, it offers to create a GitHub support ticket with the chat history.

### πŸ›‘οΈ Secure Data Management
- **ReadOnly Safety**: Strict SQL validation ensures only `SELECT` queries are executed. Dangerous operations (`DELETE`, `DROP`, `UPDATE`) are automatically blocked.
- **Privacy Guardrails**: The agent never communicates the full dataset, only relevant snippets (limited to 100 rows).

### πŸ“Š Business Intelligence
- **Real-time Stats**: Instantly see total inventory, average prices, and price/year ranges in the sidebar.
- **Automated Insights**: Interactive top-make comparisons and condition distribution charts.
- **Console Monitoring**: A live developer console in the sidebar shows every action the AI and database are taking.

---

## πŸš€ Quick Start

### 1. Prerequisites
- Python 3.9+
- OpenAI API Key

### 2. Setup
```bash
# Clone the repository and enter directory
cd "Capstone folder"

# Create and activate virtual environment
python3 -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt
```

### 3. Configuration
Copy `.env.example` to `.env` and fill in your keys:
```bash
cp .env.example .env
```
**Required**: `OPENAI_API_KEY`  
**Optional**: `GITHUB_TOKEN` and `GITHUB_REPO` (for support tickets)

### 4. Run the Application
Use the automated run script to ensure the correct environment is used:
```bash
chmod +x run.sh
./run.sh
```

---

## �️ Project Architecture
- **`app.py`**: Main Streamlit interface.
- **`agent/`**: AI logic and tool definitions.
- **`database/`**: Safe SQL execution and CSV-to-SQLite ingestion.
- **`support/`**: GitHub API integration for support tickets.
- **`ui/`**: Chart generation (Plotly) and styling.
- **`utils/`**: Custom Streamlit-integrated logger.

---

## πŸ›‘οΈ Security Policy
This application is designed with safety as a priority. The `SafetyValidator` provides a robust whitelist of allowed SQL operations, specifically protecting against SQL injection and unauthorized data modification.

πŸ›‘οΈ **Active Protections**: Only SELECT | All dangerous keywords blocked | Data remains secure