ChatWithData / README.md
niddijoris's picture
Update README.md
a4e63be verified
metadata
license: afl-3.0
title: ChatWithData
sdk: docker
emoji: πŸš€
colorFrom: yellow
colorTo: green

πŸš— Data Insights App

An AI-powered data analysis platform that allows users to explore a massive car auction dataset (558,000+ records) using natural language. The app combines Streamlit for the interface, SQLite for data management, and OpenAI's GPT-4 for intelligent analysis and dynamic chart generation.


## Hugging Face

οΏ½ Application Gallery

Dashboard Overview AI Chat & Analytics
Dashboard Overview AI Chat & Analytics
Real-time Statistics & Insights Intelligent Querying & Data Analysis
Dynamic Chart Generation Safety & Logs
Chart Generation Console Logs
AI-driven Visualizations Security Guardrails & Activity Monitoring

🌟 Key Features

πŸ€– Intelligent AI Agent

  • Natural Language Querying: Ask questions like "What is the average price of a BMW?" or "Compare prices between California and Florida".
  • Dynamic Chart Generation: Ask for visualizations (bar, line, pie, scatter) and the AI will generate them instantly.
  • Context-Aware Support: If the agent can't help, it offers to create a GitHub support ticket with the chat history.

πŸ›‘οΈ Secure Data Management

  • ReadOnly Safety: Strict SQL validation ensures only SELECT queries are executed. Dangerous operations (DELETE, DROP, UPDATE) are automatically blocked.
  • Privacy Guardrails: The agent never communicates the full dataset, only relevant snippets (limited to 100 rows).

πŸ“Š Business Intelligence

  • Real-time Stats: Instantly see total inventory, average prices, and price/year ranges in the sidebar.
  • Automated Insights: Interactive top-make comparisons and condition distribution charts.
  • Console Monitoring: A live developer console in the sidebar shows every action the AI and database are taking.

πŸš€ Quick Start

1. Prerequisites

  • Python 3.9+
  • OpenAI API Key

2. Setup

# Clone the repository and enter directory
cd "Capstone folder"

# Create and activate virtual environment
python3 -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

3. Configuration

Copy .env.example to .env and fill in your keys:

cp .env.example .env

Required: OPENAI_API_KEY
Optional: GITHUB_TOKEN and GITHUB_REPO (for support tickets)

4. Run the Application

Use the automated run script to ensure the correct environment is used:

chmod +x run.sh
./run.sh

�️ Project Architecture

  • app.py: Main Streamlit interface.
  • agent/: AI logic and tool definitions.
  • database/: Safe SQL execution and CSV-to-SQLite ingestion.
  • support/: GitHub API integration for support tickets.
  • ui/: Chart generation (Plotly) and styling.
  • utils/: Custom Streamlit-integrated logger.

πŸ›‘οΈ Security Policy

This application is designed with safety as a priority. The SafetyValidator provides a robust whitelist of allowed SQL operations, specifically protecting against SQL injection and unauthorized data modification.

πŸ›‘οΈ Active Protections: Only SELECT | All dangerous keywords blocked | Data remains secure