Himanshu Gangwar
---
title: InsightPilot - Autonomous Analytics Agent
emoji: 🚀
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.16.0
app_file: app.py
pinned: false
license: mit
python_version: "3.10"
---
# InsightPilot – Autonomous Analytics Agent
<div align="center">
[![Powered by LangGraph](https://img.shields.io/badge/Powered%20by-LangGraph-blue)](https://github.com/langchain-ai/langgraph)
[![FastAPI](https://img.shields.io/badge/FastAPI-0.109.0-green)](https://fastapi.tiangolo.com/)
[![Groq](https://img.shields.io/badge/LLM-Groq%20Llama--3-orange)](https://groq.com/)
</div>
InsightPilot is a production-ready AI analyst that transforms natural language questions into validated SQL queries, interactive visualizations, comprehensive insights, and executive-ready PDF reports.
## 🌟 Features
- **🤖 Agentic LangGraph Pipeline** – Deterministic tool-calling workflow (intent → schema → NL2SQL → execution → diagnostics → visualization → PDF)
- **📊 Advanced Analytics** – Automated trend detection and anomaly analysis with statistical insights
- **📄 PDF Report Generation** – Executive-ready reports with branded title pages, charts, and SQL appendix
- **📁 Multi-table Support** – Easy CSV upload and dataset catalog management
- **⚡ Real-time Streaming** – Live insights streamed to the UI as they're generated
- **🔍 Groq Llama-3 Powered** – Low-latency NL→SQL and narrative insight generation
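To make the CSV upload feature concrete, here is a minimal sketch of loading a CSV into a SQLite table using only the standard library. The helper name `load_csv_into_sqlite`, the all-`TEXT` column typing, and the sample data are assumptions for illustration; the project's own CSV service may work differently.

```python
import csv
import io
import sqlite3


def load_csv_into_sqlite(conn, table_name, csv_text):
    """Create `table_name` from CSV text and return the number of rows inserted.

    Hypothetical helper: every column is stored as TEXT for simplicity.
    """
    # Table and column names cannot be bound as SQL parameters, so restrict
    # them to plain identifiers before interpolating them into the statement.
    if not table_name.isidentifier():
        raise ValueError(f"unsafe table name: {table_name!r}")

    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    if not all(col.isidentifier() for col in header):
        raise ValueError(f"unsafe column names: {header!r}")

    columns = ", ".join(f'"{col}" TEXT' for col in header)
    placeholders = ", ".join("?" for _ in header)
    rows = [row for row in reader if row]  # skip blank lines

    conn.execute(f'CREATE TABLE "{table_name}" ({columns})')
    conn.executemany(f'INSERT INTO "{table_name}" VALUES ({placeholders})', rows)
    conn.commit()
    return len(rows)


# Illustrative data, not the bundled sample dataset.
conn = sqlite3.connect(":memory:")
inserted = load_csv_into_sqlite(conn, "sales", "category,amount\nBooks,12.5\nToys,30\n")
tables = [r[0] for r in conn.execute("SELECT name FROM sqlite_master WHERE type='table'")]
```

Each uploaded file becoming its own table is one simple way to support multi-table questions, since the schema step can then list tables from `sqlite_master`.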
## 🚀 Quick Start on Hugging Face Spaces
1. **Set Environment Variables** (Required)
   - Go to Settings → Repository Secrets
   - Add `GROQ_API_KEY` with your Groq API key ([Get one here](https://console.groq.com/))
2. **Upload Your Data** (Optional)
   - Use the "Upload Dataset" tab to add your CSV files
   - Or work with the pre-loaded sample sales dataset
3. **Ask Questions**
   - Use the Analytics Dashboard to ask natural language questions
   - Example: "What were the total sales by category last quarter?"
   - Get SQL, visualizations, insights, and downloadable PDF reports
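As a rough illustration of what the example question might turn into, the sketch below runs one plausible SQL translation against an in-memory SQLite table. The `sales` schema, the rows, the fixed quarter boundaries, and the `is_read_only` guard are all assumptions; the SQL that InsightPilot actually generates depends on your data and on the model.

```python
import sqlite3

# Hypothetical schema and rows; the real sample sales dataset may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (category TEXT, amount REAL, sold_on TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("Books", 120.0, "2024-01-15"),
        ("Books", 80.0, "2024-03-02"),
        ("Toys", 250.0, "2024-02-10"),
        ("Toys", 50.0, "2024-05-01"),  # falls outside the quarter
    ],
)

# One plausible translation of "total sales by category last quarter",
# with the quarter pinned to Q1 2024 so the result is reproducible.
generated_sql = """
    SELECT category, SUM(amount) AS total_sales
    FROM sales
    WHERE sold_on >= ? AND sold_on < ?
    GROUP BY category
    ORDER BY total_sales DESC
"""


def is_read_only(sql):
    """Very rough guard: accept a single statement that starts with SELECT."""
    stripped = sql.strip().rstrip(";")
    return stripped.upper().startswith("SELECT") and ";" not in stripped


if not is_read_only(generated_sql):
    raise ValueError("refusing to run a non-SELECT statement")

totals = conn.execute(generated_sql, ("2024-01-01", "2024-04-01")).fetchall()
print(totals)  # [('Toys', 250.0), ('Books', 200.0)]
```

A prefix check like this is only a first line of defense; opening the database in read-only mode is a stronger safeguard for generated SQL.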
## 🏗️ Architecture
| Component | Technology | Purpose |
|-----------|-----------|---------|
| **LLM Orchestration** | LangGraph + Groq Llama-3 70B | Deterministic agent workflow with tool calling |
| **API & Backend** | FastAPI + SQLAlchemy | RESTful API, database management |
| **Analytics** | Pandas, NumPy, SciPy | Trend detection, anomaly analysis |
| **Visualization** | Matplotlib, ReportLab | Charts and PDF report generation |
| **Database** | SQLite | Lightweight, persistent data storage |
| **Frontend** | React + Vite (optional) | Modern interactive dashboard |
| **Interface** | Gradio | HF Spaces integration |
## 📊 Advanced Analytics Modules
- **Trend Detection**: Time series regression analysis with slope quantification and % change metrics
- **Anomaly Detection**: Z-score based statistical outlier identification
- **Insight Generation**: Context-aware narrative summaries powered by Groq LLM
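The first two modules can be sketched with the standard library alone. This is a minimal illustration of least-squares slope, percentage change, and z-score outlier detection, not the project's implementation (which lists Pandas, NumPy, and SciPy); the function names and the threshold values are assumptions.

```python
from statistics import mean, pstdev


def linear_trend(values):
    """Return (least-squares slope per step, % change from first to last value)."""
    xs = list(range(len(values)))
    x_bar, y_bar = mean(xs), mean(values)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    slope = numerator / denominator
    pct_change = (values[-1] - values[0]) * 100 / values[0]
    return slope, pct_change


def zscore_outliers(values, threshold=3.0):
    """Return indices whose absolute z-score (population std) exceeds `threshold`."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # constant series: nothing can be an outlier
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Illustrative data only.
monthly_sales = [100, 110, 120, 130, 140]
slope, pct = linear_trend(monthly_sales)  # slope 10.0 per month, +40.0 %

readings = [10, 11, 9, 10, 12, 10, 11, 9, 10, 60]
outliers = zscore_outliers(readings, threshold=2.5)  # [9]
```

The example uses a threshold of 2.5 because, with a population standard deviation, a single point among `n` samples cannot exceed a z-score of `sqrt(n - 1)`, so a threshold of 3 would never trigger on a series of ten values.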
## 🛠️ Tech Stack
```
Backend: FastAPI + LangGraph + LangChain + Groq
Data: SQLite + SQLAlchemy + Pandas
Viz: Matplotlib + ReportLab/Platypus
Frontend: React + Vite (embedded in Gradio)
Deploy: Hugging Face Spaces (Gradio SDK)
```
## 📁 Project Structure
```
.
├── app.py                  # Gradio wrapper for HF Spaces
├── requirements.txt        # Python dependencies
├── backend/
│   ├── app/
│   │   ├── main.py         # FastAPI application
│   │   ├── agents/graph.py # LangGraph workflow
│   │   ├── api/routes.py   # API endpoints
│   │   ├── core/config.py  # Settings & environment
│   │   ├── db/database.py  # Database engine & seeding
│   │   └── services/       # Analytics, PDF, CSV modules
│   ├── static/             # Generated charts & PDFs
│   └── requirements.txt    # Backend-specific deps
├── frontend/               # React dashboard (optional)
└── data/                   # Sample datasets
```
## 🔑 Environment Variables
| Variable | Description | Required |
|----------|-------------|----------|
| `GROQ_API_KEY` | Groq API key for LLM access | ✅ Yes |
| `DATABASE_URL` | Database connection string | ⚪ Optional (defaults to SQLite) |
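A minimal sketch of how these two variables could be read at startup is shown below. The function name `load_settings`, the dictionary keys, and the default SQLite URL are assumptions for illustration; the actual settings live in `core/config.py` and may differ.

```python
import os


def load_settings(env=None):
    """Read GROQ_API_KEY (required) and DATABASE_URL (optional) from `env`."""
    env = os.environ if env is None else env
    api_key = env.get("GROQ_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("GROQ_API_KEY is required; add it as a Space secret")
    return {
        "groq_api_key": api_key,
        # Assumed default; the SQLite path the project really uses may differ.
        "database_url": env.get("DATABASE_URL", "sqlite:///./insightpilot.db"),
    }


# Passing a dict instead of os.environ keeps the example self-contained.
settings = load_settings({"GROQ_API_KEY": "gsk_example"})
```

Failing fast on a missing key gives a clear error at startup instead of a failed LLM call on the first question.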
## 📖 Usage Examples
**Question:** "What were the top 5 products by revenue last year?"
**InsightPilot will:**
1. ✅ Analyze your database schema
2. ✅ Generate optimized SQL query
3. ✅ Execute query and validate results
4. ✅ Create visualizations (bar charts, trends)
5. ✅ Perform trend & anomaly analysis
6. ✅ Generate narrative insights
7. ✅ Build downloadable PDF report
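The steps above can be pictured as a fixed chain of stages that each read and extend a shared state. The sketch below uses plain Python rather than LangGraph itself, and every stage name, the placeholder SQL, and the rows are hypothetical stand-ins for the real LLM and database calls.

```python
from typing import Callable, Dict, List

State = Dict[str, object]
Stage = Callable[[State], State]


def inspect_schema(state: State) -> State:
    # Stand-in for reading the real database schema.
    return {**state, "schema": {"sales": ["product", "revenue", "year"]}}


def generate_sql(state: State) -> State:
    # Stand-in for the LLM call; a fixed query replaces NL-to-SQL.
    sql = "SELECT product, SUM(revenue) FROM sales GROUP BY product LIMIT 5"
    return {**state, "sql": sql}


def execute_query(state: State) -> State:
    # Placeholder rows instead of a real database round trip.
    return {**state, "rows": [("Widget", 1200.0), ("Gadget", 950.0)]}


def summarize(state: State) -> State:
    top_product, top_revenue = state["rows"][0]
    return {**state, "insight": f"{top_product} leads with {top_revenue:.0f} in revenue."}


# The order is fixed, which is what makes the workflow deterministic.
PIPELINE: List[Stage] = [inspect_schema, generate_sql, execute_query, summarize]


def run_pipeline(question: str) -> State:
    state: State = {"question": question, "trace": []}
    for stage in PIPELINE:
        state = stage(state)
        state["trace"] = state["trace"] + [stage.__name__]
    return state


result = run_pipeline("What were the top 5 products by revenue last year?")
```

Recording a `trace` of completed stages is one way to support streaming progress to the UI, since each entry can be emitted as soon as its stage finishes.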
## 🎯 Use Cases
- **Business Analytics**: Ad-hoc reporting without SQL knowledge
- **Executive Briefings**: Automated PDF reports with insights
- **Data Exploration**: Quick analysis of uploaded CSV datasets
- **Trend Analysis**: Automated time-series analytics
- **Anomaly Detection**: Statistical outlier identification
## 🚧 Limitations & Notes
- **Free HF Spaces**: CPU-only tier; suitable for moderate traffic
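- **Database**: SQLite file on the Space's disk (50GB limit); on Hugging Face this disk is ephemeral unless persistent storage is attached, so data may be lost on restart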
- **File Cleanup**: Old PDFs/charts should be periodically removed
- **Concurrent Users**: May need rate limiting for production use
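For the file cleanup note, a periodic job along the following lines would be one option. It is a sketch under assumptions: the helper name `remove_stale_files`, the one-day age limit, and the `.pdf`/`.png` suffixes are illustrative, and the demonstration runs in a temporary directory rather than the real `static/` folder.

```python
import os
import tempfile
import time
from pathlib import Path


def remove_stale_files(directory, max_age_seconds, suffixes=(".pdf", ".png"), now=None):
    """Delete matching files older than `max_age_seconds`; return their names."""
    now = time.time() if now is None else now
    removed = []
    # Snapshot the listing first so deletions do not affect the iteration.
    for path in list(Path(directory).iterdir()):
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        if now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    old_report = Path(tmp, "old_report.pdf")
    new_chart = Path(tmp, "new_chart.png")
    old_report.write_text("stale")
    new_chart.write_text("fresh")

    # Backdate the report so it looks two days old.
    two_days_ago = time.time() - 2 * 24 * 3600
    os.utime(old_report, (two_days_ago, two_days_ago))

    removed = remove_stale_files(tmp, max_age_seconds=24 * 3600)
    remaining = sorted(p.name for p in Path(tmp).iterdir())
```

Filtering by suffix keeps the job from touching anything other than generated reports and charts, such as a SQLite file stored in the same directory.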
## 🔮 Future Enhancements
- Multi-tenant workspaces with authentication
- Postgres/Supabase adapter for production databases
- Real-time collaborative dashboards
- Forecast & prediction modules
- Custom visualization templates
## 📝 License
MIT License - see LICENSE file for details
## 🤝 Contributing
Contributions welcome! Please open an issue or submit a PR.
## 🔗 Links
- **Repository**: [GitHub](https://github.com/zenitsu0509/InsightPilot)
- **Documentation**: See original README in repo
- **Groq Platform**: [Get API Key](https://console.groq.com/)
---
**Built with ❤️ using LangGraph, FastAPI, and Groq**