---
title: InsightPilot - Autonomous Analytics Agent
emoji: 🚀
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.16.0
app_file: app.py
pinned: false
license: mit
python_version: "3.10"
---

# InsightPilot – Autonomous Analytics Agent

<div align="center">

[![Powered by LangGraph](https://img.shields.io/badge/Powered%20by-LangGraph-blue)](https://github.com/langchain-ai/langgraph)
[![FastAPI](https://img.shields.io/badge/FastAPI-0.109.0-green)](https://fastapi.tiangolo.com/)
[![Groq](https://img.shields.io/badge/LLM-Groq%20Llama--3-orange)](https://groq.com/)

</div>

InsightPilot is a production-ready AI analyst that transforms natural language questions into validated SQL queries, interactive visualizations, comprehensive insights, and executive-ready PDF reports.

## 🌟 Features

- **🤖 Agentic LangGraph Pipeline** – Deterministic tool-calling workflow (intent → schema → NL2SQL → execution → diagnostics → visualization → PDF)
- **📊 Advanced Analytics** – Automated trend detection and anomaly analysis with statistical insights
- **📄 PDF Report Generation** – Executive-ready reports with branded title pages, charts, and SQL appendix
- **📁 Multi-table Support** – Easy CSV upload and dataset catalog management
- **⚡ Real-time Streaming** – Live insights streamed to the UI as they're generated
- **🔍 Groq Llama-3 Powered** – Low-latency NL→SQL and narrative insight generation

## 🚀 Quick Start on Hugging Face Spaces

1. **Set Environment Variables** (Required)
   - Go to Settings → Repository Secrets
   - Add `GROQ_API_KEY` with your Groq API key ([Get one here](https://console.groq.com/))

2. **Upload Your Data** (Optional)
   - Use the "Upload Dataset" tab to add your CSV files
   - Or work with the pre-loaded sample sales dataset

3. **Ask Questions**
   - Use the Analytics Dashboard to ask natural language questions
   - Example: "What were the total sales by category last quarter?"
   - Get SQL, visualizations, insights, and downloadable PDF reports
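To make the upload step concrete, here is a minimal sketch of how a CSV could end up as a queryable SQLite table. It uses only the standard library and is not the project's actual upload code; the function name `load_csv_into_sqlite` and the choice to store every column as `TEXT` are assumptions made for illustration.

```python
import csv
import io
import re
import sqlite3


def load_csv_into_sqlite(conn, table_name, csv_text):
    """Create table_name from the CSV header and insert every data row as TEXT.

    Hypothetical helper for illustration; returns the number of rows inserted.
    """
    # Only accept plain identifiers so the table name is safe to embed in SQL.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table_name):
        raise ValueError("table name must be a plain identifier")

    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)

    # Quote column names so headers containing spaces still work.
    columns = ", ".join(
        '"{}" TEXT'.format(name.replace('"', '""')) for name in header
    )
    conn.execute(f'CREATE TABLE "{table_name}" ({columns})')

    # Skip rows whose length does not match the header.
    rows = [row for row in reader if len(row) == len(header)]
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{table_name}" VALUES ({placeholders})', rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
inserted = load_csv_into_sqlite(
    conn, "sales", "category,amount\nbooks,10\ntoys,5\n"
)
print(inserted, conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0])
```

A real loader would likely infer column types (for example with Pandas, which the project lists in its stack) rather than storing everything as text.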

## 🏗️ Architecture

| Component | Technology | Purpose |
|-----------|-----------|---------|
| **LLM Orchestration** | LangGraph + Groq Llama-3 70B | Deterministic agent workflow with tool calling |
| **API & Backend** | FastAPI + SQLAlchemy | RESTful API, database management |
| **Analytics** | Pandas, NumPy, SciPy | Trend detection, anomaly analysis |
| **Visualization** | Matplotlib, ReportLab | Charts and PDF report generation |
| **Database** | SQLite | Lightweight, persistent data storage |
| **Frontend** | React + Vite (optional) | Modern interactive dashboard |
| **Interface** | Gradio | HF Spaces integration |

## 📊 Advanced Analytics Modules

- **Trend Detection**: Time series regression analysis with slope quantification and % change metrics
- **Anomaly Detection**: Z-score based statistical outlier identification
- **Insight Generation**: Context-aware narrative summaries powered by Groq LLM

## 🛠️ Tech Stack

```
Backend:  FastAPI + LangGraph + LangChain + Groq
Data:     SQLite + SQLAlchemy + Pandas
Viz:      Matplotlib + ReportLab/Platypus
Frontend: React + Vite (embedded in Gradio)
Deploy:   Hugging Face Spaces (Gradio SDK)
```

## 📁 Project Structure

```
.
├── app.py                     # Gradio wrapper for HF Spaces
├── requirements.txt           # Python dependencies
├── backend/
│   ├── app/
│   │   ├── main.py            # FastAPI application
│   │   ├── agents/graph.py    # LangGraph workflow
│   │   ├── api/routes.py      # API endpoints
│   │   ├── core/config.py     # Settings & environment
│   │   ├── db/database.py     # Database engine & seeding
│   │   └── services/          # Analytics, PDF, CSV modules
│   ├── static/                # Generated charts & PDFs
│   └── requirements.txt       # Backend-specific deps
├── frontend/                  # React dashboard (optional)
└── data/                      # Sample datasets

```

## 🔑 Environment Variables

| Variable | Description | Required |
|----------|-------------|----------|
| `GROQ_API_KEY` | Groq API key for LLM access | ✅ Yes |
| `DATABASE_URL` | Database connection string | ⚪ Optional (defaults to SQLite) |
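As a rough illustration of how these two variables might be read at startup, the sketch below fails fast when the required key is missing and falls back to SQLite otherwise. The `Settings` class, `load_settings` function, and the default URL `sqlite:///./insightpilot.db` are assumptions for this example; the real values live in `core/config.py` and may differ.

```python
import os
from dataclasses import dataclass

# Hypothetical default; the table above only says the fallback is SQLite.
DEFAULT_DATABASE_URL = "sqlite:///./insightpilot.db"


@dataclass(frozen=True)
class Settings:
    groq_api_key: str
    database_url: str


def load_settings(env=None):
    """Build Settings from an environment mapping, failing fast if the key is missing."""
    env = os.environ if env is None else env
    api_key = env.get("GROQ_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("GROQ_API_KEY is not set; add it as a Space secret")
    return Settings(
        groq_api_key=api_key,
        # An unset or empty DATABASE_URL falls back to the SQLite default.
        database_url=env.get("DATABASE_URL") or DEFAULT_DATABASE_URL,
    )


settings = load_settings({"GROQ_API_KEY": "example-key"})
print(settings.database_url)
```

Passing the environment as a mapping keeps the function easy to test without touching the real process environment.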

## 📖 Usage Examples

**Question:** "What were the top 5 products by revenue last year?"

**InsightPilot will:**
1. ✅ Analyze your database schema
2. ✅ Generate optimized SQL query
3. ✅ Execute query and validate results
4. ✅ Create visualizations (bar charts, trends)
5. ✅ Perform trend & anomaly analysis
6. ✅ Generate narrative insights
7. ✅ Build downloadable PDF report
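Steps 2 and 3 above can be sketched as a guard that only lets a single `SELECT` statement reach the database. This is an assumption about what validation could look like, not the project's actual logic; `is_single_select` and `run_validated_query` are hypothetical names, and the SQL string stands in for what the LLM would generate.

```python
import sqlite3


def is_single_select(sql):
    """Conservative check: exactly one statement, and it starts with SELECT."""
    stripped = sql.strip().rstrip(";").strip()
    if not stripped or ";" in stripped:
        return False  # empty input, or more than one statement
    return stripped.split(None, 1)[0].upper() == "SELECT"


def run_validated_query(conn, sql, max_rows=1000):
    """Execute sql only if it passes the check; return (columns, rows)."""
    if not is_single_select(sql):
        raise ValueError("only single SELECT statements are allowed")
    cursor = conn.execute(sql)
    columns = [col[0] for col in cursor.description]
    rows = cursor.fetchmany(max_rows)  # cap the result size
    return columns, rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (category TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("books", 10.0), ("toys", 25.0), ("books", 5.0)],
)

# Stand-in for the SQL an LLM might generate from the question.
generated_sql = (
    "SELECT category, SUM(amount) AS total "
    "FROM sales GROUP BY category ORDER BY total DESC"
)
print(run_validated_query(conn, generated_sql))
```

The check is deliberately strict: it also rejects a query with a semicolon inside a string literal. A text filter like this is not a security boundary on its own; opening the SQLite file read-only (for example with a `file:...?mode=ro` URI and `uri=True`) is a stronger complement.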

## 🎯 Use Cases

- **Business Analytics**: Ad-hoc reporting without SQL knowledge
- **Executive Briefings**: Automated PDF reports with insights
- **Data Exploration**: Quick analysis of uploaded CSV datasets
- **Trend Analysis**: Automated time-series analytics
- **Anomaly Detection**: Statistical outlier identification

## 🚧 Limitations & Notes

- **Free HF Spaces**: CPU-only tier; suitable for moderate traffic
- **Database**: SQLite on the Space's local disk (50GB limit); data survives restarts only if persistent storage is enabled for the Space
- **File Cleanup**: Old PDFs/charts should be periodically removed
- **Concurrent Users**: May need rate limiting for production use
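The file cleanup note could be handled with a small periodic job along these lines. This is a sketch under assumptions: the function name, the seven-day retention window, and the `.pdf`/`.png` suffixes are illustrative, and the directory would be wherever generated charts and PDFs are written (`backend/static/` in the project structure).

```python
import time
from pathlib import Path


def remove_old_files(directory, max_age_days=7, suffixes=(".pdf", ".png")):
    """Delete matching files older than max_age_days; return the removed paths."""
    cutoff = time.time() - max_age_days * 86400
    removed = []
    for path in Path(directory).iterdir():
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue  # leave subdirectories and unrelated files alone
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Calling `remove_old_files("backend/static")` on a schedule, or at application startup, would keep the disk footprint bounded without touching recent reports.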

## 🔮 Future Enhancements

- Multi-tenant workspaces with authentication
- Postgres/Supabase adapter for production databases
- Real-time collaborative dashboards
- Forecast & prediction modules
- Custom visualization templates

## 📝 License

MIT License - see LICENSE file for details

## 🤝 Contributing

Contributions welcome! Please open an issue or submit a PR.

## 🔗 Links

- **Repository**: [GitHub](https://github.com/zenitsu0509/InsightPilot)
- **Documentation**: See original README in repo
- **Groq Platform**: [Get API Key](https://console.groq.com/)

---

**Built with ❤️ using LangGraph, FastAPI, and Groq**