Melika Kheirieh committed on
Commit 602cae0 · 1 Parent(s): ba06dd4

docs(readme): add evolution note and improve intro narrative

Files changed (1)
  1. README.md +138 -38
README.md CHANGED
@@ -1,7 +1,18 @@
  # 🧩 NL2SQL Copilot
 
- A modular **Text-to-SQL Copilot** that converts natural language questions into safe and verified SQL queries.
- Built with **FastAPI**, **LangGraph**, and **SQLAlchemy**, designed for read-only databases and evaluation on Spider/Dr.Spider benchmarks.
 
  ---
 
@@ -21,6 +32,52 @@ docker run --rm -p 8000:8000 nl2sql-copilot
  ```
 
  Then open [http://localhost:8000/docs](http://localhost:8000/docs) 🚀
 
  ---
 
@@ -30,69 +87,112 @@ Then open [http://localhost:8000/docs](http://localhost:8000/docs) 🚀
  nl2sql-copilot/
  │
  ├── app/ # FastAPI app, routers, schemas
- ├── nl2sql/ # Core pipeline (planner → generator → safety → executor → verifier)
  ├── adapters/ # Database and LLM adapters
  ├── benchmarks/ # Evaluation scripts and results
- ├── ui/ # Streamlit dashboard
  │
  ├── Dockerfile
- ├── requirements.in
- ├── requirements.txt
  └── README.md
  ```
 
  ---
 
- ## 🧪 Development
-
- ### Install dependencies
 
- (Recommended: Python 3.12+ and virtualenv)
 
- ```bash
- pip install -r requirements.txt
  ```
 
- ### Run tests
 
- ```bash
- pytest -q
- ```
 
- ### Lint and type-check
 
- ```bash
- ruff check .
- mypy .
- ```
 
  ---
 
- ## 🧠 Features
 
- * ✅ Modular multi-stage pipeline (Planner → Generator → Safety → Executor → Verifier → Repair)
- * 🛡️ SQL safety filters (SELECT-only, forbidden keywords)
- * 🔁 Self-repair loop on failed executions
- * 📊 Streamlit benchmark dashboard (latency, accuracy, cost)
- * 🧩 PostgreSQL + SQLite adapters
- * 🧠 Powered by `pydantic-ai` and `LangGraph`
 
  ---
 
  ## 🧰 Tech Stack
 
- | Layer            | Tools                                   |
- | ---------------- | --------------------------------------- |
- | Backend API      | FastAPI, Uvicorn                        |
- | Pipeline Core    | Python 3.12, Pydantic, SQLGlot          |
- | LLM Interface    | pydantic-ai (OpenAI, Anthropic, Ollama) |
- | Database         | SQLite (default), PostgreSQL            |
- | Evaluation       | Spider / Dr.Spider                      |
- | UI               | Streamlit + Plotly                      |
- | Containerization | Docker / Docker Compose                 |
 
  ---
 
  ## 📄 License
 
- MIT © 2025 Melika Kheirieh
 
  # 🧩 NL2SQL Copilot
 
+ A modular **Text-to-SQL Copilot** that converts natural-language questions into **safe, verified SQL queries**.
+ Built with **FastAPI**, **LangGraph**, and **SQLAlchemy**, designed for **read-only databases** and benchmarked on the **Spider** and **Dr.Spider** datasets.
+
+ ---
+
+ > 💡 **Why it matters**
+ > In real analytics teams, analysts often need quick insights without writing SQL.
+ > **NL2SQL Copilot** bridges that gap by translating plain-English questions into validated, read-only SQL, reducing query errors and saving hours of analyst time.
+ >
+ > 🧬 **Evolution Note**
+ > This repository is the next-generation version of [NL2SQL Copilot Prototype](https://github.com/melika-kheirieh/nl2sql-copilot-prototype).
+ > It refactors the original prototype into a **production-grade, modular architecture**,
+ > adding configuration-driven pipelines, safety layers, benchmarks, and a Streamlit UI for evaluation.
 
  ---
 
 
  ```
 
  Then open [http://localhost:8000/docs](http://localhost:8000/docs) 🚀
+ Or launch the [Streamlit Demo](http://localhost:7860) to test it interactively.
+
+ ---
+
+ ## 🧠 Demo
+
+ 🎯 **Live Demo:** [Try it on Hugging Face Spaces →](https://huggingface.co/spaces/melika-kheirieh/nl2sql-copilot)
+
+ Ask a question in plain English; the Copilot plans, generates, verifies, and safely executes an SQL query.
+
+ **User Query**
+
+ > show top 5 albums by total sales
+
+ **Generated SQL**
+
+ ```sql
+ SELECT albums.Title, SUM(invoice_items.UnitPrice * invoice_items.Quantity) AS total_sales
+ FROM albums
+ JOIN tracks ON albums.AlbumId = tracks.AlbumId
+ JOIN invoice_items ON invoice_items.TrackId = tracks.TrackId
+ GROUP BY albums.Title
+ ORDER BY total_sales DESC
+ LIMIT 5;
+ ```
+
+ **Execution Result (preview)**
+
+ | Album             | Total Sales |
+ | ----------------- | ----------- |
+ | Greatest Hits     | 155.34      |
+ | Let There Be Rock | 133.09      |
+ | Big Ones          | 128.44      |
+
+ **Trace**
+
+ ```json
+ [
+   {"stage": "planner", "duration_ms": 38, "summary": "Identified SQL intent"},
+   {"stage": "generator", "duration_ms": 201, "summary": "LLM generated SQL"},
+   {"stage": "safety", "duration_ms": 6, "summary": "Validated SELECT-only"},
+   {"stage": "executor", "duration_ms": 92, "summary": "Executed successfully"}
+ ]
+ ```
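
A trace in this shape is straightforward to post-process. The sketch below assumes only the three fields shown in the example (`stage`, `duration_ms`, `summary`) and reports the total latency and the slowest stage. `summarize_trace` is an illustrative helper name, not a function from the repository.

```python
import json

# Trace copied from the example above.
TRACE_JSON = """
[
  {"stage": "planner", "duration_ms": 38, "summary": "Identified SQL intent"},
  {"stage": "generator", "duration_ms": 201, "summary": "LLM generated SQL"},
  {"stage": "safety", "duration_ms": 6, "summary": "Validated SELECT-only"},
  {"stage": "executor", "duration_ms": 92, "summary": "Executed successfully"}
]
"""


def summarize_trace(trace_json: str) -> dict:
    """Return the total latency and the slowest stage of a pipeline trace."""
    stages = json.loads(trace_json)
    slowest = max(stages, key=lambda s: s["duration_ms"])
    return {
        "total_ms": sum(s["duration_ms"] for s in stages),
        "slowest_stage": slowest["stage"],
    }


summary = summarize_trace(TRACE_JSON)
print(summary)  # {'total_ms': 337, 'slowest_stage': 'generator'}
```

For this example the generator (the LLM call) accounts for most of the 337 ms.
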
+
+ ![Demo Screenshot](docs/demo-screenshot.png)
 
  ---
 
 
  nl2sql-copilot/
  │
  ├── app/ # FastAPI app, routers, schemas
+ ├── nl2sql/ # Core pipeline (Planner → Generator → Safety → Executor → Verifier)
  ├── adapters/ # Database and LLM adapters
  ├── benchmarks/ # Evaluation scripts and results
+ ├── ui/ # Streamlit dashboard (demo + benchmark)
+ ├── configs/ # Pipeline configs (YAML-based)
  │
  ├── Dockerfile
+ ├── requirements.in / .txt
  └── README.md
  ```
 
  ---
 
+ ## ⚙️ How It Works
 
+ The Copilot runs a **multi-stage pipeline** designed to keep every SQL query both correct and safe:
 
+ ```
+ Natural Language
+ ↓
+ [ Planner ] → [ Generator (LLM) ] → [ Safety ] → [ Executor ] → [ Verifier ] → [ Repair ]
  ```
 
+ Each stage is modular and configurable via `configs/pipeline.yaml`.
+ All queries execute inside a **read-only sandbox**.
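
One way to picture the stage chain is as a list of small functions that share a context object and record a trace entry each. The following is a minimal sketch under that assumption, not the repository's implementation: `Context`, the stage functions, and `run_pipeline` are hypothetical names, the generator returns a fixed query instead of calling an LLM, the verifier is a placeholder non-empty check, and the Repair stage is left out for brevity.

```python
import sqlite3
from dataclasses import dataclass, field
from time import perf_counter
from typing import Callable


@dataclass
class Context:
    """State handed from stage to stage (hypothetical shape)."""
    question: str
    sql: str = ""
    rows: list = field(default_factory=list)
    trace: list = field(default_factory=list)


def planner(ctx: Context) -> str:
    return "Identified SQL intent"


def generator(ctx: Context) -> str:
    # Stand-in for the LLM call: always produces the same query.
    ctx.sql = "SELECT 1 AS answer"
    return "Generated SQL"


def safety(ctx: Context) -> str:
    if not ctx.sql.lstrip().upper().startswith("SELECT"):
        raise ValueError("only SELECT statements are allowed")
    return "Validated SELECT-only"


def executor(ctx: Context) -> str:
    # An in-memory SQLite database keeps the sketch self-contained.
    conn = sqlite3.connect(":memory:")
    try:
        ctx.rows = conn.execute(ctx.sql).fetchall()
    finally:
        conn.close()
    return "Executed successfully"


def verifier(ctx: Context) -> str:
    # Placeholder check; a real verifier would do more.
    if not ctx.rows:
        raise ValueError("query returned no rows")
    return "Result is non-empty"


STAGES: list[tuple[str, Callable[[Context], str]]] = [
    ("planner", planner),
    ("generator", generator),
    ("safety", safety),
    ("executor", executor),
    ("verifier", verifier),
]


def run_pipeline(question: str) -> Context:
    """Run every stage in order, timing each one into ctx.trace."""
    ctx = Context(question=question)
    for name, stage in STAGES:
        start = perf_counter()
        summary = stage(ctx)
        elapsed_ms = round((perf_counter() - start) * 1000)
        ctx.trace.append({"stage": name, "duration_ms": elapsed_ms, "summary": summary})
    return ctx


result = run_pipeline("what is one?")
print(result.rows)
print([step["stage"] for step in result.trace])
```

Because each stage only reads and writes the shared context, a stage can be swapped or reordered by editing the `STAGES` list, which is roughly what a config-driven pipeline would do from a YAML file.
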
 
+ ---
 
+ ## 🔒 Safety Layer
 
+ Before execution, every SQL statement is validated:
+
+ | Rule               | Example Blocked               |
+ | ------------------ | ----------------------------- |
+ | DML not allowed    | `DELETE FROM users`           |
+ | Multi-statement    | `SELECT *; DROP TABLE users`  |
+ | Forbidden keywords | `ALTER`, `TRUNCATE`, `UPDATE` |
+
+ ✅ Only safe, single-statement `SELECT` queries are executed.
+
+ ---
+
+ ## 📊 Benchmark (sample)
+
+ Evaluated on a subset of the [Spider](https://yale-lily.github.io/spider) dataset using `gpt-4o-mini`:
+
+ | Query                       | Type          | Correct | Latency (ms) | Model       |
+ | --------------------------- | ------------- | ------- | ------------ | ----------- |
+ | list all artists            | simple select | ✅      | 118          | gpt-4o-mini |
+ | total invoices per country  | aggregation   | ✅      | 127          | gpt-4o-mini |
+ | top 3 customers by spending | aggregation   | ✅      | 141          | gpt-4o-mini |
+ | albums released before 2000 | filter        | ✅      | 122          | gpt-4o-mini |
+ | top 5 sales by genre        | join          | ✅      | 149          | gpt-4o-mini |
+
+ *(see `benchmarks/results.csv` for detailed results)*
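
Summary figures such as accuracy and mean latency can be derived from rows like these. The sketch below uses the five sample rows from the table; the column names (`query`, `type`, `correct`, `latency_ms`) are assumptions for illustration and may not match the actual layout of `benchmarks/results.csv`.

```python
import csv
import io

# Rows copied from the sample table; the header names are assumed.
SAMPLE_CSV = """query,type,correct,latency_ms
list all artists,simple select,1,118
total invoices per country,aggregation,1,127
top 3 customers by spending,aggregation,1,141
albums released before 2000,filter,1,122
top 5 sales by genre,join,1,149
"""


def summarize(csv_text: str) -> dict:
    """Compute row count, accuracy, and mean latency from benchmark rows."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    n = len(rows)
    return {
        "n": n,
        "accuracy": sum(int(r["correct"]) for r in rows) / n,
        "mean_latency_ms": sum(int(r["latency_ms"]) for r in rows) / n,
    }


stats = summarize(SAMPLE_CSV)
print(stats)
```

To run it against a real file, pass the file's contents (for example `Path(...).read_text()`) instead of `SAMPLE_CSV`, after adjusting the column names.
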
 
  ---
 
+ ## 🧩 Key Features
 
+ * ✅ **Modular pipeline** (Planner → Generator → Safety → Executor → Verifier → Repair)
+ * 🛡️ **SQL safety filters** (SELECT-only, blacklist, AST validation)
+ * 🔁 **Self-repair loop** for failed executions
+ * 🧠 **LLM-driven generator** (OpenAI / Ollama / Anthropic)
+ * 📊 **Evaluation toolkit** for latency / accuracy / cost
+ * ⚙️ **Config-driven architecture** (`Pipeline.from_config("configs/pipeline.yaml")`)
+ * 🧰 **PostgreSQL + SQLite adapters**
+ * 🎛️ **Streamlit UI** for interactive demo & benchmark
+ * 🧩 Built with **FastAPI**, **LangGraph**, **Pydantic-AI**, **SQLAlchemy**
 
  ---
 
  ## 🧰 Tech Stack
 
+ | Layer         | Tools / Libraries                         |
+ | ------------- | ----------------------------------------- |
+ | Backend API   | FastAPI, Uvicorn                          |
+ | Pipeline Core | Python 3.12, Pydantic, SQLGlot            |
+ | LLM Interface | Pydantic-AI (OpenAI / Anthropic / Ollama) |
+ | Database      | SQLite (default), PostgreSQL              |
+ | Evaluation    | Spider / Dr.Spider                        |
+ | UI            | Streamlit + Plotly                        |
+ | CI/CD         | GitHub Actions, Makefile, Docker          |
+
+ ---
+
+ ## 🧪 Development
+
+ ```bash
+ pip install -r requirements.txt
+ pytest -q
+ ruff check .
+ mypy .
+ ```
+
+ ---
+
+ ## 🧭 Roadmap
+
+ * [ ] Add multilingual query support (Persian / English)
+ * [ ] Improve self-repair accuracy
+ * [ ] Add cost tracking per query
+ * [ ] Integrate Prometheus metrics
 
  ---
 
  ## 📄 License
 
+ MIT © 2025 [Melika Kheirieh](https://github.com/melika-kheirieh)