Melika Kheirieh committed on
Commit df092a2 · 1 Parent(s): d4bc943

docs: add README

Files changed (1)
  1. README.md +142 -0
README.md ADDED
@@ -0,0 +1,142 @@
# NL2SQL Copilot - Prototype

A minimal **Text-to-SQL Copilot** built with **LangChain + Gradio**, designed to translate natural-language questions into **safe SQL** and run them on a **read-only SQLite** database.

> **Status:** Prototype (v0.1). This demonstrates structure and UX; advanced safety/verification pipelines are planned.

---

## ✨ Features (v0.1)
- Gradio UI for quick interactions
- Config-driven environment (dotenv)
- Pluggable LLM endpoint (proxy or direct OpenAI)
- SQLite **read-only** connection (no data mutation)

**Planned next:**
- Query planning and verification
- Safer SQL guardrails (AST / blocklist / dialect checks)
- Self-repair on failed queries
- Semantic cache and telemetry

---

## 📂 Project Structure
```
nl2sql-copilot-prototype/
├─ app.py
├─ config.py
├─ requirements.txt
├─ .env.example
├─ .gitignore
└─ README.md
```

---

## ⚙️ Requirements
- Python 3.10+
- A proxy/provider API key (OpenAI / custom proxy)
- SQLite DB file (uploaded via UI)

---

## 🔐 Environment Variables

Copy the example file and fill in your own values:

```bash
cp .env.example .env
```

`.env.example` (proxy-agnostic):
```bash
# ---- LLM provider or proxy (preferred) ----
PROXY_API_KEY="your-proxy-or-provider-api-key"
PROXY_BASE_URL="https://your-proxy-or-provider-base-url/v1"

# ---- Optional direct OpenAI fallback ----
#OPENAI_API_KEY="your-openai-api-key"
#OPENAI_BASE_URL="https://api.openai.com/v1"
```

`config.py` should read the `PROXY_*` values first and fall back to `OPENAI_*` if they are empty.
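
The fallback rule can be sketched as below. This is a minimal illustration, not the contents of the actual `config.py`, which this README does not show: the function name `resolve_llm_settings` is hypothetical, and it assumes the `.env` values have already been loaded into the process environment (for example with `load_dotenv()` from python-dotenv).

```python
import os


def resolve_llm_settings(env=None):
    """Return (api_key, base_url), preferring PROXY_* over OPENAI_*.

    Hypothetical helper: `env` is any mapping of variable names to values
    and defaults to the process environment.
    """
    env = os.environ if env is None else env
    for prefix in ("PROXY", "OPENAI"):
        api_key = (env.get(f"{prefix}_API_KEY") or "").strip()
        base_url = (env.get(f"{prefix}_BASE_URL") or "").strip()
        if api_key:
            # An empty base URL is reported as None so the caller can
            # decide which default endpoint to use.
            return api_key, base_url or None
    raise RuntimeError("Set PROXY_API_KEY or OPENAI_API_KEY in .env")
```

Passing a plain dictionary instead of `os.environ` keeps the selection logic easy to test without touching real keys.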

---

## 🧪 Local Quickstart

```bash
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
cp .env.example .env # then edit .env and add your keys
python app.py # open the Gradio link in browser
```

Upload a SQLite file and try a prompt like:
> “Top 5 customers by total orders in 2024.”

---

## 🧰 Safety Notes (Prototype)
- The database is opened in **read-only** mode, but you should still block multi-statement payloads and dangerous tokens (e.g., `ATTACH`, `PRAGMA`, `sqlite_master`, DDL/INSERT/UPDATE/DELETE).
- Consider an AST approach (e.g., `sqlglot`) for a stricter parse/allow-list.
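
As a rough illustration of the blocklist idea in the first note, the sketch below rejects multi-statement payloads and the tokens listed above using only the standard library. The names `BLOCKED` and `check_sql` are hypothetical and not part of this repository. The check is deliberately coarse: for example, it also rejects a semicolon inside a string literal, which is one reason an AST-based allow-list is the stricter option.

```python
import re

# Hypothetical blocklist built from the tokens named in the notes above.
BLOCKED = re.compile(
    r"\b(attach|detach|pragma|sqlite_master|create|drop|alter|"
    r"insert|update|delete)\b",
    re.IGNORECASE,
)


def check_sql(sql: str) -> str:
    """Return the cleaned statement if it passes, else raise ValueError."""
    # Drop surrounding whitespace and any trailing semicolons.
    statement = sql.strip().rstrip(";").strip()
    if not statement:
        raise ValueError("empty query")
    # Any remaining semicolon means more than one statement (or a literal
    # containing one, which this coarse check also rejects).
    if ";" in statement:
        raise ValueError("multiple statements are not allowed")
    if not re.match(r"(select|with)\b", statement, re.IGNORECASE):
        raise ValueError("only SELECT queries are allowed")
    match = BLOCKED.search(statement)
    if match:
        raise ValueError(f"blocked token: {match.group(0)}")
    return statement
```

Calling `check_sql` on the generated SQL before execution keeps the guard independent of the LLM prompt.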

---

## ☁️ Deploy to Hugging Face Spaces (Gradio)

### 1) Create a new Space
- Go to Hugging Face → Spaces → **New Space**
- **Name:** `nl2sql-copilot-prototype`
- **Space SDK:** Gradio
- **Hardware:** CPU Basic
- **Visibility:** Public (or Private)

### 2) Add project files
Commit and push these files to the Space repo:
- `app.py`, `config.py`, `requirements.txt`, `.env.example`, `README.md`, `.gitignore`

### 3) Set Secrets (Variables and secrets)
In Space → **Settings → Variables and secrets**:
- `PROXY_API_KEY`: your real key
- `PROXY_BASE_URL`: e.g., `https://.../v1`
- (Optional) `OPENAI_API_KEY` and `OPENAI_BASE_URL`

> Do **not** commit a real `.env`. Use Space **Secrets**.

### 4) Build & Run
- Spaces installs dependencies from `requirements.txt` automatically.
- If the app does not start automatically, set **App file: app.py**, SDK: **Gradio**, Python: **3.10+**.

### 5) Test
- Open the Space URL
- Upload a small sample SQLite DB
- Check the **Logs** tab for errors

**Persistence note:** Uploads are ephemeral; include a tiny demo DB in the repo if needed.
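
One way to produce such a demo DB is a small script like the following. It is only a sketch: the function name `build_demo_db`, the file name `demo.db`, and the `customers`/`orders` schema are all hypothetical, chosen to fit the example prompt about top customers by orders in 2024.

```python
import sqlite3


def build_demo_db(path="demo.db"):
    """Create a tiny sample database at `path` and return the path."""
    conn = sqlite3.connect(path)
    try:
        # executescript runs several statements in one call; the tables
        # are dropped first so the script can be re-run safely.
        conn.executescript(
            """
            DROP TABLE IF EXISTS orders;
            DROP TABLE IF EXISTS customers;
            CREATE TABLE customers (
                id INTEGER PRIMARY KEY,
                name TEXT NOT NULL
            );
            CREATE TABLE orders (
                id INTEGER PRIMARY KEY,
                customer_id INTEGER NOT NULL REFERENCES customers(id),
                order_date TEXT NOT NULL,
                total REAL NOT NULL
            );
            INSERT INTO customers (id, name) VALUES
                (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
            INSERT INTO orders (customer_id, order_date, total) VALUES
                (1, '2024-01-15', 120.0),
                (1, '2024-03-02', 80.5),
                (2, '2024-02-20', 42.0),
                (3, '2023-12-30', 15.0);
            """
        )
        conn.commit()
    finally:
        conn.close()
    return path


if __name__ == "__main__":
    print(f"Wrote {build_demo_db()}")
```

Committing the resulting file to the Space repo gives visitors something to query without uploading their own data.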

---

## 🧭 Usage Tips
- Prefer concise prompts (e.g., “Show avg price by category for 2023”).
- If a query fails, rephrase or reduce columns.
- For bigger DBs, add a schema introspection step or a “Describe tables” helper.
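
A “Describe tables” helper could look roughly like this. The function name `describe_tables` is hypothetical, and the sketch assumes the helper runs inside the app itself (so it may read `sqlite_master` and use `PRAGMA table_info`, even though those tokens are blocked in generated queries). Its output could be added to the LLM prompt as schema context.

```python
import sqlite3


def describe_tables(conn):
    """Return {table_name: [(column_name, declared_type), ...]}."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
    ]
    schema = {}
    for table in tables:
        # Quote the identifier; table names come from the uploaded file.
        quoted = '"' + table.replace('"', '""') + '"'
        columns = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
        # table_info rows are (cid, name, type, notnull, dflt_value, pk).
        schema[table] = [(col[1], col[2]) for col in columns]
    return schema
```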

---

## 🛡️ Security & Privacy
- Never log raw API keys.
- Keep `.env` out of Git; commit only `.env.example`.
- Enforce read-only and block multi-statement SQL.
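
For the read-only requirement, Python's built-in `sqlite3` module can open a database through a URI with `mode=ro`. The sketch below shows one way to do that; the function name `open_read_only` is hypothetical and this is not necessarily how `app.py` opens the uploaded file.

```python
import sqlite3
from pathlib import Path


def open_read_only(db_path):
    """Open an existing SQLite file so that writes fail at the engine level."""
    # as_uri() yields a percent-encoded file: URI for the absolute path;
    # mode=ro asks SQLite to open the file read-only.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)
```

With a connection opened this way, an `INSERT`, `UPDATE`, or `DELETE` against the file raises `sqlite3.OperationalError` instead of changing data.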

---

## 🗺️ Roadmap
- [ ] Planner → Generator → Safety → Executor → Verifier loop
- [ ] AST-based guardrails (sqlglot)
- [ ] Self-repair on DB/SQL errors
- [ ] Semantic cache + telemetry
- [ ] Streamlit / FastAPI variants