|
|
--- |
|
|
title: Autonomous SQL Agent |
|
|
emoji: π¬ |
|
|
colorFrom: yellow |
|
|
colorTo: purple |
|
|
sdk: gradio |
|
|
sdk_version: 5.42.0 |
|
|
app_file: app.py |
|
|
pinned: false |
|
|
license: mit |
|
|
short_description: 'An autonomous SQL agent, based on Qwen 2.5 (fine-tuned). ' |
|
|
hf_oauth: true |
|
|
hf_oauth_scopes: |
|
|
- inference-api |
|
|
models: |
|
|
- manuelaschrittwieser/Qwen2.5-1.5B-SQL-Assistant-Prod |
|
|
- Qwen/Qwen2.5-1.5B-Instruct |
|
|
tags: |
|
|
- agent |
|
|
- sql |
|
|
- text-to-sql |
|
|
- qwen |
|
|
- qlora |
|
|
--- |
|
|
|
|
|
# Autonomous SQL Assistant Agent |
|
|
|
|
|
## π System Overview |
|
|
|
|
|
The **Autonomous SQL Assistant** is a demonstrative AI agent designed to bridge the gap between natural language inquiries and database execution. Unlike standard "Text-to-SQL" generators that strictly output code, this agent operates within a closed-loop environment: it **generates** syntax, **executes** it against a live database, and **retrieves** the actual data for the user. |
|
|
|
|
|
The system is powered by **Qwen 2.5 (1.5B)**, fine-tuned via **QLoRA** on the `b-mc2/sql-create-context` dataset to ensure high fidelity in SQL syntax generation. |
|
|
|
|
|
**[π View Source Code & Documentation](https://github.com/MANU-de/Autonomous-SQL-Agent)** |
|
|
|
|
|
--- |
|
|
|
|
|
## ποΈ Technical Architecture |
|
|
|
|
|
The application runs on a lightweight CPU environment and consists of three core components: |
|
|
|
|
|
### 1. The Inference Engine |
|
|
* **Model:** [manuelaschrittwieser/Qwen2.5-1.5B-SQL-Assistant-Prod](https://huggingface.co/manuelaschrittwieser/Qwen2.5-SQL-Assistant-Prod) |
|
|
* **Optimization:** The model runs in full FP32 precision (CPU optimized). |
|
|
* **Role:** Translates user intent (e.g., *"Who earns the most?"*) into executable SQLite syntax, utilizing the provided schema context. |
|
|
|
|
|
### 2. The Execution Sandbox |
|
|
* **Database:** A transient **SQLite** instance. |
|
|
* **Schema:** `employees` (id, name, department, salary, hire_date). |
|
|
* **Lifecycle:** The database is re-instantiated upon every application restart/build to ensure a clean state for testing. |
|
|
|
|
|
### 3. The Agent Logic |
|
|
The `SQLAgent` class orchestrates the workflow: |
|
|
1. **Ingest:** Receives natural language prompt. |
|
|
2. **Contextualize:** Injects the `CREATE TABLE` schema into the system prompt. |
|
|
3. **Generate:** produces the SQL query. |
|
|
4. **Act:** Connects to the SQLite cursor, executes the query, and fetches results. |
|
|
5. **Sanitize:** Catches execution errors (e.g., syntax errors) and reports them for debugging. |
|
|
|
|
|
--- |
|
|
|
|
|
## π» Usage Instructions |
|
|
|
|
|
### Interface Guide |
|
|
The interface is a chat-based UI. You act as the user querying the HR database. |
|
|
|
|
|
* **Input:** Type natural language questions regarding the `employees` table. |
|
|
* **Output:** The agent provides a two-part response: |
|
|
1. **"Brain" (Internal Monologue):** The generated SQL query. |
|
|
2. **"Result" (Data):** The raw tuples returned from the database. |
|
|
|
|
|
### Example Queries |
|
|
Try copying these prompts to test the agent's capabilities: |
|
|
|
|
|
| Complexity | Query | |
|
|
| :--- | :--- | |
|
|
| **Simple** | *Show me the names of all employees in Sales.* | |
|
|
| **Conditional** | *Who earns more than 60000?* | |
|
|
| **Aggregation** | *Count how many employees work in the Engineering department.* | |
|
|
| **Logic** | *List employees hired after 2020.* | |
|
|
|
|
|
--- |
|
|
|
|
|
## βοΈ Local Reproduction |
|
|
|
|
|
To run this Space locally on your machine (requires Python 3.10+): |
|
|
|
|
|
1. **Clone the Repository:** |
|
|
```bash |
|
|
git clone https://huggingface.co/spaces/manuelaschrittwieser/sql-assistant-prod |
|
|
cd sql-assistant-prod |
|
|
``` |
|
|
|
|
|
2. **Install Dependencies:** |
|
|
```bash |
|
|
pip install -r requirements.txt |
|
|
``` |
|
|
|
|
|
3. **Launch Application:** |
|
|
```bash |
|
|
python app.py |
|
|
``` |
|
|
|
|
|
--- |
|
|
|
|
|
## β οΈ Limitations & Scope |
|
|
|
|
|
* **Inference Latency:** As this demo runs on **CPU Basic** hardware, generating the SQL query may take 2-10 seconds depending on server load. |
|
|
* **Sandbox Restrictions:** Database modifications (INSERT/DROP) are possible but will persist only until the application restarts. |
|
|
* **Hallucinations:** While fine-tuned, the model may occasionally generate invalid SQL for highly complex queries not covered in the training distribution. |
|
|
|
|
|
--- |
|
|
|
|
|
## π License |
|
|
|
|
|
This project is open-source and available under the **MIT License**. |