Spaces:

manuelaschrittwieser
/

SQL-Assistant-Prod

Running

App Files Files Community

SQL-Assistant-Prod / README.md

manuelaschrittwieser

Update README.md

392b5ad verified about 1 month ago

preview code

raw

history blame contribute delete

4.12 kB

	---
	title: Autonomous SQL Agent
	emoji: 💬
	colorFrom: yellow
	colorTo: purple
	sdk: gradio
	sdk_version: 5.42.0
	app_file: app.py
	pinned: false
	license: mit
	short_description: 'An autonomous SQL agent, based on Qwen 2.5 (fine-tuned). '
	hf_oauth: true
	hf_oauth_scopes:
	- inference-api
	models:
	- manuelaschrittwieser/Qwen2.5-1.5B-SQL-Assistant-Prod
	- Qwen/Qwen2.5-1.5B-Instruct
	tags:
	- agent
	- sql
	- text-to-sql
	- qwen
	- qlora
	---

	# Autonomous SQL Assistant Agent

	## 📋 System Overview

	The Autonomous SQL Assistant is a demonstrative AI agent designed to bridge the gap between natural language inquiries and database execution. Unlike standard "Text-to-SQL" generators that strictly output code, this agent operates within a closed-loop environment: it generates syntax, executes it against a live database, and retrieves the actual data for the user.

	The system is powered by Qwen 2.5 (1.5B), fine-tuned via QLoRA on the `b-mc2/sql-create-context` dataset to ensure high fidelity in SQL syntax generation.

	[🔗 View Source Code & Documentation](https://github.com/MANU-de/Autonomous-SQL-Agent)

	---

	## 🏗️ Technical Architecture

	The application runs on a lightweight CPU environment and consists of three core components:

	### 1. The Inference Engine
	* Model: [manuelaschrittwieser/Qwen2.5-1.5B-SQL-Assistant-Prod](https://huggingface.co/manuelaschrittwieser/Qwen2.5-SQL-Assistant-Prod)
	* Optimization: The model runs in full FP32 precision (CPU optimized).
	* Role: Translates user intent (e.g., "Who earns the most?") into executable SQLite syntax, utilizing the provided schema context.

	### 2. The Execution Sandbox
	* Database: A transient SQLite instance.
	* Schema: `employees` (id, name, department, salary, hire_date).
	* Lifecycle: The database is re-instantiated upon every application restart/build to ensure a clean state for testing.

	### 3. The Agent Logic
	The `SQLAgent` class orchestrates the workflow:
	1. Ingest: Receives natural language prompt.
	2. Contextualize: Injects the `CREATE TABLE` schema into the system prompt.
	3. Generate: produces the SQL query.
	4. Act: Connects to the SQLite cursor, executes the query, and fetches results.
	5. Sanitize: Catches execution errors (e.g., syntax errors) and reports them for debugging.

	---

	## 💻 Usage Instructions

	### Interface Guide
	The interface is a chat-based UI. You act as the user querying the HR database.

	* Input: Type natural language questions regarding the `employees` table.
	* Output: The agent provides a two-part response:
	1. "Brain" (Internal Monologue): The generated SQL query.
	2. "Result" (Data): The raw tuples returned from the database.

	### Example Queries
	Try copying these prompts to test the agent's capabilities:

	\| Complexity \| Query \|
	\| :--- \| :--- \|
	\| Simple \| Show me the names of all employees in Sales. \|
	\| Conditional \| Who earns more than 60000? \|
	\| Aggregation \| Count how many employees work in the Engineering department. \|
	\| Logic \| List employees hired after 2020. \|

	---

	## ⚙️ Local Reproduction

	To run this Space locally on your machine (requires Python 3.10+):

	1. Clone the Repository:
	```bash
	git clone https://huggingface.co/spaces/manuelaschrittwieser/sql-assistant-prod
	cd sql-assistant-prod
	```

	2. Install Dependencies:
	```bash
	pip install -r requirements.txt
	```

	3. Launch Application:
	```bash
	python app.py
	```

	---

	## ⚠️ Limitations & Scope

	* Inference Latency: As this demo runs on CPU Basic hardware, generating the SQL query may take 2-10 seconds depending on server load.
	* Sandbox Restrictions: Database modifications (INSERT/DROP) are possible but will persist only until the application restarts.
	* Hallucinations: While fine-tuned, the model may occasionally generate invalid SQL for highly complex queries not covered in the training distribution.

	---

	## 📜 License

	This project is open-source and available under the MIT License.