	---
	title: SQL Arena
	emoji: 🏟️
	colorFrom: blue
	colorTo: purple
	sdk: docker
	pinned: false
	---

	# SQL Arena - OpenEnv Environment

	An interactive SQL challenge environment in which AI agents learn to write SQL
	by iteratively querying databases and receiving execution feedback with partial-credit scoring.

	## Real-World Utility

	Text-to-SQL is a practical, widely needed capability for AI agents:
	- Data analysts, business users, and developers write SQL daily
	- The task exercises reasoning, schema understanding, and query composition
	- The skill applies directly to production AI assistants and copilots

	## Tasks

	| Task | Difficulty | Description | Max Steps |
	|------|-----------|-------------|-----------|
	| basic_select | Easy | SELECT, WHERE, ORDER BY | 5 |
	| join_aggregate | Medium | JOINs, GROUP BY, HAVING | 7 |
	| complex_analysis | Hard | CTEs, window functions | 10 |

	Each difficulty level has 3 unique problems (9 in total), all graded deterministically.
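
To make the three tiers concrete, here is a minimal sketch of one query per difficulty level, run against an in-memory SQLite database. The `employees` and `departments` tables and their rows are invented for illustration and are not the environment's actual schema or problems.

```python
import sqlite3

# Hypothetical schema and data, invented for this example only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE employees (
    id INTEGER PRIMARY KEY, name TEXT, salary INTEGER, dept_id INTEGER
);
INSERT INTO departments VALUES (1, 'Engineering'), (2, 'Sales');
INSERT INTO employees VALUES
    (1, 'Ada', 120000, 1), (2, 'Bo', 90000, 1),
    (3, 'Cy', 70000, 2), (4, 'Di', 85000, 2);
""")

# Easy tier: SELECT, WHERE, ORDER BY
easy = conn.execute(
    "SELECT name, salary FROM employees "
    "WHERE salary > 80000 ORDER BY salary DESC"
).fetchall()

# Medium tier: JOIN, GROUP BY, HAVING
medium = conn.execute("""
    SELECT d.name, AVG(e.salary) AS avg_salary
    FROM employees e JOIN departments d ON d.id = e.dept_id
    GROUP BY d.name
    HAVING AVG(e.salary) > 80000
""").fetchall()

# Hard tier: a CTE plus a window function (requires SQLite 3.25 or newer)
hard = conn.execute("""
    WITH ranked AS (
        SELECT name, dept_id,
               RANK() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS rnk
        FROM employees
    )
    SELECT name FROM ranked WHERE rnk = 1 ORDER BY name
""").fetchall()

print(easy)
print(medium)
print(hard)
```

Each query returns a list of row tuples, which is the kind of result a grader could compare against an expected answer.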

	## Action Space

	The agent sends a SQL query each step:

	{"sql_query": "SELECT name, salary FROM employees WHERE salary > 80000"}

	## Observation Space

	The agent receives back:

	- schema_description: Database schema text
	- question: Natural language question to answer
	- query_result: Result table from last query
	- error_message: Error if query failed
	- feedback: Scoring feedback with hints
	- expected_columns: Expected column names
	- attempts_remaining: Steps left
	- difficulty: Task difficulty level
	- task_id: Problem identifier
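
A client may want to hold these fields in a typed container. The sketch below uses a standard-library dataclass. The field names come from the list above, but the types, the choice of which fields are optional, and the helper `parse_observation` are assumptions; the project's own Pydantic models in `models.py` may differ.

```python
from __future__ import annotations

from dataclasses import dataclass, fields
from typing import Any, Optional


@dataclass
class Observation:
    """Illustrative container. Field names follow the README; types are assumed."""
    schema_description: str
    question: str
    difficulty: str
    task_id: str
    attempts_remaining: int
    expected_columns: list[str]
    # Assumed to be absent before the first query or when not applicable.
    query_result: Optional[str] = None
    error_message: Optional[str] = None
    feedback: Optional[str] = None


def parse_observation(payload: dict[str, Any]) -> Observation:
    """Build an Observation from a decoded JSON object, ignoring unknown keys."""
    known = {f.name for f in fields(Observation)}
    return Observation(**{k: v for k, v in payload.items() if k in known})


# Example payload with made-up values.
example = parse_observation({
    "schema_description": "employees(id, name, salary)",
    "question": "Which employees earn more than 80000?",
    "difficulty": "easy",
    "task_id": "basic_select",
    "attempts_remaining": 4,
    "expected_columns": ["name", "salary"],
    "unexpected_key": "ignored",
})
```

Dropping unknown keys keeps the client tolerant of extra fields the server might send.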

	## Reward Function (0.0 to 1.0)

	| Component | Weight | Description |
	|-----------|--------|-------------|
	| Execution | 0.10 | Query runs without error |
	| Columns | 0.20 | Correct column names |
	| Row Count | 0.20 | Correct number of rows |
	| Values | 0.50 | Correct data values |
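
The weights above sum to 1.0, so the reward can be read as a weighted sum of per-component scores. The sketch below shows that arithmetic. It assumes each component yields a score between 0 and 1; how `graders.py` actually computes each component is not specified here, and the names `WEIGHTS` and `combine_reward` are hypothetical.

```python
# Weights copied from the table above.
WEIGHTS = {
    "execution": 0.10,
    "columns": 0.20,
    "row_count": 0.20,
    "values": 0.50,
}


def combine_reward(scores: dict[str, float]) -> float:
    """Weighted sum of component scores, each clamped to [0, 1].

    Missing components count as 0. The result is rounded to 4 decimal
    places to hide floating-point noise.
    """
    total = 0.0
    for name, weight in WEIGHTS.items():
        score = min(1.0, max(0.0, scores.get(name, 0.0)))
        total += weight * score
    return round(total, 4)


# A query that runs, has the right columns and row count,
# but gets only half of the values right.
partial = combine_reward(
    {"execution": 1.0, "columns": 1.0, "row_count": 1.0, "values": 0.5}
)
print(partial)
```

In this example the reward is 0.10 + 0.20 + 0.20 + 0.25 = 0.75. A query that fails to execute scores 0.0 under these assumptions.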

	## Setup

	pip install -r requirements.txt

	## Run Server

	uvicorn src.sql_arena.server:app --host 0.0.0.0 --port 7860

	## Run Inference

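	Set your Hugging Face token, then run the script. `set` is the Windows cmd form; on Linux/macOS use `export HF_TOKEN=your_token` instead.

	set HF_TOKEN=your_token
	python inference.py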

	## Docker

	docker build -t sql-arena .
	docker run -p 7860:7860 sql-arena

	## Run Tests

	pytest tests/ -v

	## Project Structure

	sql_arena/
	- openenv.yaml (Environment metadata)
	- Dockerfile (Container deployment)
	- inference.py (Baseline inference script)
	- src/sql_arena/
	  - models.py (Typed Pydantic models)
	  - environment.py (Core environment logic)
	  - tasks.py (9 SQL challenges)
	  - graders.py (Partial credit scoring)
	  - database.py (SQLite management)
	  - server.py (FastAPI server)
	- tests/
	  - test_env.py (Test suite)

	## API Endpoints

	| Method | Endpoint | Description |
	|--------|----------|-------------|
	| POST | /reset | Start new episode |
	| POST | /step | Submit SQL query |
	| GET | /state | Get current state |
	| GET | /tasks | List available tasks |
	| WS | /ws | WebSocket sessions |
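
A client could call the HTTP endpoints with the standard library alone, as in the rough sketch below. The `/step` body mirrors the Action Space example and the port comes from the run command. The `/reset` body (a `task_id` key) and the JSON response format are assumptions, since neither is documented here. The helpers `build_post`, `send`, and `main` are hypothetical names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7860"  # port taken from the Run Server command


def build_post(path: str, payload: dict) -> urllib.request.Request:
    """Create a JSON POST request for one of the endpoints."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a request and decode the response body, assumed to be JSON."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


# ASSUMPTION: the /reset body shape is a guess; check server.py for the real one.
reset_request = build_post("/reset", {"task_id": "basic_select"})

# The /step body follows the Action Space example.
step_request = build_post(
    "/step",
    {"sql_query": "SELECT name, salary FROM employees WHERE salary > 80000"},
)


def main() -> None:
    """Run one reset and one step. Requires the server to be running locally."""
    print(send(reset_request))
    print(send(step_request))
```

Nothing is sent until `main()` is called, so the requests can be inspected or adjusted before going over the network.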

	## License

	MIT