---
title: SQL Arena
emoji: 🏟️
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
---
# SQL Arena - OpenEnv Environment
An interactive SQL query challenge environment where AI agents learn to write SQL
by iteratively querying databases and receiving execution feedback with partial credit scoring.
## Real-World Utility
Text-to-SQL is one of the most valuable capabilities for AI agents:
- Used by data analysts, business users, and developers daily
- Evaluates reasoning, schema understanding, and query composition
- Directly applicable to production AI assistants and copilots
## Tasks
| Task | Difficulty | Description | Max Steps |
|------|-----------|-------------|-----------|
| basic_select | Easy | SELECT, WHERE, ORDER BY | 5 |
| join_aggregate | Medium | JOINs, GROUP BY, HAVING | 7 |
| complex_analysis | Hard | CTEs, window functions | 10 |
Each difficulty level has 3 unique problems (9 in total), all graded deterministically.
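To make the three difficulty tiers concrete, here is a minimal sketch of one query per tier run against an in-memory SQLite database. The `employees` and `departments` tables and their rows are hypothetical illustrations, not the environment's actual schema or problems. The window-function query assumes SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical toy schema and data; the real tasks ship their own databases.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employees (
        id INTEGER PRIMARY KEY, name TEXT, salary INTEGER, dept_id INTEGER
    );
    INSERT INTO departments VALUES (1, 'Engineering'), (2, 'Sales');
    INSERT INTO employees VALUES
        (1, 'Ada', 120000, 1), (2, 'Linus', 95000, 1),
        (3, 'Grace', 70000, 2), (4, 'Alan', 85000, 2);
""")

# basic_select style: SELECT, WHERE, ORDER BY
easy = conn.execute(
    "SELECT name, salary FROM employees "
    "WHERE salary > 80000 ORDER BY salary DESC"
).fetchall()

# join_aggregate style: JOIN, GROUP BY, HAVING
medium = conn.execute(
    "SELECT d.name, COUNT(*) AS headcount, AVG(e.salary) AS avg_salary "
    "FROM employees e JOIN departments d ON e.dept_id = d.id "
    "GROUP BY d.name HAVING AVG(e.salary) > 80000 ORDER BY d.name"
).fetchall()

# complex_analysis style: a CTE plus a window function
# (top earner per department)
hard = conn.execute(
    "WITH ranked AS ("
    "  SELECT name, RANK() OVER ("
    "    PARTITION BY dept_id ORDER BY salary DESC) AS rnk "
    "  FROM employees) "
    "SELECT name FROM ranked WHERE rnk = 1 ORDER BY name"
).fetchall()

print(easy)
print(medium)
print(hard)
```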
## Action Space
The agent sends a SQL query each step:
{"sql_query": "SELECT name, salary FROM employees WHERE salary > 80000"}
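A minimal sketch of building that action payload in Python. The `SQLAction` class name is a hypothetical stand-in (the project's own typed models live in `models.py` and may differ). Only the `sql_query` key is taken from the example above.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class SQLAction:
    """Hypothetical action container with the single documented field."""

    sql_query: str

    def to_json(self) -> str:
        # Serialize to the JSON shape shown in the Action Space example.
        return json.dumps(asdict(self))


action = SQLAction("SELECT name, salary FROM employees WHERE salary > 80000")
payload = action.to_json()
print(payload)
```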
## Observation Space
The agent receives back:
- schema_description: Database schema text
- question: Natural language question to answer
- query_result: Result table from last query
- error_message: Error if query failed
- feedback: Scoring feedback with hints
- expected_columns: Expected column names
- attempts_remaining: Steps left
- difficulty: Task difficulty level
- task_id: Problem identifier
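The fields above could be mirrored on the client side roughly as follows. This is a sketch only: the field names come from the list, but the Python types, the defaults, and the `SQLObservation` / `parse_observation` names are assumptions rather than the project's actual Pydantic models.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, List, Optional


@dataclass
class SQLObservation:
    """Hypothetical client-side view of one observation (types are guesses)."""

    schema_description: str = ""
    question: str = ""
    query_result: Optional[Any] = None
    error_message: Optional[str] = None
    feedback: Optional[str] = None
    expected_columns: Optional[List[str]] = None
    attempts_remaining: int = 0
    difficulty: str = ""
    task_id: str = ""


def parse_observation(raw: Dict[str, Any]) -> SQLObservation:
    # Keep only the documented keys so unexpected extras do not raise.
    known = {f.name for f in fields(SQLObservation)}
    return SQLObservation(**{k: v for k, v in raw.items() if k in known})


# Made-up sample values, for illustration only.
obs = parse_observation({
    "question": "List employees earning more than 80000.",
    "expected_columns": ["name", "salary"],
    "attempts_remaining": 4,
    "difficulty": "easy",
    "unknown_extra_key": "ignored",
})
print(obs.attempts_remaining, obs.expected_columns)
```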
## Reward Function (0.0 to 1.0)
| Component | Weight | Description |
|-----------|--------|-------------|
| Execution | 0.10 | Query runs without error |
| Columns | 0.20 | Correct column names |
| Row Count | 0.20 | Correct number of rows |
| Values | 0.50 | Correct data values |
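Since the weights sum to 1.0, one plausible reading is that the reward is a weighted sum of per-component scores. The sketch below assumes each component yields a score between 0.0 and 1.0. How the grader actually derives those scores is not specified here, and the `total_reward` helper is hypothetical.

```python
# Weights copied from the table above.
WEIGHTS = {
    "execution": 0.10,
    "columns": 0.20,
    "row_count": 0.20,
    "values": 0.50,
}


def total_reward(scores):
    """Weighted sum of component scores, each clamped to [0.0, 1.0]."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        score = min(max(scores.get(name, 0.0), 0.0), 1.0)
        total += weight * score
    # Round away floating-point noise from summing decimal weights.
    return round(total, 4)


# A query that runs, returns the right columns and row count,
# but gets only half of the values right:
partial = total_reward(
    {"execution": 1.0, "columns": 1.0, "row_count": 1.0, "values": 0.5}
)
print(partial)
```

Under this assumption, a query that merely executes without matching anything would earn 0.10, and a fully correct one would earn 1.0.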
## Setup
pip install -r requirements.txt
## Run Server
uvicorn src.sql_arena.server:app --host 0.0.0.0 --port 7860
## Run Inference
set HF_TOKEN=your_token
python inference.py
## Docker
docker build -t sql-arena .
docker run -p 7860:7860 sql-arena
## Run Tests
pytest tests/ -v
## Project Structure
sql_arena/
- openenv.yaml (Environment metadata)
- Dockerfile (Container deployment)
- inference.py (Baseline inference script)
- src/sql_arena/
  - models.py (Typed Pydantic models)
  - environment.py (Core environment logic)
  - tasks.py (9 SQL challenges)
  - graders.py (Partial credit scoring)
  - database.py (SQLite management)
  - server.py (FastAPI server)
- tests/
  - test_env.py (Test suite)
## API Endpoints
| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | /reset | Start new episode |
| POST | /step | Submit SQL query |
| GET | /state | Get current state |
| GET | /tasks | List available tasks |
| WS | /ws | WebSocket sessions |
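A rough sketch of how a client might prepare requests for the HTTP endpoints using only the standard library. The request body shapes are assumptions: the `/step` body reuses the action JSON from the Action Space section, and `/reset` is sent an empty JSON object. The real server may expect different or additional fields, so check `server.py` before relying on this. `build_request` and `send` are hypothetical helper names.

```python
import json
import urllib.request

# Port taken from the Run Server command; host is an assumption.
BASE_URL = "http://localhost:7860"


def build_request(method, endpoint, body=None):
    """Prepare (but do not send) a request for one of the endpoints."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(
        BASE_URL + endpoint, data=data, headers=headers, method=method
    )


def send(req):
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Assumed body shapes, see the note above.
reset_req = build_request("POST", "/reset", {})
step_req = build_request("POST", "/step", {"sql_query": "SELECT 1"})
state_req = build_request("GET", "/state")

# With the server running, one episode step might look like:
#   send(reset_req)
#   observation = send(step_req)
```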
## License
MIT