---
title: SQL Tutor Env
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
tags:
- openenv
- openenv-main
- rl-environment
base_path: /web
---
# SQL Tutor Environment
An **OpenEnv** reinforcement learning environment that trains LLM agents to identify and fix bugs in SQL queries.

Built for the [Meta x Hugging Face x PyTorch India Hackathon 2026](https://www.scaler.com/school-of-technology/meta-pytorch-hackathon).

---
## Task
In each episode, the agent receives:
- A **broken SQL query** with a deliberate bug
- The **database schema** (tables and columns)
- A **task description** of what the correct query should return
The agent must either:
1. **Submit a fix** (`submit_fix`) - provide a corrected SQL query
2. **Request a hint** (`request_hint`) - get a progressive hint (with a small reward penalty)
The episode ends when the agent submits a correct query or exhausts its 5 allowed actions.
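
The episode flow above can be sketched as a small loop. This is a minimal, self-contained illustration rather than the environment's actual implementation: `Action`, `run_episode`, `policy`, and `check_fix` are hypothetical names, and the sketch assumes that hint requests count toward the 5-action budget.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

MAX_STEPS = 5  # from the text above: 5 allowed actions per episode


@dataclass
class Action:
    """Hypothetical stand-in for SQLAction."""
    action_type: str                 # "submit_fix" or "request_hint"
    sql_query: Optional[str] = None  # only used with "submit_fix"


def run_episode(
    policy: Callable[[int, int], Action],
    check_fix: Callable[[Optional[str]], bool],
    max_steps: int = MAX_STEPS,
) -> Tuple[bool, int, int]:
    """Run one episode and return (solved, steps_used, hints_used)."""
    hints = 0
    for step in range(1, max_steps + 1):
        action = policy(step, hints)
        if action.action_type == "request_hint":
            hints += 1  # ASSUMPTION: a hint consumes one of the 5 actions
            continue
        if action.action_type == "submit_fix" and check_fix(action.sql_query):
            return True, step, hints  # a correct fix ends the episode early
    return False, max_steps, hints    # action budget exhausted
```
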
---
## Reward Structure
| Outcome | Reward |
|---|---|
| Correct fix, no hints, first try | **+1.0** |
| Correct fix with hints / retries | **+0.1 to +0.95** (scaled down) |
| SQL syntax error | **-0.1** |
| Wrong query (valid SQL, wrong result) | **-0.05** |
| Requesting a hint | **-0.1** |
| Max steps reached without solving | **0** |
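
The table gives a range for a correct fix after hints or retries but not the exact scaling rule. The sketch below shows one way such a scale-down could work. The linear formula and its coefficients (0.1 per hint, 0.05 per failed attempt) are assumptions chosen for illustration, and `STEP_PENALTY` and `success_reward` are hypothetical names, not part of the environment's API.

```python
# Per-step penalties copied from the reward table.
STEP_PENALTY = {
    "sql_error": -0.1,      # SQL syntax error
    "wrong_result": -0.05,  # valid SQL, wrong result
    "hint": -0.1,           # requesting a hint
}


def success_reward(hints_used: int, failed_attempts: int) -> float:
    """Reward for a correct fix.

    ASSUMPTION: a linear scale-down clamped to [0.1, 0.95]; the README
    only states the range, so the real rule may differ.
    """
    if hints_used == 0 and failed_attempts == 0:
        return 1.0  # correct fix, no hints, first try
    raw = 1.0 - 0.1 * hints_used - 0.05 * failed_attempts
    return round(min(0.95, max(0.1, raw)), 2)


if __name__ == "__main__":
    # One hint, then a correct fix, under the assumed coefficients.
    print(success_reward(hints_used=1, failed_attempts=0))
```
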
---
## Challenge Types (5 built-in)
| ID | Bug Type |
|---|---|
| `wrong_aggregate` | Missing `SUM()` + `GROUP BY` |
| `wrong_join` | `INNER JOIN` should be `LEFT JOIN` |
| `off_by_one_filter` | Wrong comparison operator in `WHERE` |
| `missing_having` | `WHERE` used instead of `HAVING` for aggregate filter |
| `wrong_order_limit` | `ASC` should be `DESC` for top-N query |
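
To make one of these bug types concrete, the sketch below reproduces a `wrong_join` bug against an in-memory SQLite database and classifies a candidate query as correct, wrong, or invalid. The schema, the sample rows, and the `make_db` and `grade` helpers are hypothetical and not taken from `challenges.py`; grading by comparing result sets against a reference query is an assumption about how such a check could be done.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a tiny hypothetical schema: customer 3 has no orders."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
        INSERT INTO customers VALUES (1, 'Ada'), (2, 'Bo'), (3, 'Cy');
        INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5);
    """)
    return conn


# Reference: every customer with an order count, including zero.
REFERENCE = """
    SELECT c.name, COUNT(o.id) AS n_orders
    FROM customers c LEFT JOIN orders o ON o.customer_id = c.id
    GROUP BY c.id ORDER BY c.id
"""
# The bug: INNER JOIN silently drops customers without orders.
BROKEN = REFERENCE.replace("LEFT JOIN", "INNER JOIN")


def grade(conn: sqlite3.Connection, candidate: str, reference: str = REFERENCE) -> str:
    """Return 'sql_error', 'wrong_result', or 'correct' for a candidate query."""
    try:
        got = conn.execute(candidate).fetchall()
    except sqlite3.Error:
        return "sql_error"
    expected = conn.execute(reference).fetchall()
    return "correct" if got == expected else "wrong_result"


if __name__ == "__main__":
    conn = make_db()
    for query in (BROKEN, REFERENCE, "SELEC name FROM customers"):
        print(grade(conn, query))
```
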
---
## Quick Start
```python
from openenv.core import EnvClient
from sql_tutor_env.client import SQLTutorEnv
from sql_tutor_env.models import SQLAction
# Connect to the running HF Space
env = SQLTutorEnv(base_url="https://your-username-sql-tutor-env.hf.space")
# Start an episode
obs, state = env.reset()
print(f"Task: {obs.task_description}")
print(f"Broken query: {obs.broken_query}")
# Submit a fix
result = env.step(SQLAction(
    action_type="submit_fix",
    sql_query="SELECT customer_id, SUM(amount) AS total_amount FROM orders WHERE status = 'completed' GROUP BY customer_id ORDER BY customer_id;"
))
print(f"Correct: {result.observation.is_correct}, Reward: {result.reward}")
```
---
## Integration with TRL / GRPOTrainer
```python
from trl import GRPOTrainer, GRPOConfig
from sql_tutor_env.client import SQLTutorEnv
from sql_tutor_env.models import SQLAction
def rollout_func(prompts, env):
    obs, _ = env.reset()
    # ... build prompt from obs, call model, parse SQL, step env
    pass
env = SQLTutorEnv(base_url="https://your-space.hf.space")
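trainer = GRPOTrainer(
    model=model,  # model name or preloaded model, defined elsewhere
    args=GRPOConfig(...),  # GRPOTrainer takes its configuration via `args`
    rollout_func=rollout_func,
    env=env,  # kept from the original; check that your TRL version accepts this argument
)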
trainer.train()
```
---
## Project Structure
```
sql_tutor_env/
|-- __init__.py
|-- models.py # SQLAction, SQLObservation, SQLState
|-- client.py # SQLTutorEnv (EnvClient subclass)
|-- openenv.yaml
|-- pyproject.toml
|-- README.md
`-- server/
    |-- __init__.py
    |-- app.py # FastAPI app via create_app()
    |-- sql_environment.py # Core reset/step/state logic
    |-- challenges.py # Bank of SQL bug challenges
    |-- requirements.txt
    `-- Dockerfile
```