title: SQL Tutor Env
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
tags:
  - openenv
  - openenv-main
  - rl-environment
base_path: /web

SQL Tutor Environment

An OpenEnv reinforcement learning environment that trains LLM agents to identify and fix bugs in SQL queries.

Built for the Meta x Hugging Face x PyTorch India Hackathon 2026.

Task

At the start of each episode, the agent receives:

A broken SQL query with a deliberate bug
The database schema (tables and columns)
A task description of what the correct query should return

On each step, the agent takes one of two actions:

Submit a fix (submit_fix) - provide a corrected SQL query
Request a hint (request_hint) - receive the next progressive hint, at a small reward penalty

The episode ends when the agent submits a correct query or exhausts its 5 allowed actions.
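
The episode protocol above can be sketched as a small local simulation. This is a toy model, not the environment's server code: `ToyEpisode`, its fields, and the exact-string correctness check are assumptions made for illustration. Only the two action names and the five-action budget come from the description above.

```python
from dataclasses import dataclass, field

MAX_ACTIONS = 5  # action budget stated in the task description


@dataclass
class ToyEpisode:
    """Hypothetical stand-in for one episode; not the real server logic."""
    correct_query: str
    hints: list[str] = field(default_factory=list)
    actions_used: int = 0
    hints_given: int = 0
    done: bool = False

    def step(self, action_type: str, sql_query: str = "") -> str:
        if self.done:
            raise RuntimeError("episode already finished")
        if action_type not in ("submit_fix", "request_hint"):
            raise ValueError(f"unknown action_type: {action_type}")
        self.actions_used += 1
        if action_type == "request_hint":
            # Hints are progressive: each request reveals the next one,
            # repeating the last hint once the list is exhausted.
            idx = min(self.hints_given, len(self.hints) - 1)
            self.hints_given += 1
            message = self.hints[idx] if self.hints else "no hints available"
        else:
            # Assumption: exact string match stands in for real result checking.
            solved = sql_query.strip() == self.correct_query.strip()
            self.done = solved
            message = "correct" if solved else "incorrect"
        # The episode also ends once the action budget is spent.
        if self.actions_used >= MAX_ACTIONS:
            self.done = True
        return message


episode = ToyEpisode(correct_query="SELECT 1;", hints=["Check the SELECT list."])
print(episode.step("request_hint"))
print(episode.step("submit_fix", "SELECT 1;"))
print(episode.done)
```

Both hint requests and fix attempts draw from the same budget here, which matches the "5 allowed actions" wording; whether the real environment counts them identically is not stated in this README.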

Reward Structure

| Outcome | Reward |
|---|---|
| Correct fix, no hints, first try | +1.0 |
| Correct fix with hints / retries | +0.1 to +0.95 (scaled down) |
| SQL syntax error | -0.1 |
| Wrong query (valid SQL, wrong result) | -0.05 |
| Requesting a hint | -0.1 |
| Max steps reached without solving | 0 |
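
One way to read the table as code is sketched below. The per-step penalties and the +1.0 first-try reward are taken from the table. The discount for hints and retries is an assumed linear formula (the `HINT_DISCOUNT` and `RETRY_DISCOUNT` constants are hypothetical), clamped to the stated +0.1 to +0.95 range; the environment's actual scaling may differ.

```python
HINT_DISCOUNT = 0.15   # hypothetical: deduction per hint used
RETRY_DISCOUNT = 0.10  # hypothetical: deduction per earlier failed attempt


def step_reward(outcome: str, hints_used: int = 0, failed_attempts: int = 0) -> float:
    """Reward for a single step, following the table above.

    outcome is one of: "correct", "syntax_error", "wrong_result",
    "hint", "max_steps".
    """
    if outcome == "syntax_error":
        return -0.1
    if outcome == "wrong_result":
        return -0.05
    if outcome == "hint":
        return -0.1
    if outcome == "max_steps":
        return 0.0
    if outcome == "correct":
        if hints_used == 0 and failed_attempts == 0:
            return 1.0
        # Assumed linear discount, clamped to the documented range.
        scaled = 1.0 - HINT_DISCOUNT * hints_used - RETRY_DISCOUNT * failed_attempts
        return round(min(0.95, max(0.1, scaled)), 2)
    raise ValueError(f"unknown outcome: {outcome}")


print(step_reward("correct"))
print(step_reward("correct", hints_used=1, failed_attempts=1))
print(step_reward("syntax_error"))
```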

Challenge Types (5 built-in)

| ID | Bug Type |
|---|---|
| `wrong_aggregate` | Missing `SUM()` + `GROUP BY` |
| `wrong_join` | `INNER JOIN` should be `LEFT JOIN` |
| `off_by_one_filter` | Wrong comparison operator in `WHERE` |
| `missing_having` | `WHERE` used instead of `HAVING` for aggregate filter |
| `wrong_order_limit` | `ASC` should be `DESC` for top-N query |
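
To make one of these bug types concrete, the sketch below reproduces a `wrong_join`-style bug in an in-memory SQLite database and compares result sets. The `customers`/`orders` schema, the sample rows, and both queries are invented for illustration and are not taken from the challenge bank. Comparing result rows is also an assumption about how a fix might be judged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen');
    INSERT INTO orders VALUES (1, 1, 50.0), (2, 1, 25.0), (3, 2, 10.0);
""")

# Task (hypothetical): list every customer with their order count,
# including customers who have no orders.
broken = """
    SELECT c.name, COUNT(o.id) FROM customers c
    INNER JOIN orders o ON o.customer_id = c.id
    GROUP BY c.id ORDER BY c.id
"""
fixed = broken.replace("INNER JOIN", "LEFT JOIN")


def run(sql: str):
    """Return all rows, or None if SQLite rejects the statement."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None


broken_rows = run(broken)
fixed_rows = run(fixed)
print(broken_rows)  # the customer without orders is missing
print(fixed_rows)   # all three customers, with a count of 0 for the last
```

Both queries are valid SQL, so this kind of bug falls under the "wrong result" row of the reward table rather than the syntax-error row; a statement SQLite cannot parse makes `run` return `None` instead.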

Quick Start

from sql_tutor_env.client import SQLTutorEnv
from sql_tutor_env.models import SQLAction

# Connect to the running HF Space
env = SQLTutorEnv(base_url="https://your-username-sql-tutor-env.hf.space")

# Start an episode
obs, state = env.reset()
print(f"Task: {obs.task_description}")
print(f"Broken query: {obs.broken_query}")

# Submit a fix
result = env.step(SQLAction(
    action_type="submit_fix",
    sql_query="SELECT customer_id, SUM(amount) AS total_amount FROM orders WHERE status = 'completed' GROUP BY customer_id ORDER BY customer_id;"
))
print(f"Correct: {result.observation.is_correct}, Reward: {result.reward}")

Integration with TRL / GRPOTrainer

from trl import GRPOTrainer, GRPOConfig
from sql_tutor_env.client import SQLTutorEnv
from sql_tutor_env.models import SQLAction

def rollout_func(prompts, env):
    obs, _ = env.reset()
    # ... build prompt from obs, call model, parse SQL, step env
    pass

env = SQLTutorEnv(base_url="https://your-space.hf.space")
trainer = GRPOTrainer(
    model=model,  # placeholder: a model name or an already loaded model
    args=GRPOConfig(...),
    rollout_func=rollout_func,
    env=env,
)
trainer.train()
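
The elided steps in `rollout_func` (build a prompt from the observation, parse SQL out of the completion) might look like the helpers below. `build_prompt` and `extract_sql` are hypothetical names, the prompt wording is arbitrary, and the schema is passed as a plain string because the observation's schema field name is not shown in this README.

```python
import re

FENCE = "`" * 3  # a Markdown code fence marker


def build_prompt(task_description: str, schema: str, broken_query: str) -> str:
    """Assemble a plain-text prompt from the three episode inputs."""
    return (
        "You are fixing a buggy SQL query.\n"
        f"Task: {task_description}\n"
        f"Schema:\n{schema}\n"
        f"Broken query:\n{broken_query}\n"
        "Reply with the corrected SQL only."
    )


def extract_sql(completion: str) -> str:
    """Pull a SQL statement out of a model completion.

    Prefers the first fenced block (optionally tagged sql); otherwise
    falls back to the whole completion with whitespace stripped.
    """
    pattern = re.escape(FENCE) + r"(?:sql)?\s*(.*?)" + re.escape(FENCE)
    match = re.search(pattern, completion, flags=re.DOTALL | re.IGNORECASE)
    return (match.group(1) if match else completion).strip()


prompt = build_prompt(
    task_description="Total completed order amount per customer",
    schema="orders(customer_id, amount, status)",
    broken_query="SELECT customer_id, amount FROM orders;",
)
reply = "Here is the fix:\n" + FENCE + "sql\nSELECT 1;\n" + FENCE
print(extract_sql(reply))
```

The extracted string would then be passed as `sql_query` in a `submit_fix` action. Falling back to the raw completion keeps the parser tolerant of models that skip the fence, at the cost of occasionally submitting prose as SQL.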

Project Structure

sql_tutor_env/
|-- __init__.py
|-- models.py              # SQLAction, SQLObservation, SQLState
|-- client.py              # SQLTutorEnv (EnvClient subclass)
|-- openenv.yaml
|-- pyproject.toml
|-- README.md
`-- server/
    |-- __init__.py
    |-- app.py             # FastAPI app via create_app()
    |-- sql_environment.py # Core reset/step/state logic
    |-- challenges.py      # Bank of SQL bug challenges
    |-- requirements.txt
    `-- Dockerfile