	---
	title: SQL Tutor Env
	colorFrom: blue
	colorTo: indigo
	sdk: docker
	pinned: false
	tags:
	- openenv
	- openenv-main
	- rl-environment
	base_path: /web
	---

	# SQL Tutor Environment

	An OpenEnv reinforcement learning environment that trains LLM agents to identify and fix bugs in SQL queries.

	Built for the [Meta x Hugging Face x PyTorch India Hackathon 2026](https://www.scaler.com/school-of-technology/meta-pytorch-hackathon).

	---

	## Task

	In each episode the agent receives:
	- A broken SQL query with a deliberate bug
	- The database schema (tables and columns)
	- A task description of what the correct query should return

	The agent must either:
	1. Submit a fix (`submit_fix`) - provide a corrected SQL query
	2. Request a hint (`request_hint`) - get a progressive hint, at a cost of -0.1 reward

	The episode ends when the agent submits a correct query or exhausts its 5 allowed actions.
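
The termination rule can be sketched with a toy stand-in for the environment. This is a minimal illustration, not the server's implementation: `ToyEpisode`, `_normalize`, and the string comparison used to decide correctness are all hypothetical (a real grader would more plausibly compare query results). Only the two action names and the 5-action limit come from this README.

```python
from dataclasses import dataclass

MAX_ACTIONS = 5  # from this README: an episode allows at most 5 actions


def _normalize(sql: str) -> str:
    """Hypothetical helper: collapse whitespace, drop a trailing ';', lowercase."""
    return " ".join(sql.strip().rstrip(";").lower().split())


@dataclass
class ToyEpisode:
    """Hypothetical stand-in for the real environment, for illustration only."""

    reference_query: str
    actions_taken: int = 0
    done: bool = False

    def step(self, action_type: str, sql_query: str = "") -> bool:
        """Apply one action and return True if it solved the task."""
        if self.done:
            raise RuntimeError("episode already finished")
        # Both submit_fix and request_hint consume one of the allowed actions.
        self.actions_taken += 1
        solved = (
            action_type == "submit_fix"
            and _normalize(sql_query) == _normalize(self.reference_query)
        )
        # The episode ends on a correct fix or when the action budget is spent.
        self.done = solved or self.actions_taken >= MAX_ACTIONS
        return solved
```

Under these assumptions, five hint requests in a row end the episode unsolved, while a correct `submit_fix` ends it immediately.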

	---

	## Reward Structure

	| Outcome | Reward |
	|---|---|
	| Correct fix, no hints, first try | +1.0 |
	| Correct fix with hints / retries | +0.1 to +0.95 (scaled down) |
	| SQL syntax error | -0.1 |
	| Wrong query (valid SQL, wrong result) | -0.05 |
	| Requesting a hint | -0.1 |
	| Max steps reached without solving | 0 |
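
The table does not say exactly how a correct fix is scaled down after hints or retries. The sketch below encodes the fixed rewards from the table and fills the gap with an assumed linear penalty (0.15 per hint, 0.05 per failed attempt) clamped to the stated 0.1 to 0.95 range. The function name, the outcome labels, and the penalty sizes are hypothetical and are not taken from the environment's code.

```python
def sketch_reward(outcome: str, hints_used: int = 0, failed_attempts: int = 0) -> float:
    """Illustrative reward for one step; outcome labels are hypothetical."""
    # Fixed per-step rewards, copied from the table above.
    fixed = {
        "syntax_error": -0.1,
        "wrong_result": -0.05,
        "hint": -0.1,
        "max_steps": 0.0,
    }
    if outcome in fixed:
        return fixed[outcome]
    if outcome != "correct":
        raise ValueError(f"unknown outcome: {outcome!r}")
    if hints_used == 0 and failed_attempts == 0:
        return 1.0  # correct fix, no hints, first try
    # ASSUMPTION: linear penalties; the real scaling rule is not documented here.
    scaled = 1.0 - 0.15 * hints_used - 0.05 * failed_attempts
    # Clamp to the range the table gives for hinted or retried solutions.
    return min(0.95, max(0.1, scaled))
```

Whatever the real scaling is, the table implies that a solved episode stays positive and an unaided first-try solution is always worth the most.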

	---

	## Challenge Types (5 built-in)

	| ID | Bug Type |
	|---|---|
	| `wrong_aggregate` | Missing `SUM()` + `GROUP BY` |
	| `wrong_join` | `INNER JOIN` should be `LEFT JOIN` |
	| `off_by_one_filter` | Wrong comparison operator in `WHERE` |
	| `missing_having` | `WHERE` used instead of `HAVING` for aggregate filter |
	| `wrong_order_limit` | `ASC` should be `DESC` for top-N query |
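
To make one of these bug types concrete, the following sketch reproduces a `wrong_join`-style bug with Python's built-in `sqlite3` module. The `customers` and `orders` tables, their rows, and both queries are invented for illustration and are not the challenge shipped in `challenges.py`. Comparing result sets, as done here, is one plausible way to distinguish a wrong query from a correct one.

```python
import sqlite3

# Hypothetical two-table schema; the real challenge bank may differ.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben');
    INSERT INTO orders VALUES (10, 1, 25.0);
    """
)

# Task: list every customer with their order count, including customers with none.
broken = """
SELECT c.name, COUNT(o.id) AS n_orders
FROM customers c
INNER JOIN orders o ON o.customer_id = c.id
GROUP BY c.id, c.name
ORDER BY c.name
"""
# The fix swaps the join type so customers without orders are kept.
fixed = broken.replace("INNER JOIN", "LEFT JOIN")

broken_rows = conn.execute(broken).fetchall()
fixed_rows = conn.execute(fixed).fetchall()

print(broken_rows)  # [('Asha', 1)]  (Ben is silently dropped)
print(fixed_rows)   # [('Asha', 1), ('Ben', 0)]
```

Both queries are valid SQL, so this kind of bug only shows up when the returned rows are checked against the expected result.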

	---

	## Quick Start

	```python
	from openenv.core import EnvClient
	from sql_tutor_env.client import SQLTutorEnv
	from sql_tutor_env.models import SQLAction

	# Connect to the running HF Space
	env = SQLTutorEnv(base_url="https://your-username-sql-tutor-env.hf.space")

	# Start an episode
	obs, state = env.reset()
	print(f"Task: {obs.task_description}")
	print(f"Broken query: {obs.broken_query}")

	# Submit a fix
	result = env.step(SQLAction(
	    action_type="submit_fix",
	    sql_query="SELECT customer_id, SUM(amount) AS total_amount FROM orders WHERE status = 'completed' GROUP BY customer_id ORDER BY customer_id;"
	))
	print(f"Correct: {result.observation.is_correct}, Reward: {result.reward}")
	```

	---

	## Integration with TRL / GRPOTrainer

	```python
	from trl import GRPOTrainer, GRPOConfig
	from sql_tutor_env.client import SQLTutorEnv
	from sql_tutor_env.models import SQLAction

	def rollout_func(prompts, env):
	    obs, _ = env.reset()
	    # ... build prompt from obs, call model, parse SQL, step env
	    pass

	env = SQLTutorEnv(base_url="https://your-space.hf.space")
	trainer = GRPOTrainer(
	    model=model,  # `model` must be defined beforehand (model name or loaded model)
	    config=GRPOConfig(...),
	    rollout_func=rollout_func,
	    env=env,
	)
	trainer.train()
	```
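
The rollout stub above leaves prompt building and SQL parsing open. A minimal sketch of those two steps follows; `build_prompt` and `extract_sql` are hypothetical helpers that are not part of this package, and the prompt wording is only an example. They take plain strings, so they make no assumptions about the observation's field names beyond what the caller passes in.

```python
import re

FENCE = "`" * 3  # a markdown code fence, built this way to keep the example readable

# Matches the first fenced block in a completion, with an optional "sql" language tag.
_FENCED_BLOCK = re.compile(
    FENCE + r"(?:sql)?\s*(.*?)" + FENCE, re.DOTALL | re.IGNORECASE
)


def build_prompt(task_description: str, schema: str, broken_query: str) -> str:
    """Hypothetical prompt template; adjust the wording for your model."""
    return (
        "You are fixing a buggy SQL query.\n"
        f"Task: {task_description}\n"
        f"Schema: {schema}\n"
        f"Broken query: {broken_query}\n"
        "Reply with only the corrected SQL query."
    )


def extract_sql(completion: str) -> str:
    """Return the SQL inside the first fenced block, or the whole text if none."""
    match = _FENCED_BLOCK.search(completion)
    candidate = match.group(1) if match else completion
    return candidate.strip()
```

The string returned by `extract_sql` is what would be passed as `sql_query` in a `submit_fix` action.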

	---

	## Project Structure

	```
	sql_tutor_env/
	|-- __init__.py
	|-- models.py               # SQLAction, SQLObservation, SQLState
	|-- client.py               # SQLTutorEnv (EnvClient subclass)
	|-- openenv.yaml
	|-- pyproject.toml
	|-- README.md
	`-- server/
	    |-- __init__.py
	    |-- app.py              # FastAPI app via create_app()
	    |-- sql_environment.py  # Core reset/step/state logic
	    |-- challenges.py       # Bank of SQL bug challenges
	    |-- requirements.txt
	    `-- Dockerfile
	```