Spaces:
Sleeping
title: SQL Repair Env
emoji: π
colorFrom: red
colorTo: purple
sdk: docker
app_port: 8000
pinned: false
license: apache-2.0
short_description: OpenEnv SQL repair tasks with a live frontend
SQL Repair OpenEnv
An OpenEnv environment for the Meta PyTorch x Scaler hackathon where agents repair broken SQL queries against a small SQLite schema.
Live Links
- GitHub: Krishpotanwar/sql-repair-env
- Hugging Face Space: krishpotanwar/sql-repair-env
- Live app: krishpotanwar-sql-repair-env.hf.space
What This Repo Includes
- A FastAPI OpenEnv backend with
/health,/tasks,/reset,/step,/grader, and/baseline /api/*aliases so the frontend can call the same environment through the Hugging Face Space root- A frontend command center inspired by the
disastermanUI shell, adapted for SQL query repair workflows - Three deterministic SQL repair tasks with strict scoring in
(0, 1) - A validator-safe inference runner that always emits bounded scores even on missing-key or crash paths
Tasks
| ID | Difficulty | What's broken |
|---|---|---|
task_1 |
easy | SELECT list missing commas |
task_2 |
medium | JOIN references columns that don't exist |
task_3 |
hard | Aggregate query missing GROUP BY |
Each task gives the agent the schema, the broken query, the runtime error
(if any), and a one-line hint. The agent submits a corrected query via the
/step endpoint and is scored on whether the result rows match the
canonical expected rows.
Architecture
.
βββ pyproject.toml # uv project, server entry point
βββ uv.lock # uv lockfile
βββ Dockerfile # builds the FastAPI server image
βββ README.md # GitHub + HF Space landing document
βββ inference.py # AGENT β talks to the env via HTTP, calls an LLM
βββ openenv.yaml # OpenEnv metadata
βββ server/
β βββ app.py # FastAPI env server + frontend serving
βββ frontend/
β βββ src/ # Winner UI shell + SQL-specific tabs
β βββ dist/ # prebuilt frontend bundle shipped to HF
βββ sql_env/
β βββ env_core.py # SQLite-backed env state
β βββ tasks.py # Task definitions
β βββ grader.py # Strict (0, 1) score clamping
βββ tests/
βββ test_smoke.py # Pytest smoke suite
Frontend
The root route / serves a deployed frontend command center with:
QUERY LABfor reset, submit, and grader workflowsTASK ATLASfor browsing the three SQL missionsBASELINE ARENAfor broken-query score comparisonsAGENT PROTOCOLfor validator and runtime notesAPI OPSfor backend connection and endpoint visibility
In production, the frontend talks to the backend through /api/* aliases.
That keeps the Space root usable as both a human-facing app and an API host.
HTTP API
| Method | Path | Body | Returns |
|---|---|---|---|
| GET | /health |
β | {"status":"ok"} |
| GET | /tasks |
β | task list + metadata |
| POST | /reset |
{"task_id":"task_1"} (optional) |
observation |
| POST | /step |
{"action":{"action_type":"submit_query","query":"..."}} |
observation/reward/done |
| POST | /grader |
{"task_id":"task_1"} |
{"score": float in (0,1)} |
| POST | /baseline |
{"tasks":[...]} (optional) |
scores for all tasks |
/reset accepts an empty body and defaults to task_1 β required by the
OpenEnv validator.
The frontend calls the equivalent /api/health, /api/tasks, /api/reset,
/api/step, /api/grader, and /api/baseline aliases.
Running locally
# 1. Install backend dependencies
uv sync
# 2. Build the frontend bundle used by the HF Space
cd frontend
npm ci
npm run build
cd ..
# 3. Start the env server
python -m server.app # serves API + frontend at http://localhost:8000
# 4. Optional: frontend-only dev mode
cd frontend
npm run dev
# 5. Run the agent (in another terminal)
export HF_TOKEN=<your-groq-or-openai-key>
export API_BASE_URL=https://api.groq.com/openai/v1
export MODEL_NAME=llama-3.3-70b-versatile
ENV_URL=http://localhost:8000 uv run python inference.py
Example API Flow
URL=https://krishpotanwar-sql-repair-env.hf.space
curl -s "$URL/health"
curl -s -X POST "$URL/reset" -H "Content-Type: application/json" -d '{}'
curl -s -X POST "$URL/step" \
-H "Content-Type: application/json" \
-d '{"action":{"action_type":"submit_query","query":"SELECT id, name, price FROM products ORDER BY id"}}'
curl -s -X POST "$URL/grader" -H "Content-Type: application/json" -d '{"task_id":"task_1"}'
curl -s -X POST "$URL/baseline" -H "Content-Type: application/json" -d '{}'
Inference Output Notes
inference.py is designed to keep stdout validator-safe:
- Every task emits exactly one
[START]line - Every task emits exactly one
[END]line withscore=strictly in(0, 1) - On missing-key or fatal fallback paths, the score falls back to
0.5000 - Successful runs may also emit
[STEP]progress lines
Example no-key path:
[START] task_1
[END] task_1 | score=0.5000 | status=fatal_no_llm
[START] task_2
[END] task_2 | score=0.5000 | status=fatal_no_llm
[START] task_3
[END] task_3 | score=0.5000 | status=fatal_no_llm
Environment variables
| Name | Default | Notes |
|---|---|---|
API_BASE_URL |
https://api.groq.com/openai/v1 |
Required by OpenEnv submission checklist |
MODEL_NAME |
llama-3.3-70b-versatile |
Required by OpenEnv submission checklist |
HF_TOKEN |
(none β must be set in HF Space Secrets) | Required by OpenEnv submission checklist |
LOCAL_IMAGE_NAME |
(unset) | If set, inference.py boots a Docker image |
ENV_URL |
http://localhost:8000 |
Where the env server is reachable |
Validation
# Phase 1 β official OpenEnv validator
uvx --from openenv-core openenv validate .
# Backend smoke tests
uv run pytest tests/ -q
# Frontend lint
cd frontend && npm run lint
No API keys are hardcoded in this repo. The agent reads HF_TOKEN (with
optional GROQ_API_KEY and OPENAI_API_KEY fallbacks) at runtime only.