sql-repair-env / README.md
krishpotanwar's picture
docs: shorten hf summary
5706fdf
metadata
title: SQL Repair Env
emoji: πŸ“Š
colorFrom: red
colorTo: purple
sdk: docker
app_port: 8000
pinned: false
license: apache-2.0
short_description: OpenEnv SQL repair tasks with a live frontend

SQL Repair OpenEnv

An OpenEnv environment for the Meta PyTorch x Scaler hackathon where agents repair broken SQL queries against a small SQLite schema.

Live Links

What This Repo Includes

  • A FastAPI OpenEnv backend with /health, /tasks, /reset, /step, /grader, and /baseline
  • /api/* aliases so the frontend can call the same environment through the Hugging Face Space root
  • A frontend command center inspired by the disasterman UI shell, adapted for SQL query repair workflows
  • Three deterministic SQL repair tasks with strict scoring in (0, 1)
  • A validator-safe inference runner that always emits bounded scores even on missing-key or crash paths

Tasks

ID Difficulty What's broken
task_1 easy SELECT list missing commas
task_2 medium JOIN references columns that don't exist
task_3 hard Aggregate query missing GROUP BY

Each task gives the agent the schema, the broken query, the runtime error (if any), and a one-line hint. The agent submits a corrected query via the /step endpoint and is scored on whether the result rows match the canonical expected rows.

Architecture

.
β”œβ”€β”€ pyproject.toml         # uv project, server entry point
β”œβ”€β”€ uv.lock                # uv lockfile
β”œβ”€β”€ Dockerfile             # builds the FastAPI server image
β”œβ”€β”€ README.md              # GitHub + HF Space landing document
β”œβ”€β”€ inference.py           # AGENT β€” talks to the env via HTTP, calls an LLM
β”œβ”€β”€ openenv.yaml           # OpenEnv metadata
β”œβ”€β”€ server/
β”‚   └── app.py             # FastAPI env server + frontend serving
β”œβ”€β”€ frontend/
β”‚   β”œβ”€β”€ src/               # Winner UI shell + SQL-specific tabs
β”‚   └── dist/              # prebuilt frontend bundle shipped to HF
β”œβ”€β”€ sql_env/
β”‚   β”œβ”€β”€ env_core.py        # SQLite-backed env state
β”‚   β”œβ”€β”€ tasks.py           # Task definitions
β”‚   └── grader.py          # Strict (0, 1) score clamping
└── tests/
    └── test_smoke.py      # Pytest smoke suite

Frontend

The root route / serves a deployed frontend command center with:

  • QUERY LAB for reset, submit, and grader workflows
  • TASK ATLAS for browsing the three SQL missions
  • BASELINE ARENA for broken-query score comparisons
  • AGENT PROTOCOL for validator and runtime notes
  • API OPS for backend connection and endpoint visibility

In production, the frontend talks to the backend through /api/* aliases. That keeps the Space root usable as both a human-facing app and an API host.

HTTP API

Method Path Body Returns
GET /health β€” {"status":"ok"}
GET /tasks β€” task list + metadata
POST /reset {"task_id":"task_1"} (optional) observation
POST /step {"action":{"action_type":"submit_query","query":"..."}} observation/reward/done
POST /grader {"task_id":"task_1"} {"score": float in (0,1)}
POST /baseline {"tasks":[...]} (optional) scores for all tasks

/reset accepts an empty body and defaults to task_1 β€” required by the OpenEnv validator.

The frontend calls the equivalent /api/health, /api/tasks, /api/reset, /api/step, /api/grader, and /api/baseline aliases.

Running locally

# 1. Install backend dependencies
uv sync

# 2. Build the frontend bundle used by the HF Space
cd frontend
npm ci
npm run build
cd ..

# 3. Start the env server
python -m server.app          # serves API + frontend at http://localhost:8000

# 4. Optional: frontend-only dev mode
cd frontend
npm run dev

# 5. Run the agent (in another terminal)
export HF_TOKEN=<your-groq-or-openai-key>
export API_BASE_URL=https://api.groq.com/openai/v1
export MODEL_NAME=llama-3.3-70b-versatile
ENV_URL=http://localhost:8000 uv run python inference.py

Example API Flow

URL=https://krishpotanwar-sql-repair-env.hf.space

curl -s "$URL/health"
curl -s -X POST "$URL/reset" -H "Content-Type: application/json" -d '{}'
curl -s -X POST "$URL/step" \
  -H "Content-Type: application/json" \
  -d '{"action":{"action_type":"submit_query","query":"SELECT id, name, price FROM products ORDER BY id"}}'
curl -s -X POST "$URL/grader" -H "Content-Type: application/json" -d '{"task_id":"task_1"}'
curl -s -X POST "$URL/baseline" -H "Content-Type: application/json" -d '{}'

Inference Output Notes

inference.py is designed to keep stdout validator-safe:

  • Every task emits exactly one [START] line
  • Every task emits exactly one [END] line with score= strictly in (0, 1)
  • On missing-key or fatal fallback paths, the score falls back to 0.5000
  • Successful runs may also emit [STEP] progress lines

Example no-key path:

[START] task_1
[END] task_1 | score=0.5000 | status=fatal_no_llm
[START] task_2
[END] task_2 | score=0.5000 | status=fatal_no_llm
[START] task_3
[END] task_3 | score=0.5000 | status=fatal_no_llm

Environment variables

Name Default Notes
API_BASE_URL https://api.groq.com/openai/v1 Required by OpenEnv submission checklist
MODEL_NAME llama-3.3-70b-versatile Required by OpenEnv submission checklist
HF_TOKEN (none β€” must be set in HF Space Secrets) Required by OpenEnv submission checklist
LOCAL_IMAGE_NAME (unset) If set, inference.py boots a Docker image
ENV_URL http://localhost:8000 Where the env server is reachable

Validation

# Phase 1 β€” official OpenEnv validator
uvx --from openenv-core openenv validate .

# Backend smoke tests
uv run pytest tests/ -q

# Frontend lint
cd frontend && npm run lint

No API keys are hardcoded in this repo. The agent reads HF_TOKEN (with optional GROQ_API_KEY and OPENAI_API_KEY fallbacks) at runtime only.