Spaces:

krishpotanwar
/

sql-repair-env

Sleeping

App Files Files Community

sql-repair-env / README.md

krishpotanwar

docs: shorten hf summary

5706fdf about 1 month ago

preview code

raw

history blame contribute delete

7.19 kB

	---
	title: SQL Repair Env
	emoji: 📊
	colorFrom: red
	colorTo: purple
	sdk: docker
	app_port: 8000
	pinned: false
	license: apache-2.0
	short_description: OpenEnv SQL repair tasks with a live frontend
	---

	# SQL Repair OpenEnv

	An OpenEnv environment for the Meta PyTorch x Scaler hackathon where
	agents repair broken SQL queries against a small SQLite schema.

	## Live Links

	- GitHub: [Krishpotanwar/sql-repair-env](https://github.com/Krishpotanwar/sql-repair-env)
	- Hugging Face Space: [krishpotanwar/sql-repair-env](https://huggingface.co/spaces/krishpotanwar/sql-repair-env)
	- Live app: [krishpotanwar-sql-repair-env.hf.space](https://krishpotanwar-sql-repair-env.hf.space/)

	## What This Repo Includes

	- A FastAPI OpenEnv backend with `/health`, `/tasks`, `/reset`, `/step`, `/grader`, and `/baseline`
	- `/api/*` aliases so the frontend can call the same environment through the Hugging Face Space root
	- A frontend command center inspired by the `disasterman` UI shell, adapted for SQL query repair workflows
	- Three deterministic SQL repair tasks with strict scoring in `(0, 1)`
	- A validator-safe inference runner that always emits bounded scores even on missing-key or crash paths

	## Tasks

	\| ID \| Difficulty \| What's broken \|
	\|----------\|------------\|------------------------------------------\|
	\| `task_1` \| easy \| SELECT list missing commas \|
	\| `task_2` \| medium \| JOIN references columns that don't exist \|
	\| `task_3` \| hard \| Aggregate query missing GROUP BY \|

	Each task gives the agent the schema, the broken query, the runtime error
	(if any), and a one-line hint. The agent submits a corrected query via the
	`/step` endpoint and is scored on whether the result rows match the
	canonical expected rows.

	## Architecture

	```text
	.
	├── pyproject.toml # uv project, server entry point
	├── uv.lock # uv lockfile
	├── Dockerfile # builds the FastAPI server image
	├── README.md # GitHub + HF Space landing document
	├── inference.py # AGENT — talks to the env via HTTP, calls an LLM
	├── openenv.yaml # OpenEnv metadata
	├── server/
	│ └── app.py # FastAPI env server + frontend serving
	├── frontend/
	│ ├── src/ # Winner UI shell + SQL-specific tabs
	│ └── dist/ # prebuilt frontend bundle shipped to HF
	├── sql_env/
	│ ├── env_core.py # SQLite-backed env state
	│ ├── tasks.py # Task definitions
	│ └── grader.py # Strict (0, 1) score clamping
	└── tests/
	└── test_smoke.py # Pytest smoke suite
	```

	## Frontend

	The root route `/` serves a deployed frontend command center with:

	- `QUERY LAB` for reset, submit, and grader workflows
	- `TASK ATLAS` for browsing the three SQL missions
	- `BASELINE ARENA` for broken-query score comparisons
	- `AGENT PROTOCOL` for validator and runtime notes
	- `API OPS` for backend connection and endpoint visibility

	In production, the frontend talks to the backend through `/api/*` aliases.
	That keeps the Space root usable as both a human-facing app and an API host.

	## HTTP API

	\| Method \| Path \| Body \| Returns \|
	\|--------\|-------------\|-------------------------------------------\|--------------------------------------\|
	\| GET \| `/health` \| — \| `{"status":"ok"}` \|
	\| GET \| `/tasks` \| — \| task list + metadata \|
	\| POST \| `/reset` \| `{"task_id":"task_1"}` (optional) \| observation \|
	\| POST \| `/step` \| `{"action":{"action_type":"submit_query","query":"..."}}` \| observation/reward/done \|
	\| POST \| `/grader` \| `{"task_id":"task_1"}` \| `{"score": float in (0,1)}` \|
	\| POST \| `/baseline` \| `{"tasks":[...]}` (optional) \| scores for all tasks \|

	`/reset` accepts an empty body and defaults to `task_1` — required by the
	OpenEnv validator.

	The frontend calls the equivalent `/api/health`, `/api/tasks`, `/api/reset`,
	`/api/step`, `/api/grader`, and `/api/baseline` aliases.

	## Running locally

	```bash
	# 1. Install backend dependencies
	uv sync

	# 2. Build the frontend bundle used by the HF Space
	cd frontend
	npm ci
	npm run build
	cd ..

	# 3. Start the env server
	python -m server.app # serves API + frontend at http://localhost:8000

	# 4. Optional: frontend-only dev mode
	cd frontend
	npm run dev

	# 5. Run the agent (in another terminal)
	export HF_TOKEN=<your-groq-or-openai-key>
	export API_BASE_URL=https://api.groq.com/openai/v1
	export MODEL_NAME=llama-3.3-70b-versatile
	ENV_URL=http://localhost:8000 uv run python inference.py
	```

	## Example API Flow

	```bash
	URL=https://krishpotanwar-sql-repair-env.hf.space

	curl -s "$URL/health"
	curl -s -X POST "$URL/reset" -H "Content-Type: application/json" -d '{}'
	curl -s -X POST "$URL/step" \
	-H "Content-Type: application/json" \
	-d '{"action":{"action_type":"submit_query","query":"SELECT id, name, price FROM products ORDER BY id"}}'
	curl -s -X POST "$URL/grader" -H "Content-Type: application/json" -d '{"task_id":"task_1"}'
	curl -s -X POST "$URL/baseline" -H "Content-Type: application/json" -d '{}'
	```

	## Inference Output Notes

	`inference.py` is designed to keep stdout validator-safe:

	- Every task emits exactly one `[START]` line
	- Every task emits exactly one `[END]` line with `score=` strictly in `(0, 1)`
	- On missing-key or fatal fallback paths, the score falls back to `0.5000`
	- Successful runs may also emit `[STEP]` progress lines

	Example no-key path:

	```text
	[START] task_1
	[END] task_1 \| score=0.5000 \| status=fatal_no_llm
	[START] task_2
	[END] task_2 \| score=0.5000 \| status=fatal_no_llm
	[START] task_3
	[END] task_3 \| score=0.5000 \| status=fatal_no_llm
	```

	## Environment variables

	\| Name \| Default \| Notes \|
	\|--------------------\|------------------------------------------\|---------------------------------------------\|
	\| `API_BASE_URL` \| `https://api.groq.com/openai/v1` \| Required by OpenEnv submission checklist \|
	\| `MODEL_NAME` \| `llama-3.3-70b-versatile` \| Required by OpenEnv submission checklist \|
	\| `HF_TOKEN` \| (none — must be set in HF Space Secrets) \| Required by OpenEnv submission checklist \|
	\| `LOCAL_IMAGE_NAME` \| (unset) \| If set, inference.py boots a Docker image \|
	\| `ENV_URL` \| `http://localhost:8000` \| Where the env server is reachable \|

	## Validation

	```bash
	# Phase 1 — official OpenEnv validator
	uvx --from openenv-core openenv validate .

	# Backend smoke tests
	uv run pytest tests/ -q

	# Frontend lint
	cd frontend && npm run lint
	```

	No API keys are hardcoded in this repo. The agent reads `HF_TOKEN` (with
	optional `GROQ_API_KEY` and `OPENAI_API_KEY` fallbacks) at runtime only.