Spaces:
Sleeping
Sleeping
File size: 7,194 Bytes
b9659fa 75907ae b9659fa 0376765 b9659fa 5706fdf b9659fa 269f632 75907ae 269f632 75907ae 269f632 75907ae 269f632 75907ae 269f632 75907ae 269f632 75907ae 269f632 75907ae 269f632 75907ae 269f632 75907ae 269f632 75907ae 269f632 75907ae 269f632 75907ae 269f632 75907ae 269f632 75907ae 269f632 75907ae 269f632 75907ae 269f632 75907ae 269f632 75907ae 269f632 75907ae | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 | ---
title: SQL Repair Env
emoji: π
colorFrom: red
colorTo: purple
sdk: docker
app_port: 8000
pinned: false
license: apache-2.0
short_description: OpenEnv SQL repair tasks with a live frontend
---
# SQL Repair OpenEnv
An OpenEnv environment for the **Meta PyTorch x Scaler hackathon** where
agents repair broken SQL queries against a small SQLite schema.
## Live Links
- GitHub: [Krishpotanwar/sql-repair-env](https://github.com/Krishpotanwar/sql-repair-env)
- Hugging Face Space: [krishpotanwar/sql-repair-env](https://huggingface.co/spaces/krishpotanwar/sql-repair-env)
- Live app: [krishpotanwar-sql-repair-env.hf.space](https://krishpotanwar-sql-repair-env.hf.space/)
## What This Repo Includes
- A FastAPI OpenEnv backend with `/health`, `/tasks`, `/reset`, `/step`, `/grader`, and `/baseline`
- `/api/*` aliases so the frontend can call the same environment through the Hugging Face Space root
- A frontend command center inspired by the `disasterman` UI shell, adapted for SQL query repair workflows
- Three deterministic SQL repair tasks with strict scoring in `(0, 1)`
- A validator-safe inference runner that always emits bounded scores even on missing-key or crash paths
## Tasks
| ID | Difficulty | What's broken |
|----------|------------|------------------------------------------|
| `task_1` | easy | SELECT list missing commas |
| `task_2` | medium | JOIN references columns that don't exist |
| `task_3` | hard | Aggregate query missing GROUP BY |
Each task gives the agent the schema, the broken query, the runtime error
(if any), and a one-line hint. The agent submits a corrected query via the
`/step` endpoint and is scored on whether the result rows match the
canonical expected rows.
## Architecture
```text
.
βββ pyproject.toml # uv project, server entry point
βββ uv.lock # uv lockfile
βββ Dockerfile # builds the FastAPI server image
βββ README.md # GitHub + HF Space landing document
βββ inference.py # AGENT β talks to the env via HTTP, calls an LLM
βββ openenv.yaml # OpenEnv metadata
βββ server/
β βββ app.py # FastAPI env server + frontend serving
βββ frontend/
β βββ src/ # Winner UI shell + SQL-specific tabs
β βββ dist/ # prebuilt frontend bundle shipped to HF
βββ sql_env/
β βββ env_core.py # SQLite-backed env state
β βββ tasks.py # Task definitions
β βββ grader.py # Strict (0, 1) score clamping
βββ tests/
βββ test_smoke.py # Pytest smoke suite
```
## Frontend
The root route `/` serves a deployed frontend command center with:
- `QUERY LAB` for reset, submit, and grader workflows
- `TASK ATLAS` for browsing the three SQL missions
- `BASELINE ARENA` for broken-query score comparisons
- `AGENT PROTOCOL` for validator and runtime notes
- `API OPS` for backend connection and endpoint visibility
In production, the frontend talks to the backend through `/api/*` aliases.
That keeps the Space root usable as both a human-facing app and an API host.
## HTTP API
| Method | Path | Body | Returns |
|--------|-------------|-------------------------------------------|--------------------------------------|
| GET | `/health` | β | `{"status":"ok"}` |
| GET | `/tasks` | β | task list + metadata |
| POST | `/reset` | `{"task_id":"task_1"}` (optional) | observation |
| POST | `/step` | `{"action":{"action_type":"submit_query","query":"..."}}` | observation/reward/done |
| POST | `/grader` | `{"task_id":"task_1"}` | `{"score": float in (0,1)}` |
| POST | `/baseline` | `{"tasks":[...]}` (optional) | scores for all tasks |
`/reset` accepts an empty body and defaults to `task_1` β required by the
OpenEnv validator.
The frontend calls the equivalent `/api/health`, `/api/tasks`, `/api/reset`,
`/api/step`, `/api/grader`, and `/api/baseline` aliases.
## Running locally
```bash
# 1. Install backend dependencies
uv sync
# 2. Build the frontend bundle used by the HF Space
cd frontend
npm ci
npm run build
cd ..
# 3. Start the env server
python -m server.app # serves API + frontend at http://localhost:8000
# 4. Optional: frontend-only dev mode
cd frontend
npm run dev
# 5. Run the agent (in another terminal)
export HF_TOKEN=<your-groq-or-openai-key>
export API_BASE_URL=https://api.groq.com/openai/v1
export MODEL_NAME=llama-3.3-70b-versatile
ENV_URL=http://localhost:8000 uv run python inference.py
```
## Example API Flow
```bash
URL=https://krishpotanwar-sql-repair-env.hf.space
curl -s "$URL/health"
curl -s -X POST "$URL/reset" -H "Content-Type: application/json" -d '{}'
curl -s -X POST "$URL/step" \
-H "Content-Type: application/json" \
-d '{"action":{"action_type":"submit_query","query":"SELECT id, name, price FROM products ORDER BY id"}}'
curl -s -X POST "$URL/grader" -H "Content-Type: application/json" -d '{"task_id":"task_1"}'
curl -s -X POST "$URL/baseline" -H "Content-Type: application/json" -d '{}'
```
## Inference Output Notes
`inference.py` is designed to keep stdout validator-safe:
- Every task emits exactly one `[START]` line
- Every task emits exactly one `[END]` line with `score=` strictly in `(0, 1)`
- On missing-key or fatal fallback paths, the score falls back to `0.5000`
- Successful runs may also emit `[STEP]` progress lines
Example no-key path:
```text
[START] task_1
[END] task_1 | score=0.5000 | status=fatal_no_llm
[START] task_2
[END] task_2 | score=0.5000 | status=fatal_no_llm
[START] task_3
[END] task_3 | score=0.5000 | status=fatal_no_llm
```
## Environment variables
| Name | Default | Notes |
|--------------------|------------------------------------------|---------------------------------------------|
| `API_BASE_URL` | `https://api.groq.com/openai/v1` | Required by OpenEnv submission checklist |
| `MODEL_NAME` | `llama-3.3-70b-versatile` | Required by OpenEnv submission checklist |
| `HF_TOKEN` | (none β must be set in HF Space Secrets) | Required by OpenEnv submission checklist |
| `LOCAL_IMAGE_NAME` | (unset) | If set, inference.py boots a Docker image |
| `ENV_URL` | `http://localhost:8000` | Where the env server is reachable |
## Validation
```bash
# Phase 1 β official OpenEnv validator
uvx --from openenv-core openenv validate .
# Backend smoke tests
uv run pytest tests/ -q
# Frontend lint
cd frontend && npm run lint
```
No API keys are hardcoded in this repo. The agent reads `HF_TOKEN` (with
optional `GROQ_API_KEY` and `OPENAI_API_KEY` fallbacks) at runtime only.
|