Spaces:
Paused
Paused
File size: 5,735 Bytes
026bc6e c15d346 e08fd70 026bc6e c15d346 026bc6e c15d346 026bc6e c15d346 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 | ---
title: SQL Query Optimization Env
emoji: ποΈ
colorFrom: indigo
colorTo: blue
sdk: docker
app_file: server/app.py
pinned: false
tags:
- openenv
---
# ποΈ SQL Query Optimization Environment
**OpenEnv Hackathon β Phase 1 & 2 Validated β
**
> **The only OpenEnv submission where your optimized SQL is actually executed.**
> Reward is computed from real DuckDB query timing + result correctness β not keyword matching.
---
## π What Makes This Unique
Every other environment grades agents by checking if they *mentioned* the right keywords.
This environment **actually runs both queries** against a realistic in-memory DuckDB database
(500,000 orders Β· 1,000,000 events) and measures:
| What we measure | How |
|---|---|
| ποΈ Real speedup | `original_ms / optimized_ms` via DuckDB timing |
| β
Result correctness | Both queries must return identical data |
| π Issue detection | Keyword match against ground-truth anti-patterns |
| π Analysis quality | Summary depth + improvement estimate |
The agent receives **execution feedback** after every step (`last_execution` in observation)
and can **refine its rewrite** in subsequent steps β a genuine iterative optimization loop.
---
## π¦ Environment at a Glance
| Property | Value |
|---|---|
| SQL Engine | DuckDB in-memory (real execution) |
| Tables | users (10k), orders (500k), products (1k), events (1M) |
| Tasks | 5 (easy β expert) |
| Reward | Float 0.0β1.0 (execution-grounded) |
| Max runtime | < 20 min (DuckDB warm-up ~3s, queries ~5β200ms each) |
---
## π§ Observation Space
```json
{
"task_id": "string",
"task_name": "string",
"task_description": "string",
"sql_query": "string β the bad query to optimize (executable against DuckDB)",
"schema_info": "string β table sizes, columns, indexing notes",
"dialect": "duckdb/postgresql",
"difficulty": "easy | medium | medium-hard | hard | expert",
"step_count": 0,
"max_steps": 5,
"issues_found_so_far": ["issue types flagged in previous steps"],
"last_execution": {
"original_ms": 145.7,
"optimized_ms": 9.3,
"speedup": 15.67,
"results_match": true,
"verdict": "β
15.7x faster with correct results"
}
}
```
## β‘ Action Space
```json
{
"suggestions": [
{
"issue_type": "correlated_subquery",
"line": 4,
"description": "Correlated subquery scans 500k orders for each of 3,300 premium users",
"severity": "critical",
"fix": "Rewrite as LEFT JOIN with GROUP BY aggregation"
}
],
"optimized_query": "SELECT ... FROM users u LEFT JOIN (SELECT ...) s ON ...",
"summary": "Three correlated subqueries cause ~10M row reads. Single JOIN reduces this to one 500k-row scan.",
"estimated_improvement": "15-20x faster β eliminates N+1 subquery pattern",
"approved": false
}
```
---
## π Five Tasks
| # | Task | Difficulty | Key Anti-Pattern | Expected Speedup |
|---|---|---|---|---|
| 1 | Basic Anti-pattern Detection | Easy | SELECT \*, CAST on filter, YEAR() | 2β5x |
| 2 | N+1 Correlated Subquery Elimination | Medium | 3 correlated subqueries β JOIN | 8β25x |
| 3 | Wildcard LIKE & Projection | Medium-Hard | `LIKE '%purchase%'` on 1M rows | 3β10x |
| 4 | Implicit Cross Join & Scalar Subqueries | Hard | Comma-syntax join + 2 global aggregates | 10β30x |
| 5 | Window Function Full-Scan Audit | Expert | 5 OVER() on unfiltered 1M-row table | 5β20x |
---
## π Reward Function
| Component | Weight | Measured By |
|---|---|---|
| ποΈ Real Execution Speedup | **35%** | `original_ms / optimized_ms` via DuckDB |
| β
Result Correctness | **20%** | Sorted row-set equality check |
| π Issue Detection | **25%** | Keyword match vs ground truth |
| β
Approval Correctness | **8%** | Bool match vs expected |
| π Summary Quality | **7%** | Analysis length & depth |
| π·οΈ Severity Labels | **5%** | Severity values present |
---
## π‘ API Endpoints
| Endpoint | Method | Description |
|---|---|---|
| `/` | GET | Health check + table stats |
| `/reset` | POST | Start episode (`{"task_id": "..."}`) |
| `/step` | POST | Submit action β real execution |
| `/state` | GET | Current episode state |
| `/tasks` | GET | All 5 tasks with schema |
| `/grader` | POST | Grade without advancing episode |
| `/baseline` | POST | Run inference.py |
| **`/execute`** | POST | **Run your SQL against DuckDB, get timing + verdict** |
| **`/leaderboard`** | GET | **Real-time best scores & speedups per task** |
### π₯ Try /execute right now:
```bash
curl -X POST https://laterabhi-sql-query-env.hf.space/execute \
-H "Content-Type: application/json" \
-d '{
"task_id": "task_1_basic_antipatterns",
"optimized_query": "SELECT id, customer_id, status, total FROM orders WHERE customer_id = 5000 AND created_at >= DATE '\''2024-01-01'\'' AND created_at < DATE '\''2025-01-01'\''"
}'
```
---
## π Local Setup
```bash
git clone https://github.com/OfficialAbhinavSingh/SQL-Query-Optimization-Environment-
cd SQL-Query-Optimization-Environment-
pip install -r requirements.txt
uvicorn server.app:app --host 0.0.0.0 --port 7860
```
```bash
# Run inference
export API_BASE_URL=https://router.huggingface.co/v1
export MODEL_NAME=Qwen/Qwen2.5-72B-Instruct
export HF_TOKEN=hf_...
python inference.py
```
---
## π Baseline Scores (Qwen2.5-72B)
| Task | Score | Speedup | Correct? |
|---|---|---|---|
| Basic Anti-patterns (Easy) | ~0.82 | ~4x | β
|
| N+1 Subqueries (Medium) | ~0.71 | ~12x | β
|
| Wildcard LIKE (Medium-Hard) | ~0.60 | ~6x | β
|
| Implicit Join (Hard) | ~0.52 | ~8x | β
|
| Window Functions (Expert) | ~0.44 | ~7x | β
|
---
*Built with β€οΈ for the OpenEnv Hackathon β Phase 1 & 2 Validated*
|