sql-query-optimizer / README.md
jaivardhan2409's picture
Upload folder using huggingface_hub
e4c32ce verified
---
title: SQL Query Optimizer
emoji: 🦀
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
base_path: /web
---
# SQL Query Optimizer OpenEnv
## Environment Description & Motivation
This domain models a genuine, high-value task performed by data engineers and DBAs every day: reviewing and optimizing slow SQL queries. Instead of a toy environment, this is a real-world task where the agent must rewrite SQL queries to be syntactically correct and performant.
## Action Space
- `rewritten_query` (str): The optimized SQL query.
- `explanation` (str): A brief explanation of the changes made and why they improve the query.
- `is_done` (bool): Set to true if finished to submit query for final scoring.
## Observation Space
- `task_id` (int): The ID of the task to perform.
- `query` (str): The SQL query to review and optimize.
- `schema_context` (str): Database schema context (CREATE statements).
- `hint` (str): Optional natural-language hints.
- `step_number` (int): Current step in the episode.
- `max_steps` (int): Maximum allowed steps.
## Tasks
1. **fix-broken-join (Easy)**: Identify and repair a query with an issue such as a missing ON clause.
2. **eliminate-n-plus-one (Medium)**: Remove correlated subqueries and replace them with properly structured JOINs.
3. **full-optimization (Hard)**: Remove redundant DISTINCT clauses, avoid SELECT *, use index hints, and fix implicit type casts in a more complex query.
## Setup & Testing
```bash
# Verify using openenv
openenv validate
# Local testing
uvicorn server:app --host 0.0.0.0 --port 7860
# Docker build
docker build -t sql-optimizer-env .
docker run -p 7860:7860 sql-optimizer-env
```
## Baseline Evaluation
A provided `baseline.py` script replicates inference.
Usage:
```bash
export OPENAI_API_KEY=sk-...
python baseline.py
```