sql-query-optimizer / README.md
jaivardhan2409's picture
Upload folder using huggingface_hub
e4c32ce verified
metadata
title: SQL Query Optimizer
emoji: 🦀
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
base_path: /web

SQL Query Optimizer OpenEnv

Environment Description & Motivation

This domain models a genuine, high-value task performed by data engineers and DBAs every day: reviewing and optimizing slow SQL queries. Instead of a toy environment, this is a real-world task where the agent must rewrite SQL queries to be syntactically correct and performant.

Action Space

  • rewritten_query (str): The optimized SQL query.
  • explanation (str): A brief explanation of the changes made and why they improve the query.
  • is_done (bool): Set to true if finished to submit query for final scoring.

Observation Space

  • task_id (int): The ID of the task to perform.
  • query (str): The SQL query to review and optimize.
  • schema_context (str): Database schema context (CREATE statements).
  • hint (str): Optional natural-language hints.
  • step_number (int): Current step in the episode.
  • max_steps (int): Maximum allowed steps.

Tasks

  1. fix-broken-join (Easy): Identify and repair a query with an issue such as a missing ON clause.
  2. eliminate-n-plus-one (Medium): Remove correlated subqueries and replace them with properly structured JOINs.
  3. full-optimization (Hard): Remove redundant DISTINCT clauses, avoid SELECT *, use index hints, and fix implicit type casts in a more complex query.

Setup & Testing

# Verify using openenv
openenv validate

# Local testing
uvicorn server:app --host 0.0.0.0 --port 7860

# Docker build
docker build -t sql-optimizer-env .
docker run -p 7860:7860 sql-optimizer-env

Baseline Evaluation

A provided baseline.py script replicates inference. Usage:

export OPENAI_API_KEY=sk-... 
python baseline.py