File size: 1,814 Bytes
5abdb9a
e4c32ce
 
 
 
5abdb9a
e4c32ce
 
5abdb9a
e4c32ce
5abdb9a
e4c32ce
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
---
title: SQL Query Optimizer
emoji: 🦀
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
base_path: /web
---
# SQL Query Optimizer OpenEnv

## Environment Description & Motivation
This domain models a genuine, high-value task performed by data engineers and DBAs every day: reviewing and optimizing slow SQL queries. Instead of a toy environment, this is a real-world task where the agent must rewrite SQL queries to be syntactically correct and performant. 

## Action Space
- `rewritten_query` (str): The optimized SQL query.
- `explanation` (str): A brief explanation of the changes made and why they improve the query.
- `is_done` (bool): Set to true if finished to submit query for final scoring.

## Observation Space
- `task_id` (int): The ID of the task to perform.
- `query` (str): The SQL query to review and optimize.
- `schema_context` (str): Database schema context (CREATE statements).
- `hint` (str): Optional natural-language hints.
- `step_number` (int): Current step in the episode.
- `max_steps` (int): Maximum allowed steps.

## Tasks
1. **fix-broken-join (Easy)**: Identify and repair a query with an issue such as a missing ON clause.
2. **eliminate-n-plus-one (Medium)**: Remove correlated subqueries and replace them with properly structured JOINs.
3. **full-optimization (Hard)**: Remove redundant DISTINCT clauses, avoid SELECT *, use index hints, and fix implicit type casts in a more complex query.

## Setup & Testing
```bash
# Verify using openenv
openenv validate

# Local testing
uvicorn server:app --host 0.0.0.0 --port 7860

# Docker build
docker build -t sql-optimizer-env .
docker run -p 7860:7860 sql-optimizer-env
```

## Baseline Evaluation
A provided `baseline.py` script replicates inference.
Usage:
```bash
export OPENAI_API_KEY=sk-... 
python baseline.py
```