LLM-SQL — Column Reordering for Prefix Caching

When rows of a table are serialized into LLM prompts sequentially, consecutive rows that share leading column values can reuse cached prefixes. This task evolves a column-reordering strategy that maximizes prefix-cache hit rates across multiple real-world datasets without altering the underlying data.
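
The effect of column order on prefix reuse can be illustrated with a small sketch. It is not the evaluator's metric (that lives in `utils.py`). It assumes a hypothetical `col: value` serialization and counts shared characters between consecutive rows, whereas real prefix caches work on tokens. The helper names and sample rows are made up for illustration.

```python
from typing import Dict, Sequence


def serialize(row: Dict[str, object], order: Sequence[str]) -> str:
    """Serialize one row as 'col: value' pairs in the given column order (assumed format)."""
    return ", ".join(f"{col}: {row[col]}" for col in order)


def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading characters that two strings have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def prefix_hit_rate(rows: Sequence[Dict[str, object]], order: Sequence[str]) -> float:
    """Fraction of serialized characters shared with the immediately preceding row."""
    texts = [serialize(r, order) for r in rows]
    total = sum(len(t) for t in texts)
    hits = sum(shared_prefix_len(prev, cur) for prev, cur in zip(texts, texts[1:]))
    return hits / total if total else 0.0


# Illustrative rows only; not taken from any of the benchmark datasets.
rows = [
    {"review": "great", "genre": "drama", "year": 1999},
    {"review": "dull", "genre": "drama", "year": 1999},
    {"review": "fine", "genre": "drama", "year": 2001},
]

original = prefix_hit_rate(rows, ["review", "genre", "year"])
reordered = prefix_hit_rate(rows, ["genre", "year", "review"])
print(f"original order:  {original:.3f}")
print(f"reordered:       {reordered:.3f}")
```

Moving the repetitive `genre` and `year` columns to the front lengthens the prefix shared by consecutive rows, while the data itself is unchanged.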

Setup

Download the datasets (~69 MB total):
```
cd benchmarks/ADRS/llm_sql
bash download_dataset.sh
```
This downloads five CSV datasets into `datasets/`:
- movies.csv — Rotten Tomatoes movie reviews (~9 MB)
- beer.csv — Beer review dataset (~2.5 MB)
- BIRD.csv — BIRD text-to-SQL dataset (~34 MB)
- PDMX.csv — PDMX metadata dataset (~7.4 MB)
- products.csv — Amazon product catalog (~16 MB)

Set your API key:
```
export OPENAI_API_KEY=...
```

Run

From the repo root:

```
uv run skydiscover-run \
  benchmarks/ADRS/llm_sql/initial_program.py \
  benchmarks/ADRS/llm_sql/evaluator.py \
  -c benchmarks/ADRS/llm_sql/config.yaml \
  -s [your_algorithm] \
  -i 100
```

Scoring

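Combined score: `0.95 * average_hit_rate + 0.05 * (12 - min(12, avg_runtime)) / 12`

- Hit rate (95% weight): normalized prefix-cache hit count, averaged across the 5 datasets (`average_hit_rate`).
- Runtime (5% weight): wall-clock seconds for the reordering algorithm (`avg_runtime`). The value is capped at 12 seconds, so the runtime term falls linearly from full credit at 0 s to no credit at 12 s or more.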
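
As a worked example, the combined score formula can be transcribed directly into Python. The function name `combined_score` and the sample inputs are illustrative and not taken from `evaluator.py`.

```python
def combined_score(average_hit_rate: float, avg_runtime: float) -> float:
    """Combine hit rate (95% weight) and runtime (5% weight) as in the formula above."""
    # Runtime is capped at 12 seconds: 0 s earns the full term, 12 s or more earns none.
    runtime_term = (12 - min(12, avg_runtime)) / 12
    return 0.95 * average_hit_rate + 0.05 * runtime_term


# Hypothetical inputs: a 60% average hit rate and a 3-second average runtime.
print(round(combined_score(0.60, 3.0), 4))
```

Because the runtime term contributes at most 0.05, an increase of about 0.053 in average hit rate is worth as much as the entire runtime term.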

Files

| File | Description |
| --- | --- |
| `initial_program.py` | Baseline `Evolved` class with `reorder()` method to evolve |
| `evaluator.py` | Scores programs on prefix hit rate and runtime across 5 datasets |
| `config.yaml` | Task-specific config (LLM, evaluator timeout, system prompt) |
| `solver.py` | Base `Algorithm` class and greedy baseline |
| `utils.py` | Prefix hit count evaluation utilities |
| `download_dataset.sh` | Script to download required CSV datasets |
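
To give a feel for the kind of strategy being evolved, here is a minimal standalone sketch of one simple heuristic: put the columns with the fewest distinct values first, since they are the most likely to repeat between consecutive rows. It is an assumption for illustration only. It is not the greedy baseline in `solver.py`, it does not follow the actual `reorder()` signature in `initial_program.py`, and the function name `order_by_cardinality` is hypothetical.

```python
from typing import Dict, List, Sequence


def order_by_cardinality(rows: Sequence[Dict[str, object]]) -> List[str]:
    """Return column names sorted by ascending number of distinct values."""
    if not rows:
        return []
    columns = list(rows[0].keys())
    # Count distinct values per column, comparing values by their string form.
    distinct = {col: len({str(row[col]) for row in rows}) for col in columns}
    # sorted() is stable, so columns with equal counts keep their original order.
    return sorted(columns, key=lambda col: distinct[col])


# Illustrative rows only; not taken from any of the benchmark datasets.
rows = [
    {"review": "great", "genre": "drama", "year": 1999},
    {"review": "dull", "genre": "drama", "year": 1999},
    {"review": "fine", "genre": "drama", "year": 2001},
]
print(order_by_cardinality(rows))
```

A heuristic like this ignores row order and value lengths, both of which affect how much prefix is actually shared, so it is only a starting point.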