	# LLM-SQL — Column Reordering for Prefix Caching

	When rows of a table are serialized into LLM prompts sequentially, consecutive rows that share leading column values can reuse cached prefixes. This task evolves a column-reordering strategy that maximizes prefix-cache hit rates across multiple real-world datasets without altering the underlying data.
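
To make the effect concrete, here is a minimal sketch of how column order changes prefix reuse between consecutive rows. The names `serialize` and `prefix_hit_chars`, the `col=value` serialization, and the toy rows are assumptions made for illustration. The benchmark's actual metric lives in `utils.py` and may differ (for example, real prefix caches typically match on tokens rather than characters).

```python
from os.path import commonprefix


def serialize(row, order):
    """Serialize one row as 'col=value' pairs in the given column order."""
    return ", ".join(f"{col}={row[col]}" for col in order)


def prefix_hit_chars(rows, order):
    """Sum the characters each row shares, as a prefix, with the previous row."""
    prompts = [serialize(row, order) for row in rows]
    return sum(
        len(commonprefix([prev, cur]))
        for prev, cur in zip(prompts, prompts[1:])
    )


# Hypothetical toy table, not taken from the benchmark datasets.
rows = [
    {"id": 1, "genre": "drama", "year": 1999},
    {"id": 2, "genre": "drama", "year": 1999},
    {"id": 3, "genre": "drama", "year": 2001},
]

# Unique column first: neighboring rows diverge almost immediately.
print(prefix_hit_chars(rows, ["id", "genre", "year"]))  # 6
# Shared columns first: neighboring rows have long common prefixes.
print(prefix_hit_chars(rows, ["genre", "year", "id"]))  # 45
```

The data is identical in both calls; only the column order differs, which is the degree of freedom this task optimizes.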

	## Setup

	1. Download the datasets (~69 MB total):

	```bash
	cd benchmarks/ADRS/llm_sql
	bash download_dataset.sh
	```

	This downloads 5 CSV datasets into `datasets/`:
	- `movies.csv` — Rotten Tomatoes movie reviews (~9 MB)
	- `beer.csv` — Beer review dataset (~2.5 MB)
	- `BIRD.csv` — BIRD text-to-SQL dataset (~34 MB)
	- `PDMX.csv` — PDMX metadata dataset (~7.4 MB)
	- `products.csv` — Amazon product catalog (~16 MB)

	2. Set your API key:

	```bash
	export OPENAI_API_KEY=...
	```

	## Run

	From the repo root:

	```bash
	uv run skydiscover-run \
	benchmarks/ADRS/llm_sql/initial_program.py \
	benchmarks/ADRS/llm_sql/evaluator.py \
	-c benchmarks/ADRS/llm_sql/config.yaml \
	-s [your_algorithm] \
	-i 100
	```

	## Scoring

	Combined score: `0.95 * average_hit_rate + 0.05 * (12 - min(12, avg_runtime)) / 12`

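	- Hit rate (95% weight): normalized prefix-cache hit count, averaged over the 5 datasets
	- Runtime (5% weight): average wall-clock seconds for the reordering algorithm, capped at 12; a faster run scores higher, and a run of 12 seconds or more earns no runtime credit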
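
The formula above can be written as a small function to sanity-check expected scores. This is a sketch, not the code in `evaluator.py`: the function name `combined_score`, the parameter `runtime_cap`, and the input values are illustrative assumptions.

```python
def combined_score(average_hit_rate, avg_runtime, runtime_cap=12.0):
    """Combine hit rate and runtime using the weights from the scoring formula."""
    # Runtime credit falls linearly from 1 (instant) to 0 (at or above the cap).
    runtime_term = (runtime_cap - min(runtime_cap, avg_runtime)) / runtime_cap
    return 0.95 * average_hit_rate + 0.05 * runtime_term


# Illustrative inputs only; these are not measured results.
print(round(combined_score(0.60, 3.0), 4))   # 0.95 * 0.60 + 0.05 * 0.75 = 0.6075
print(round(combined_score(0.60, 30.0), 4))  # runtime term is 0, so 0.57
```

Because runtime carries only 5% of the weight, a slower algorithm is worthwhile whenever it raises the hit rate by more than the runtime credit it gives up.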

	## Files

	| File | Description |
	|------|-------------|
	| `initial_program.py` | Baseline `Evolved` class with `reorder()` method to evolve |
	| `evaluator.py` | Scores programs on prefix hit rate and runtime across 5 datasets |
	| `config.yaml` | Task-specific config (LLM, evaluator timeout, system prompt) |
	| `solver.py` | Base `Algorithm` class and greedy baseline |
	| `utils.py` | Prefix hit count evaluation utilities |
	| `download_dataset.sh` | Script to download required CSV datasets |
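
As a rough idea of the kind of strategy a `reorder()` method might start from, the sketch below places low-cardinality columns first, since columns with few distinct values are more likely to repeat between neighboring rows. It is an illustrative heuristic only: it is not the greedy baseline in `solver.py`, and the function name `order_by_cardinality`, its signature, and the toy rows are assumptions rather than the interface of the `Evolved` class.

```python
def order_by_cardinality(rows, columns):
    """Return columns sorted so those with the fewest distinct values come first."""
    distinct = {col: len({row[col] for row in rows}) for col in columns}
    # sorted() is stable, so columns with equal counts keep their original order.
    return sorted(columns, key=lambda col: distinct[col])


# Hypothetical toy table, not taken from the benchmark datasets.
rows = [
    {"id": 1, "genre": "drama", "year": 1999},
    {"id": 2, "genre": "drama", "year": 1999},
    {"id": 3, "genre": "drama", "year": 2001},
]

print(order_by_cardinality(rows, ["id", "genre", "year"]))
# ['genre', 'year', 'id']
```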