spec_version: 1
name: queryforge
type: space
runtime: fastapi
app: server.app:app
port: 8000

description: |
  SQL Query Debugger & Optimiser environment.

  An agent receives a broken or slow SQL query together with the schema and an
  error/performance warning. It must produce a working, optimised query.

  Tasks (6 tasks across 4 difficulty levels):
    easy   - fix three misspelled SQL keywords (SELECT / FROM / WHERE)
    medium - fix a missing JOIN condition that causes a cartesian product
    hard   - rewrite a correlated subquery (O(N²)) as a CTE (O(N))
    expert - fix tie-breaking window function (2 bugs: ROW_NUMBER + ASC ordering)
    expert - traverse org chart with recursive CTE (2 bugs: wrong anchor + hardcoded levels)
    expert - fix broken window functions (3 bugs: missing PARTITION BY + tied revenues)

  Reward signal (0.0 – 1.0):
    0.00        syntax error
    0.15        syntax valid, runtime error
    0.30        executes, wrong / empty results
    0.30–0.80   partial row correctness (deterministic, DuckDB)
    0.80–1.00   correct results + AI quality score (Anthropic claude-haiku-4-5)

  Optional env var: ANTHROPIC_API_KEY (enables AI judge for scores up to 1.0;
  without it, scoring is fully deterministic and capped at 0.80)