---
license: mit
base_model: Qwen/Qwen2.5-0.5B-Instruct
tags:
- grpo
- reinforcement-learning
- sql
- optimization
---

# GRPO SQL Optimizer

`Qwen/Qwen2.5-0.5B-Instruct` fine-tuned with GRPO (Group Relative Policy Optimization)
reinforcement learning to optimize SQL queries, using a DuckDB execution environment.
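
As a rough illustration of how an execution environment can score a rewritten query, the sketch below checks that a candidate query returns the same rows as a reference query. It is only a minimal sketch under stated assumptions: it uses Python's built-in `sqlite3` as a stand-in for DuckDB so that it runs without extra dependencies, and the `orders` table, the example queries, and the `rows_match` and `correctness_reward` helpers are hypothetical, not taken from this model's training code. The actual reward used in training is not described here and may also account for execution speed.

```python
import sqlite3


def rows_match(conn, reference_sql, candidate_sql):
    """Return True if both queries run and yield the same multiset of rows."""
    try:
        expected = conn.execute(reference_sql).fetchall()
        actual = conn.execute(candidate_sql).fetchall()
    except sqlite3.Error:
        # A query that fails to execute can never match.
        return False
    # Sort by repr so row order does not matter and NULLs do not break sorting.
    return sorted(expected, key=repr) == sorted(actual, key=repr)


def correctness_reward(conn, reference_sql, candidate_sql):
    """Hypothetical binary reward: 1.0 for an equivalent result, else 0.0."""
    return 1.0 if rows_match(conn, reference_sql, candidate_sql) else 0.0


# Hypothetical toy schema and data (sqlite3 stands in for DuckDB here).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10.0), (2, 25.0), (3, 40.0)],
)

reference = (
    "SELECT id FROM orders "
    "WHERE id IN (SELECT id FROM orders WHERE amount > 20)"
)
good = "SELECT id FROM orders WHERE amount > 20"    # same rows, simpler plan
bad = "SELECT id FROM orders WHERE amount > 30"     # different rows
broken = "SELEC id FROM orders"                     # does not parse

for name, sql in [("good", good), ("bad", bad), ("broken", broken)]:
    print(name, correctness_reward(conn, reference, sql))
```

Because the check depends only on query results, it can be verified automatically, which is the property a verifiable reward needs.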

## Results
- **Average eval score: 0.7550** (12.5% above baseline)
- Trained for 100 episodes on 5 SQL optimization tasks

## Blog / Writeup
https://huggingface.co/spaces/laterabhi/grpo-sql-optimizer

## Training Notebook
Trained on a Kaggle GPU T4 x2 instance using GRPO with verifiable rewards.
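
For readers unfamiliar with GRPO, its core idea is to score each sampled completion relative to the other completions generated for the same prompt. The sketch below shows that group-relative normalization in plain Python. It is a simplified illustration, not the training code for this model: the reward values are made up, the `group_relative_advantages` helper and its `eps` parameter are hypothetical names, and using the population standard deviation is an assumption (some implementations use the sample standard deviation).

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of completions for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps), so completions
    that beat the group average get a positive advantage.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four candidate rewrites of the same SQL query.
rewards = [0.2, 0.8, 0.5, 0.9]
advantages = group_relative_advantages(rewards)

for r, a in zip(rewards, advantages):
    print(f"reward={r:.2f} advantage={a:+.3f}")
```

If every completion in a group receives the same reward, all advantages are zero and that group contributes no learning signal.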