metadata
license: mit
base_model: Qwen/Qwen2.5-0.5B-Instruct
tags:
  - grpo
  - reinforcement-learning
  - sql
  - optimization

GRPO SQL Optimizer

Qwen/Qwen2.5-0.5B-Instruct fine-tuned with GRPO (Group Relative Policy Optimization), a reinforcement learning method, to optimize SQL queries. Training uses a DuckDB execution environment.
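To make the idea of an execution environment concrete, here is a minimal sketch of how a rewritten query could be scored by actually running it. This is not the reward used to train this model: the `reward` function, the 0.5/0.5 weighting, and the `orders` table are all hypothetical. The demo connects with `sqlite3` from the standard library so it runs without extra dependencies; the function only calls `con.execute(sql).fetchall()`, so a connection from `duckdb.connect()` should work in its place.

```python
import sqlite3
import time
from collections import Counter


def timed_rows(con, sql):
    """Run a query and return (rows, elapsed seconds)."""
    start = time.perf_counter()
    rows = con.execute(sql).fetchall()
    return rows, time.perf_counter() - start


def reward(con, baseline_sql, candidate_sql):
    """Hypothetical reward: 0.0 if the candidate fails or changes the result,
    otherwise 0.5 for correctness plus up to 0.5 for running time saved."""
    base_rows, base_t = timed_rows(con, baseline_sql)
    try:
        cand_rows, cand_t = timed_rows(con, candidate_sql)
    except Exception:
        return 0.0  # candidate does not execute
    # Compare results as multisets so that row order does not matter.
    if Counter(base_rows) != Counter(cand_rows):
        return 0.0
    saved = max(0.0, 1.0 - cand_t / base_t) if base_t > 0 else 0.0
    return 0.5 + 0.5 * saved


# Stand-in database (assumption: sqlite3 here, DuckDB in the actual project).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id INTEGER, amount INTEGER)")
con.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(i, i % 7) for i in range(1000)]
)

baseline = (
    "SELECT id FROM orders "
    "WHERE amount IN (SELECT amount FROM orders WHERE amount = 3)"
)
candidate = "SELECT id FROM orders WHERE amount = 3"  # same rows, simpler plan
wrong = "SELECT id FROM orders WHERE amount = 4"      # runs, different rows
broken = "SELEC id FROM orders"                       # syntax error

print(reward(con, baseline, candidate))
print(reward(con, baseline, wrong))
print(reward(con, baseline, broken))
```

Wall-clock timing on a single run is noisy, so a real reward would likely average several runs or use a more stable cost signal; the sketch only shows how correctness and speed can both be checked by execution.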

Results

  • Average eval score: 0.7550 (12.5% above baseline)
  • Trained for 100 episodes on 5 SQL optimization tasks

Blog / Writeup

https://huggingface.co/spaces/laterabhi/grpo-sql-optimizer

Training Notebook

Trained on Kaggle with two T4 GPUs (T4 x2), using GRPO with verifiable rewards.
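For readers new to GRPO, the core step is to sample a group of completions for the same prompt, score each with the verifiable reward, and normalize the rewards within that group instead of using a learned value model. The sketch below shows that normalization only. The function name, the `eps` value, and the use of the population standard deviation are assumptions for illustration; the training notebook may rely on a library implementation that differs in these details.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize one group's rewards to zero mean and (roughly) unit scale.

    rewards: scores for several completions sampled from the same prompt.
    eps: small constant (assumed value) that avoids division by zero
         when every completion in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four rewrites of the same SQL query.
group = [0.0, 0.5, 1.0, 0.5]
advantages = group_relative_advantages(group)
print(advantages)
```

Completions that score above the group mean receive a positive advantage and are reinforced, and those below the mean are discouraged. When all completions in a group score the same, every advantage is zero and that group contributes no learning signal.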