---
license: mit
base_model: Qwen/Qwen2.5-0.5B-Instruct
tags:
- grpo
- reinforcement-learning
- sql
- optimization
---

# GRPO SQL Optimizer

Fine-tuned `Qwen/Qwen2.5-0.5B-Instruct` with GRPO reinforcement learning to optimize SQL queries using a DuckDB execution environment.

## Results

- **Average eval score: 0.7550** (+12.5% above baseline)
- Trained for 100 episodes on 5 SQL optimization tasks

## Blog / Writeup

https://huggingface.co/spaces/laterabhi/grpo-sql-optimizer

## Training Notebook

Trained on Kaggle GPU T4 x2 using GRPO with verifiable rewards.
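
For readers new to GRPO, the group-relative step the method is named after can be sketched as follows. This is a minimal illustration and not the training code used for this model: the function name `group_relative_advantages`, the `eps` value, the use of the population standard deviation, and the sample rewards are all assumptions made for the example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for the same prompt.

    Each completion's advantage is its reward minus the group mean,
    divided by the group standard deviation (plus eps for stability).
    """
    if not rewards:
        return []
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four candidate rewrites of a single SQL query.
rewards = [0.2, 0.8, 0.5, 0.5]
advantages = group_relative_advantages(rewards)
```

Completions that score above the group mean receive a positive advantage and are reinforced, and those below the mean are discouraged. If every completion in a group earns the same reward, all advantages are zero and that group contributes no learning signal.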