license: mit
base_model: Qwen/Qwen2.5-0.5B-Instruct
tags:
  - grpo
  - reinforcement-learning
  - sql
  - optimization

GRPO SQL Optimizer

This model is Qwen/Qwen2.5-0.5B-Instruct fine-tuned with GRPO (Group Relative Policy Optimization), a reinforcement learning method, to optimize SQL queries. Training used a DuckDB execution environment.
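As a rough illustration of how an execution environment can score a rewritten query, the sketch below runs the original and the candidate query, returns 0.0 if the candidate fails or returns different rows, and otherwise returns a capped speedup. This is not the reward used to train this model: `timed_rows`, `sql_reward`, the `max_reward` cap, and the order-insensitive row comparison are all hypothetical choices made for the example. It assumes only a connection object with `execute(sql).fetchall()`.

```python
import time


def timed_rows(con, sql):
    """Run `sql` on `con` and return (rows, elapsed_seconds)."""
    start = time.perf_counter()
    rows = con.execute(sql).fetchall()
    return rows, time.perf_counter() - start


def sql_reward(con, original_sql, candidate_sql, max_reward=2.0):
    """Hypothetical verifiable reward for a rewritten query.

    Returns 0.0 if the candidate errors or changes the result set,
    otherwise the measured speedup over the original, capped at max_reward.
    """
    reference_rows, reference_time = timed_rows(con, original_sql)
    try:
        candidate_rows, candidate_time = timed_rows(con, candidate_sql)
    except Exception:
        return 0.0  # candidate failed to parse or execute

    # Assumption: row order is not significant, so compare as multisets.
    if sorted(map(repr, reference_rows)) != sorted(map(repr, candidate_rows)):
        return 0.0

    # Guard against a zero-length timing before dividing.
    speedup = reference_time / max(candidate_time, 1e-9)
    return min(speedup, max_reward)
```

With the `duckdb` Python package installed, `duckdb.connect()` returns an in-memory connection that can be passed as `con`. Single-run timings are noisy, so a real setup would likely repeat each query and use a median.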

Results

Trained on Kaggle with two T4 GPUs (the "GPU T4 x2" accelerator) using GRPO with verifiable rewards.
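For readers new to GRPO, the core idea is that several completions are sampled for the same prompt and each completion's reward is normalized against the others in its group. The sketch below shows that normalization in plain Python. The function name and `eps` value are illustrative, and the use of the population standard deviation is an assumption, since implementations differ on this detail.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize one prompt's group of rewards to zero mean and unit scale.

    rewards: list of scalar rewards, one per sampled completion.
    eps: small constant (illustrative value) that avoids division by zero
         when every completion in the group gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population standard deviation
    return [(r - mu) / (sigma + eps) for r in rewards]
```

Completions that score above the group average get a positive advantage and those below get a negative one. If all completions receive the same reward, every advantage is zero and the group contributes no learning signal.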