Add GRPO RL training script with execution reward function 33bac25 verified olanigan committed 14 days ago
Add SFT training script with multi-template tool formatting 90e71a6 verified olanigan committed 14 days ago
Add README with overview, quick start, and dataset links 44fda62 verified olanigan committed 14 days ago