Phase 5: training pipeline — client, GRPO trainer, eval, baselines (23 handcrafted attacks) 550a83e Jaswanth1210 Claude Sonnet 4.6 committed 23 days ago