Align eval prompts with training: add required_keys to initial context 5c18f41 Anurag Agarwal committed on Apr 26
Fix parser: handle quoted commas, balanced parens, ASK:/PROPOSE: prefixes 7c0cc92 verified agarwalanu3103 committed on Apr 25
Eval system prompt: align character-for-character with training PROMPT - ensures trained model has zero distribution shift between train and eval d9beb62 verified agarwalanu3103 committed on Apr 25
Eval system prompt: drop misleading software-stack example, align with training PROMPT (forces model to use task-family fields, not copy the example verbatim) ef5498c verified agarwalanu3103 committed on Apr 25
Parser: support ASK:/PROPOSE:/Q:/PLAN: prefix forms produced by Qwen3 GRPO b8a5922 verified agarwalanu3103 committed on Apr 25
inference: parser fix - handle key=value in func calls + balanced parens f251890 verified agarwalanu3103 committed on Apr 25
fix(eval): pass enable_thinking=False to disable Qwen3 thinking + bump MAX_TOKENS to 800 e4d1233 verified agarwalanu3103 committed on Apr 25
Add training/train_grpo.ipynb - GRPO training notebook (TRL + vLLM + ClarifyEnv) 5e8f794 Anurag Agarwal committed on Apr 25