ui: chatbot dark-neon theme - readable bubbles on Watch Agent Play d90960c Anurag Agarwal Cursor committed on 9 days ago
fix(ui): drop unsupported type= kwarg; dict format works by default a435a88 Anurag Agarwal Cursor committed on 9 days ago
fix(ui): Gradio chatbot format error on Watch Agent Play d63a291 Anurag Agarwal Cursor committed on 9 days ago
UI: fix '0 runs' chip + instant tab-card flip via js= callback e0be40d Anurag Agarwal committed on 21 days ago
Semantic run names: Probe/Drift/Anchor/Restrain/Champion + regen all plots 84fbeda Anurag Agarwal committed on 21 days ago
Reframe: drop Run 7 from hero, keep only where appropriate 023e210 Anurag Agarwal committed on 21 days ago
Stronger opening: meeting scheduling with 3 fabricated fields 5df7029 Anurag Agarwal committed on 21 days ago
Expand Blog.md to comprehensive deep-dive (5.9k words, all evidence) 7fbfbc0 Anurag Agarwal committed on 21 days ago
Reframe: environment is the contribution, training is validation ea8263a Anurag Agarwal committed on 22 days ago
Judge-ready polish: diagram, before/after, curated replays af3c208 Anurag Agarwal committed on 22 days ago
Fix neon CSS injection for Gradio 6.x + Blog.md at root 8fb3486 Anurag Agarwal committed on 22 days ago
Neon cyberpunk theme - dark bg, glowing accents, stat cards b4f213a Anurag Agarwal committed on 22 days ago
Fix Gradio compatibility for openenv's bundled version 712275f Anurag Agarwal committed on 22 days ago
Run 6 results + training fixes + all plots regenerated aae07d0 Anurag Agarwal committed on 22 days ago
Align eval prompts with training: add required_keys to initial context 5c18f41 Anurag Agarwal committed on 22 days ago
plots: add training progression + diagnostics, drop W&B links 099bec8 verified agarwalanu3103 committed on 22 days ago
docs: README aligned with hackathon judging criteria (Judges 60s tour + storytelling arc + plot captions) 310de9a verified agarwalanu3103 committed on 22 days ago
docs: sync README.md (slide deck + auto-validator gate update) b6de3a6 verified agarwalanu3103 committed on 22 days ago
docs: sync SUBMISSION_CHECKLIST.md (slide deck + auto-validator gate update) 753d688 verified agarwalanu3103 committed on 22 days ago
docs: sync docs/slides.md (slide deck + auto-validator gate update) 5e0e1b0 verified agarwalanu3103 committed on 22 days ago
docs: add detailed model cards for Run 1 / Run 2 / Run 4 f1678ab verified agarwalanu3103 committed on 22 days ago
Add plots/ for inline embedding in env Space README ac86191 verified agarwalanu3103 committed on 22 days ago
Sync canonical README (KL-anchor narrative, embedded plots, deliverable links) 7fe6783 verified agarwalanu3103 committed on 22 days ago
eval: enforce one-tool-call response format on every turn a22fcfd verified agarwalanu3103 committed on 22 days ago
Fix parser: handle quoted commas, balanced parens, ASK:/PROPOSE: prefixes 7c0cc92 verified agarwalanu3103 committed on 22 days ago
Eval system prompt: align character-for-character with training PROMPT - ensures trained model has zero distribution shift between train and eval d9beb62 verified agarwalanu3103 committed on 22 days ago
Eval system prompt: drop misleading software-stack example, align with training PROMPT (forces model to use task-family fields, not copy the example verbatim) ef5498c verified agarwalanu3103 committed on 22 days ago
Parser: support ASK:/PROPOSE:/Q:/PLAN: prefix forms produced by Qwen3 GRPO b8a5922 verified agarwalanu3103 committed on 22 days ago
inference: parser fix - handle key=value in func calls + balanced parens f251890 verified agarwalanu3103 committed on 22 days ago
fix(eval): pass enable_thinking=False to disable Qwen3 thinking + bump MAX_TOKENS to 800 e4d1233 verified agarwalanu3103 committed on 22 days ago
feat: add run_eval.py to Space (needed by eval_with_vllm.py for trained-model evals) 6473a24 verified agarwalanu3103 committed on 22 days ago
env: enable concurrent rollout sessions (max_concurrent_envs=8) 895f00d Anurag Agarwal committed on 22 days ago
rewrite training notebook with cleaner cell-by-cell structure a45e7e7 Anurag Agarwal committed on 22 days ago
Add training/train_grpo.ipynb - GRPO training notebook (TRL + vLLM + ClarifyEnv) 5e8f794 Anurag Agarwal committed on 22 days ago
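
Note on the parser commits (7c0cc92, b8a5922, f251890): the messages name four behaviours, namely quoted commas, balanced parens, key=value arguments, and ASK:/PROPOSE:/Q:/PLAN: prefixes. The repository's actual parser is not shown in this log, so the following is only a minimal sketch of how such a parser could work. The function names (`split_top_level`, `unquote`, `parse_tool_call`), the return shape, and the example tool names are all hypothetical.

```python
import re

# Prefixes named in the commit messages; the exact matching rules are assumed.
PREFIX_RE = re.compile(r"^\s*(ASK|PROPOSE|Q|PLAN)\s*:\s*", re.IGNORECASE)
CALL_RE = re.compile(r"^([A-Za-z_]\w*)\s*\((.*)\)\s*$", re.DOTALL)
IDENT_RE = re.compile(r"[A-Za-z_]\w*")


def split_top_level(text, sep=","):
    """Split on `sep` only outside quotes and outside nested brackets."""
    parts, buf, depth, quote = [], [], 0, None
    i = 0
    while i < len(text):
        ch = text[i]
        if quote:
            buf.append(ch)
            if ch == "\\" and i + 1 < len(text):
                # Keep an escaped character verbatim (e.g. \" inside a string).
                buf.append(text[i + 1])
                i += 1
            elif ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
            buf.append(ch)
        elif ch in "([{":
            depth += 1
            buf.append(ch)
        elif ch in ")]}":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced closing bracket")
            buf.append(ch)
        elif ch == sep and depth == 0:
            parts.append("".join(buf).strip())
            buf = []
        else:
            buf.append(ch)
        i += 1
    if depth != 0 or quote:
        raise ValueError("unbalanced brackets or unterminated quote")
    tail = "".join(buf).strip()
    if tail:
        parts.append(tail)
    return parts


def unquote(value):
    """Strip one pair of matching surrounding quotes, if present."""
    value = value.strip()
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
        return value[1:-1]
    return value


def parse_tool_call(line):
    """Return (prefix, name, args, kwargs), or None if the line is not a call."""
    prefix = None
    m = PREFIX_RE.match(line)
    if m:
        prefix = m.group(1).upper()
        line = line[m.end():]
    call = CALL_RE.match(line.strip())
    if not call:
        return None
    name, body = call.group(1), call.group(2)
    try:
        parts = split_top_level(body)
    except ValueError:
        return None
    args, kwargs = [], {}
    for part in parts:
        key, eq, value = part.partition("=")
        # Treat as key=value only when the left side is a bare identifier,
        # so a quoted string containing "=" stays a positional argument.
        if eq and IDENT_RE.fullmatch(key.strip()):
            kwargs[key.strip()] = unquote(value)
        else:
            args.append(unquote(part))
    return prefix, name, args, kwargs
```

For example, `parse_tool_call('ASK: ask_question("What time, exactly?", field=time)')` keeps the quoted comma inside the first argument and reads `field=time` as a keyword argument. Argument values are returned as plain strings; any type coercion would be a separate step.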