Maximize environment: curriculum task, metrics endpoint, 5 bug fixes, notebook fix 4890422 ps2181 Claude Sonnet 4.6 committed on Apr 25
Add long_horizon/personalized tasks + GitHub-hosted training curves 473ab10 ps2181 Claude Sonnet 4.6 committed on Apr 25
Implement all audit fixes: base class, curves, README, code quality b02956e ps2181 Claude Sonnet 4.6 committed on Apr 25
Remove show_download_button — unsupported in this Gradio version e422a3e ps2181 Claude Sonnet 4.6 committed on Apr 25
Fix training curves: switch gr.Plot → gr.Image with PNG bytes 7f1e860 ps2181 Claude Sonnet 4.6 committed on Apr 25
Fix /web 404: guard matplotlib import and harden Gradio mount 707a8d9 ps2181 Claude Sonnet 4.6 committed on Apr 25
Add Training Results tab with GRPO reward curves for all 3 agents 5a9c33c ps2181 Claude Sonnet 4.6 committed on Apr 25
Fix pipeline UI: total regex case-insensitive, deduplicate invoice IDs aa15f22 ps2181 Claude Sonnet 4.6 committed on Apr 25
Auto-seed Regulator tracker on startup — pipeline demo works immediately on cold start 7fd4d28 ps2181 Claude Sonnet 4.6 committed on Apr 25
Add Multi-Agent Pipeline tab — live 5-agent episode trace e595317 ps2181 Claude Sonnet 4.6 committed on Apr 25
Add Generator adversarial GRPO training + /generator/score endpoint f45efdb ps2181 Claude Sonnet 4.6 committed on Apr 25
Add 3 novelty upgrades: predictive Regulator, compound fraud, confidence calibration 48cc8c7 ps2181 Claude Sonnet 4.6 committed on Apr 25
Add multi-agent architecture: Regulator, biased Generator, Auditor rewards 02b8804 ps2181 Claude Sonnet 4.6 committed on Apr 25
Add Gradio web UI mounted at /web for interactive agent testing 8afb151 ps2181 Claude Sonnet 4.6 committed on Apr 7
Add WebSocket /ws endpoint required by openenv-core GenericEnvClient 4390d4f ps2181 Claude Sonnet 4.6 committed on Apr 7
Fix: score formatting was rounding 0.9999 to 1.000 in stdout logs 8dc2806 ps2181 Claude Sonnet 4.6 committed on Apr 7
Fix: clamp all remaining hardcoded 0.0/1.0 score returns af66f63 ps2181 Claude Sonnet 4.6 committed on Apr 7
Fix: clamp all task scores to strictly open interval (0, 1) b9b7965 ps2181 Claude Sonnet 4.6 committed on Apr 7
Add adversarial, negotiate, supply_chain tasks + dynamic difficulty + richer rewards 59a05a5 ps2181 Claude Sonnet 4.6 committed on Apr 4
Fix concurrent state conflicts with session-based environment registry ca75708 ps2181 Claude Sonnet 4.6 committed on Apr 4
Add expert fraud audit task and improve inference feedback loop c0c1e0e ps2181 Claude Sonnet 4.6 committed on Apr 4