Maximize environment: curriculum task, metrics endpoint, 5 bug fixes, notebook fix 4890422 ps2181 Claude Sonnet 4.6 committed on 16 days ago
Add long_horizon/personalized tasks + GitHub-hosted training curves 473ab10 ps2181 Claude Sonnet 4.6 committed on 16 days ago
Implement all audit fixes: base class, curves, README, code quality b02956e ps2181 Claude Sonnet 4.6 committed on 16 days ago
Remove show_download_button: unsupported in this Gradio version e422a3e ps2181 Claude Sonnet 4.6 committed on 16 days ago
Fix training curves: switch gr.Plot → gr.Image with PNG bytes 7f1e860 ps2181 Claude Sonnet 4.6 committed on 16 days ago
Fix /web 404: guard matplotlib import and harden Gradio mount 707a8d9 ps2181 Claude Sonnet 4.6 committed on 16 days ago
Add Training Results tab with GRPO reward curves for all 3 agents 5a9c33c ps2181 Claude Sonnet 4.6 committed on 16 days ago
Fix pipeline UI: total regex case-insensitive, deduplicate invoice IDs aa15f22 ps2181 Claude Sonnet 4.6 committed on 16 days ago
Wire trained LoRA agents into pipeline demo UI e2f0d06 ps2181 Claude Sonnet 4.6 committed on 16 days ago
Auto-seed Regulator tracker on startup: pipeline demo works immediately on cold start 7fd4d28 ps2181 Claude Sonnet 4.6 committed on 16 days ago
Add Multi-Agent Pipeline tab: live 5-agent episode trace e595317 ps2181 Claude Sonnet 4.6 committed on 16 days ago
Add Generator adversarial GRPO training + /generator/score endpoint f45efdb ps2181 Claude Sonnet 4.6 committed on 16 days ago
Add 3 novelty upgrades: predictive Regulator, compound fraud, confidence calibration 48cc8c7 ps2181 Claude Sonnet 4.6 committed on 16 days ago
Add multi-agent architecture: Regulator, biased Generator, Auditor rewards 02b8804 ps2181 Claude Sonnet 4.6 committed on 16 days ago
Add Gradio web UI mounted at /web for interactive agent testing 8afb151 ps2181 Claude Sonnet 4.6 committed on Apr 7
Add WebSocket /ws endpoint required by openenv-core GenericEnvClient 4390d4f ps2181 Claude Sonnet 4.6 committed on Apr 7
Fix: score formatting was rounding 0.9999 to 1.000 in stdout logs 8dc2806 ps2181 Claude Sonnet 4.6 committed on Apr 7
Fix: clamp all remaining hardcoded 0.0/1.0 score returns af66f63 ps2181 Claude Sonnet 4.6 committed on Apr 7
Fix: clamp all task scores to strictly open interval (0, 1) b9b7965 ps2181 Claude Sonnet 4.6 committed on Apr 7
Add adversarial, negotiate, supply_chain tasks + dynamic difficulty + richer rewards 59a05a5 ps2181 Claude Sonnet 4.6 committed on Apr 4
Fix concurrent state conflicts with session-based environment registry ca75708 ps2181 Claude Sonnet 4.6 committed on Apr 4
Add expert fraud audit task and improve inference feedback loop c0c1e0e ps2181 Claude Sonnet 4.6 committed on Apr 4
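
The three score fixes from Apr 7 (b9b7965, af66f63, 8dc2806) are described only by their commit messages, so what follows is a minimal sketch of the general idea rather than the repository's code. The names `clamp_score`, `format_score`, and the margin `EPS` are hypothetical, and the value of `EPS` is an assumption. The sketch clamps a raw score into the open interval (0, 1) and formats it by truncation, so a value such as 0.9999 is not printed as 1.000.

```python
import math
from decimal import Decimal, ROUND_DOWN

# Hypothetical margin; the commits do not state the value actually used.
EPS = 1e-4


def clamp_score(raw: float, eps: float = EPS) -> float:
    """Clamp a raw score into the open interval (0, 1)."""
    if math.isnan(raw):
        # Assumption: treat an undefined score as the lowest allowed value.
        return eps
    return min(max(raw, eps), 1.0 - eps)


def format_score(score: float, places: int = 3) -> str:
    """Format a score by truncating (not rounding) to `places` decimals."""
    quantum = Decimal(1).scaleb(-places)  # e.g. Decimal("0.001") for places=3
    truncated = Decimal(str(score)).quantize(quantum, rounding=ROUND_DOWN)
    return format(truncated, "f")


if __name__ == "__main__":
    for raw in (0.0, 0.5, 1.0):
        print(raw, "->", format_score(clamp_score(raw)))
```

Truncation is one possible choice here. Printing more decimal places would also avoid showing 1.000, at the cost of longer log lines.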