Commit History

Maximize environment: curriculum task, metrics endpoint, 5 bug fixes, notebook fix
4890422

ps2181 Claude Sonnet 4.6 committed on

Add long_horizon/personalized tasks + GitHub-hosted training curves
473ab10

ps2181 Claude Sonnet 4.6 committed on

Implement all audit fixes: base class, curves, README, code quality
b02956e

ps2181 Claude Sonnet 4.6 committed on

Remove show_download_button - unsupported in this Gradio version
e422a3e

ps2181 Claude Sonnet 4.6 committed on

Fix training curves: switch gr.Plot → gr.Image with PNG bytes
7f1e860

ps2181 Claude Sonnet 4.6 committed on

Fix /web 404: guard matplotlib import and harden Gradio mount
707a8d9

ps2181 Claude Sonnet 4.6 committed on

Update training curves with real Colab data
5eea840

ps2181 Claude Sonnet 4.6 committed on

Add Training Results tab with GRPO reward curves for all 3 agents
5a9c33c

ps2181 Claude Sonnet 4.6 committed on

Overhaul UI + README for submission
ed15028

ps2181 Claude Sonnet 4.6 committed on

Fix pipeline UI: total regex case-insensitive, deduplicate invoice IDs
aa15f22

ps2181 Claude Sonnet 4.6 committed on

Wire trained LoRA agents into pipeline demo UI
e2f0d06

ps2181 Claude Sonnet 4.6 committed on

Auto-seed Regulator tracker on startup - pipeline demo works immediately on cold start
7fd4d28

ps2181 Claude Sonnet 4.6 committed on

Add Multi-Agent Pipeline tab - live 5-agent episode trace
e595317

ps2181 Claude Sonnet 4.6 committed on

Add Generator adversarial GRPO training + /generator/score endpoint
f45efdb

ps2181 Claude Sonnet 4.6 committed on

Add 3 novelty upgrades: predictive Regulator, compound fraud, confidence calibration
48cc8c7

ps2181 Claude Sonnet 4.6 committed on

Add multi-agent architecture: Regulator, biased Generator, Auditor rewards
02b8804

ps2181 Claude Sonnet 4.6 committed on

Add Gradio web UI mounted at /web for interactive agent testing
8afb151

ps2181 Claude Sonnet 4.6 committed on

Add WebSocket /ws endpoint required by openenv-core GenericEnvClient
4390d4f

ps2181 Claude Sonnet 4.6 committed on

Fix: score formatting was rounding 0.9999 to 1.000 in stdout logs
8dc2806

ps2181 Claude Sonnet 4.6 committed on

Fix: clamp all remaining hardcoded 0.0/1.0 score returns
af66f63

ps2181 Claude Sonnet 4.6 committed on

Fix: clamp all task scores to strictly open interval (0, 1)
b9b7965

ps2181 Claude Sonnet 4.6 committed on

Add adversarial, negotiate, supply_chain tasks + dynamic difficulty + richer rewards
59a05a5

ps2181 Claude Sonnet 4.6 committed on

Fix concurrent state conflicts with session-based environment registry
ca75708

ps2181 Claude Sonnet 4.6 committed on

Add expert fraud audit task and improve inference feedback loop
c0c1e0e

ps2181 Claude Sonnet 4.6 committed on

Fix openenv validate issues
56faa2e

ps2181 Claude Sonnet 4.6 committed on

Add full invoice processing pipeline environment
0bf71ce

ps2181 Claude Sonnet 4.6 committed on
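
Three of the fixes above (b9b7965, af66f63, 8dc2806) concern keeping task scores strictly inside the open interval (0, 1) and stopping log formatting from rounding 0.9999 up to 1.000. The repository's actual code is not shown here, so the following is only a minimal sketch of how such a fix might look. The names `clamp_score` and `format_score` and the margin `EPS` are hypothetical, chosen for illustration.

```python
# Hypothetical sketch: EPS, clamp_score and format_score are illustrative
# names, not taken from the repository.
EPS = 1e-4  # assumed margin that keeps scores away from exactly 0.0 and 1.0


def clamp_score(score: float, eps: float = EPS) -> float:
    """Clamp a raw score into [eps, 1 - eps], a subset of the open interval (0, 1)."""
    return min(max(score, eps), 1.0 - eps)


def format_score(score: float) -> str:
    """Format with four decimals so a clamped 0.9999 is not displayed as 1.000."""
    return f"{clamp_score(score):.4f}"


# Three decimals round 0.9999 up, which is the logging problem described above.
three_decimals = f"{0.9999:.3f}"
four_decimals = format_score(1.0)
```

With three decimals the value 0.9999 prints as 1.000, while four decimals preserve the margin, so the display precision has to be at least as fine as the clamping margin.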