Fix campaign IDs: load dynamically from env instead of hardcoded values e8094c5 Soham Banerjee committed on Apr 8
Update README path to server/app.py structure and dependencies c769b2e Soham Banerjee committed on Apr 8
Merge branch 'main' of https://github.com/oki-dokii/Meta into bruh ef72aeb Soham Banerjee committed on Apr 8
Fix validate-submission issues: pyproject.toml setup, server entrypoint, app path, and uv.lock 29ae803 Soham Banerjee committed on Apr 8
fix: Appended the required score=<score> to [END] stdout logs for OpenEnv compatibility b377684 Soham Banerjee committed on Apr 8
chore: strict compliance with OpenEnv inference env variables ast-validation de96010 Soham Banerjee committed on Apr 8
docs: Fix penalties header to remove negative reward reference 6e93689 Soham Banerjee committed on Apr 8
docs: Update reward constraints copy to 0.0-1.0 in UI and README e486f66 Soham Banerjee committed on Apr 8
Fix page unresponsiveness by removing demo.load() events e18dfa2 Jashandeep Singh Copilot committed on Apr 7
Fix event handlers by moving functions to module level 783ba73 Jashandeep Singh Copilot committed on Apr 7
Simplify theme/CSS for Gradio 6.0 and improve performance a617197 Jashandeep Singh Copilot committed on Apr 7
Fix Gradio theme/css initialization to make buttons work bb0d19d Jashandeep Singh Copilot committed on Apr 7
Finalise OpenEnv submission: Clamp rewards to 0.0-1.0, update Gradio UI, and add Groq pipelines 192db9d Soham Banerjee committed on Apr 7
inference.py: Groq default + dynamic scenario loading + campaign/adversarial prompt hints 2421327 Soham Banerjee committed on Apr 5
README: add judge tip for deterministic reset(campaign_id=...) under campaign section 64f2c91 Soham Banerjee committed on Apr 5
Restore moderation_benchmark.json from d741d4b (128 scenarios, 100/100 checks) b2860e4 Soham Banerjee committed on Apr 5
Merge + 3 fixes: README accurate (128 scen / real baselines), is_adversarial in state, reset(campaign_id) (100/100 checks) fa17b3c Soham Banerjee committed on Apr 5
3 fixes: README accurate, is_adversarial in state, reset(campaign_id) (100/100) 10c3c6e Soham Banerjee committed on Apr 5
Appeal mechanic: is_adversarial + env.appeal() 2-turn flow (92/92 checks) d741d4b Soham Banerjee committed on Apr 4
Fill easy GT gaps: full label×action coverage (79/79 checks) a4c538a Soham Banerjee committed on Apr 4
Graduated severity penalty: sev-5→-0.30, sev-4→-0.15, sev-3→-0.05 (66/66) 94717ed Soham Banerjee committed on Apr 4
Cross-post campaign mechanic: campaign_id in state, +0.15 escalate-all bonus (61/61 checks) 748cef6 Soham Banerjee committed on Apr 4
10 ambiguous hard scenarios + full valid_actions test suite (53/53 checks) 941d83d Soham Banerjee committed on Apr 4
v2 docs & validation: README rewrite, openenv.yaml v2.0, validator 47/47 68d61d8 Soham Banerjee committed on Apr 4
Update inference.py: expand TASKS to all 75 scenarios (25 easy / 20 medium / 30 hard) 2426958 Soham Banerjee committed on Apr 4