Restructure README to required format: overview, spaces, tasks, setup, baseline f2195b2 Somuai12 committed about 1 month ago
Fix: clamp scores to strict (0.001, 0.999); validator rejects exact 0 and 1 95a7dc0 Somuai12 committed on Apr 10
Audit fixes: tests/ dir, clean imports, reactive corpus, README polish 70f8688 Somuai12 committed on Apr 10
Staff-Level Upgrade: Segmented Evaluation, Noise Filtering, and Task Hardening 4553b37 Somuai12 committed on Apr 10
Implement profound exploit hardening (InstructionGuard, DensityCheck, LogicalAlignment, Step-Locking) a9f749a Somuai12 committed on Apr 9
Enhance: Upgrade test suite to professional simulation showing clear reward shaping 5453275 Somuai12 committed on Apr 8
Fix grading keys mismatch: allow actual dataset metrics to be graded 184bef3 Somuai12 committed on Apr 8
Fix model discovery: skip wildcard '*' model IDs from LiteLLM proxy 9c3ced0 Somuai12 committed on Apr 7
Fix Gradio dashboard hang: restore module-level mounting (required for queue/WebSocket) 9e34c41 Somuai12 committed on Apr 7
Fix MODEL_NAME=None + Fix Gradio dashboard slowness (remove auto-reset on tab/radio) 8eede32 Somuai12 committed on Apr 7
Fix MODEL_NAME=None: auto-discover from proxy /models endpoint, fallback to gpt-4o-mini 79fb14b Somuai12 committed on Apr 7
Final Submission: Aligned ports (8000), synchronized README, and purged workspace logs/caches dd5366d Somuai12 committed on Apr 7
Critical Fix: Align internal port to 8000 to satisfy OpenEnv library requirements 47a298a Somuai12 committed on Apr 7
Compliance Fix: Resolve setup timeout with lazy Gradio and extended 120s wait 6a19dc6 Somuai12 committed on Apr 7
Fix proxy test: exit with 1 on API failure so validator sees the error; fallback to HF_TOKEN if API_KEY is empty 899c12a Somuai12 committed on Apr 7
Compliance Hardening: Remove silent fallbacks to force proxy usage 292424c Somuai12 committed on Apr 7
Compliance fix: strictly use API_KEY and API_BASE_URL to avoid proxy bypass 09a9c72 Somuai12 committed on Apr 7
Allow pip to resolve websockets by relaxing gradio and uvicorn pins 5abef36 Somuai12 committed on Apr 7
Final fix for Docker registry failures: use stable python 3.12 and pin dependencies 7c9ac02 Somuai12 committed on Apr 7
Fix structured output: ensure logging always runs and format matches validator 9cdb062 Somuai12 committed on Apr 7
Fix Docker build: use python:3.11-slim-bookworm for stable registry resolution 75c1656 Somuai12 committed on Apr 7
Fix inference.py: async OpenEnv pattern, from_docker_image, proper error handling 4c68ece Somuai12 committed on Apr 7
fix: ensure reward evolution chart has (0,0) baseline for judge visibility 8085f66 Somuai12 committed on Apr 5
fix: resolve Gradio 6.x LinePlot TypeError and constructor warnings 199d538 Somuai12 committed on Apr 5
final: comprehensive 0.9+ strategic agent upgrades and infrastructure refactor 933baa6 Somuai12 committed on Apr 5
deploy: remove binary from git history for HF compatibility, use GitHub raw URL instead c5ca7a0 Somuai12 committed on Apr 3
hackathon: final submission candidate (removes binary image for HF compatibility) 6aa8acb Somuai12 committed on Apr 3