Commit History
Final strict spec-compliance polish: score precision, empty rewards, updated test assertions 6284048
Fix punctuation in project title 559f355
Varun committed on
Spec-compliance overhaul: remove difficulty_multiplier, weighted blend scoring, dep_hard fix, [END] format f3fd4ef
Fix dep_hard Counter bug, add fatal error handling, update README with 14-model benchmark 3466d21
Remove rate limiter (blocks evaluator) and fix score aggregation to clamped sum 3dfb5fe
Clean README FILE 6938d9f
docs: clean up README for public hackathon submission (hide internal scoring formulas) cff7056
fix(benchmark): Hardening multi-agent environment and strict score compliance 6f95f2a
Fix HF metadata 25c2f1c
Update README.md b8fe4b4 unverified
Varun committed on