improve: 20 tasks, richer keywords, enhanced reward/grader, bigram matching, compelling README b83c8ad hellinferno committed on Apr 11
fix: correct inference log format, align openenv.yaml task IDs, harden Dockerfile 852b5ea hellinferno Claude Sonnet 4.6 committed on Apr 10