Commit History
Remove duplicate file .eval_results/hle.yaml 6c2f5eb verified
Rename to hle_with_tools.yaml (separate from no-tools score) c68e98f verified
Add HLE evaluation result (with tools) 0edb852 verified
Add GPQA Eval Result (#37) 602d01e verified
Add MMLU-Pro evaluation result (#34) e2ee390 verified
Add HLE evaluation result (#31) 9258c1a verified
Update README.md: remove `extra_body` in disabling thinking (#27) 475d85c verified
add ed30813
zRzRzRzRzRzRzR committed on
update 49342a6
zRzRzRzRzRzRzR committed on
readme 26f32ad
zRzRzRzRzRzRzR committed on
fix readme 9fcd12a
zRzRzRzRzRzRzR committed on
new readme 2942382
zRzRzRzRzRzRzR committed on