Commit History
Remove duplicate file .eval_results/hle.yaml
6c2f5eb
verified
Rename to hle_with_tools.yaml (separate from no-tools score)
c68e98f
verified
Add HLE evaluation result (with tools)
0edb852
verified
Add GPQA Eval Result (#37)
602d01e
verified
Add MMLU-Pro evaluation result (#34)
e2ee390
verified
Add HLE evaluation result (#31)
9258c1a
verified
Update README.md: remove `extra_body` in disabling thinking (#27)
475d85c
verified
add
ed30813
Committed by zRzRzRzRzRzRzR
update
49342a6
Committed by zRzRzRzRzRzRzR
readme
26f32ad
Committed by zRzRzRzRzRzRzR
fix readme
9fcd12a
Committed by zRzRzRzRzRzRzR
new readme
2942382
Committed by zRzRzRzRzRzRzR