Commit History
Update .eval_results/gpqa.yaml 0dbd235 verified
Update .eval_results/hle_with_tools.yaml c3a7b73 verified
Update .eval_results/hle_with_tools.yaml a185b9b verified
Update .eval_results/hle_with_tools.yaml f7d34fb verified
Update .eval_results/hle.yaml 22c99c0 verified
Add evaluation results for GPQA, HLE 25f841a verified
Update README.md (#18) 360b49d
update 83d08ca
zRzRzRzRzRzRzR committed on
line 9329f32
zRzRzRzRzRzRzR committed on
sglang update 17b316b
zRzRzRzRzRzRzR committed on
work cce0887
zRzRzRzRzRzRzR committed on
init3 fbfe45b
zRzRzRzRzRzRzR committed on
init2 c35b39f
zRzRzRzRzRzRzR committed on
init2 1a68bb7
zRzRzRzRzRzRzR committed on
init baee9b2
zRzRzRzRzRzRzR committed on