fox1.3 / evaluate.py

Commit History

Add evaluate.py: Benchmark evaluation on HumanEval + MBPP
f7a5fb7
verified

teolm30 committed on