alf-combine / eval_agent_test.json

Commit History

Add alfworld combine (bidirectional) PRM, Qwen3-1.7B, test macro-F1=91.4
8890dbe
verified

wls04 committed on