Commit History

RLOO checkpoint at optimizer step 250 - Fixed prompt format, temp=0.1, lr=3e-6
e4ad155
verified

thomasjhuang committed on