b1_math_top_8 / train_results.json
End of training
c3ca317 verified
{
  "epoch": 5.0,
  "total_flos": 1.962803781229871e+18,
  "train_loss": 0.30276708501553246,
  "train_runtime": 23532.6537,
  "train_samples_per_second": 6.714,
  "train_steps_per_second": 0.052
}
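The throughput fields above can be combined into a few derived quantities that are easier to read. The following is a minimal sketch, assuming `train_runtime` is reported in seconds and that the two rate fields are averages over the whole run. The `derive` helper and its output key names are hypothetical, introduced here for illustration only. Because the rates in the file are rounded, the derived values are approximations, not exact counts.

```python
import json

# Metrics copied verbatim from train_results.json above.
results = json.loads("""
{
  "epoch": 5.0,
  "total_flos": 1.962803781229871e+18,
  "train_loss": 0.30276708501553246,
  "train_runtime": 23532.6537,
  "train_samples_per_second": 6.714,
  "train_steps_per_second": 0.052
}
""")


def derive(r):
    """Hypothetical helper: back out approximate totals from the reported rates.

    Assumes train_runtime is in seconds and both rates are run-wide averages.
    """
    runtime_s = r["train_runtime"]
    samples = runtime_s * r["train_samples_per_second"]
    steps = runtime_s * r["train_steps_per_second"]
    return {
        "runtime_hours": runtime_s / 3600,
        "approx_total_samples": samples,
        "approx_samples_per_epoch": samples / r["epoch"],
        "approx_total_steps": steps,
        # Ratio of the two rates: samples consumed per optimizer step.
        "approx_samples_per_step": (
            r["train_samples_per_second"] / r["train_steps_per_second"]
        ),
        "flos_per_second": r["total_flos"] / runtime_s,
    }


derived = derive(results)
for name, value in derived.items():
    print(f"{name}: {value:,.2f}")
```

The samples-per-step ratio comes out near 129, which would be consistent with an effective batch size of about 128 once rounding in `train_steps_per_second` is taken into account, though the file itself does not state the batch size.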