sdar4b-random-mask-esft-intent / train_results.json
SDAR-4B random_mask SFT on ESFT-intent (final)
191c270 verified
{
  "effective_tokens_per_sec": 147.5977248231877,
  "epoch": 3.0,
  "total_flos": 5.409409897081078e+17,
  "train_loss": 0.1565390659049589,
  "train_runtime": 1705.6081,
  "train_samples_per_second": 12.805,
  "train_steps_per_second": 0.201
}
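
The reported rates can be cross-checked against each other with a little arithmetic. The sketch below is not part of the repository. It assumes the fields follow the usual Hugging Face `Trainer` conventions (`train_runtime` in seconds, the two `*_per_second` values averaged over that runtime), which the file itself does not state. The helper name `derive` is hypothetical, and `effective_tokens_per_sec` is left uninterpreted because its definition is not given here.

```python
import json

# Metrics copied verbatim from train_results.json.
RESULTS_JSON = """
{
  "effective_tokens_per_sec": 147.5977248231877,
  "epoch": 3.0,
  "total_flos": 5.409409897081078e+17,
  "train_loss": 0.1565390659049589,
  "train_runtime": 1705.6081,
  "train_samples_per_second": 12.805,
  "train_steps_per_second": 0.201
}
"""


def derive(results):
    """Estimate run totals from the logged averages.

    Assumption: train_runtime is in seconds and the per-second rates are
    averages over the whole run. The rates are rounded in the file, so
    every value returned here is an estimate, not an exact count.
    """
    runtime = results["train_runtime"]
    samples_seen = results["train_samples_per_second"] * runtime
    steps = results["train_steps_per_second"] * runtime
    return {
        "samples_seen": samples_seen,
        "samples_per_epoch": samples_seen / results["epoch"],
        "optimizer_steps": steps,
        "samples_per_step": (
            results["train_samples_per_second"]
            / results["train_steps_per_second"]
        ),
        "flos_per_second": results["total_flos"] / runtime,
    }


derived = derive(json.loads(RESULTS_JSON))
for name, value in derived.items():
    print(f"{name}: {value:.4g}")
```

Under these assumptions the ratio of the two rates comes out at roughly 64 samples per step, which would be consistent with an effective batch size of 64, although the file does not confirm that setting.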