sdar4b-rm-K1-esft-intent / train_results.json
SDAR-4B random_mask K=1 ESFT-intent (final)
Commit 51d7357
{
    "effective_tokens_per_sec": 330.74608278063124,
    "epoch": 3.0,
    "total_flos": 5.409409894396723e+17,
    "train_loss": 0.06834091498837834,
    "train_runtime": 761.1394,
    "train_samples_per_second": 28.694,
    "train_steps_per_second": 0.225
}
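
The rate fields above can be turned back into approximate totals. The following is a minimal sketch, not part of the original run: it assumes `train_runtime` is in seconds and that each `*_per_second` field is a total divided by that runtime (the convention used by the Hugging Face Trainer, which this file appears to follow). The helper name `derive` and the output keys are hypothetical, and because the rates are rounded, the results are estimates only.

```python
import json

# Metrics copied verbatim from train_results.json.
RAW = """
{
    "effective_tokens_per_sec": 330.74608278063124,
    "epoch": 3.0,
    "total_flos": 5.409409894396723e+17,
    "train_loss": 0.06834091498837834,
    "train_runtime": 761.1394,
    "train_samples_per_second": 28.694,
    "train_steps_per_second": 0.225
}
"""


def derive(metrics):
    """Estimate totals from the reported rates (hypothetical helper).

    Assumes train_runtime is in seconds and every rate is total / runtime.
    """
    runtime = metrics["train_runtime"]
    total_steps = runtime * metrics["train_steps_per_second"]
    total_samples = runtime * metrics["train_samples_per_second"]
    return {
        "total_steps": total_steps,
        "total_samples": total_samples,
        # Ratio of the two rates approximates samples consumed per step.
        "samples_per_step": metrics["train_samples_per_second"]
        / metrics["train_steps_per_second"],
        "samples_per_epoch": total_samples / metrics["epoch"],
        "effective_tokens": runtime * metrics["effective_tokens_per_sec"],
    }


derived = derive(json.loads(RAW))
for name, value in derived.items():
    print(f"{name}: {value:.1f}")
```

Under these assumptions the run comes out at roughly 171 steps over 3 epochs, with about 128 samples per step and about 7,280 samples per epoch.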