sdar4b-ts-K1-esft-law / train_results.json
SDAR-4B trace_sft K=1 ESFT-law (final)
33244d3 verified
258 Bytes
{
    "effective_tokens_per_sec": 934.667920697108,
    "epoch": 3.0,
    "total_flos": 2.799606877430743e+17,
    "train_loss": 0.49932229929956895,
    "train_runtime": 513.7568,
    "train_samples_per_second": 21.652,
    "train_steps_per_second": 0.169
}
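
The rates in this file can be turned back into approximate run totals by multiplying by the runtime. The sketch below does that arithmetic on the values above. It assumes `train_runtime` is in seconds and that `effective_tokens_per_sec` is averaged over that same runtime; neither is stated in the file itself. The helper name `derive_totals` and its output keys are hypothetical, and because the reported rates are rounded, the results are estimates rather than exact counts.

```python
import json

# Metrics copied verbatim from train_results.json.
RAW = """{
    "effective_tokens_per_sec": 934.667920697108,
    "epoch": 3.0,
    "total_flos": 2.799606877430743e+17,
    "train_loss": 0.49932229929956895,
    "train_runtime": 513.7568,
    "train_samples_per_second": 21.652,
    "train_steps_per_second": 0.169
}"""


def derive_totals(metrics: dict) -> dict:
    """Estimate run totals from per-second rates (hypothetical helper)."""
    runtime = metrics["train_runtime"]  # assumption: seconds
    samples = metrics["train_samples_per_second"] * runtime
    return {
        "approx_steps": metrics["train_steps_per_second"] * runtime,
        "approx_samples": samples,
        "approx_samples_per_epoch": samples / metrics["epoch"],
        # Assumption: the token rate is averaged over the same runtime.
        "approx_effective_tokens": metrics["effective_tokens_per_sec"] * runtime,
    }


metrics = json.loads(RAW)
totals = derive_totals(metrics)

for name, value in totals.items():
    print(f"{name}: {value:,.1f}")
```

Under these assumptions the run comes out at roughly 87 optimizer steps and about 11,100 samples across the 3 epochs.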