Duplicate from lightblue/DeepSeek-R1-Distill-Qwen-1.5B-Multilingual
{
    "epoch": 1.0,
    "eval_loss": 0.9007720947265625,
    "eval_runtime": 0.2597,
    "eval_samples_per_second": 30.809,
    "eval_steps_per_second": 3.851,
    "total_flos": 3.270639151271117e+16,
    "train_loss": 1.0707874030483013,
    "train_runtime": 199.5779,
    "train_samples_per_second": 3.908,
    "train_steps_per_second": 0.491
}