Qwen2.5-3B-R1-Distill / train_results.json
Model save
1770b82 verified
216 Bytes
{
    "total_flos": 1913806159609856.0,
    "train_loss": 0.48002010657061983,
    "train_runtime": 43121.1313,
    "train_samples": 93733,
    "train_samples_per_second": 2.174,
    "train_steps_per_second": 0.017
}
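
The reported throughput figures can be cross-checked against each other. The following is a minimal sketch in Python, assuming `train_runtime` is in seconds (consistent with the per-second fields). The `derived_metrics` helper and its output keys are hypothetical names introduced here for illustration, not part of the original file. Because `train_steps_per_second` is rounded to three decimals, the step count it yields is only a rough estimate.

```python
import json

# Contents of train_results.json, copied from the file above.
RESULTS_JSON = """
{
    "total_flos": 1913806159609856.0,
    "train_loss": 0.48002010657061983,
    "train_runtime": 43121.1313,
    "train_samples": 93733,
    "train_samples_per_second": 2.174,
    "train_steps_per_second": 0.017
}
"""


def derived_metrics(results: dict) -> dict:
    """Recompute throughput figures from the raw fields (hypothetical helper)."""
    runtime = results["train_runtime"]  # assumed to be in seconds
    return {
        # Wall-clock duration in hours.
        "runtime_hours": runtime / 3600.0,
        # Should match the reported train_samples_per_second if the run
        # made a single pass over train_samples examples.
        "samples_per_second": results["train_samples"] / runtime,
        # Rough estimate only: train_steps_per_second is heavily rounded.
        "approx_steps": results["train_steps_per_second"] * runtime,
    }


results = json.loads(RESULTS_JSON)
metrics = derived_metrics(results)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")

# Compare the recomputed rate with the reported one.
reported = results["train_samples_per_second"]
print("rates agree:", abs(metrics["samples_per_second"] - reported) < 1e-3)
```

The recomputed rate agrees with the reported 2.174 samples per second, which is consistent with a single pass over the 93,733 samples in just under 12 hours. The file itself does not state the epoch count.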