DeepSeek-R1-Distill-Qwen-7B-BG / train_results.json
{
  "total_flos": 3707122466086912.0,
  "train_loss": 0.8589762968989311,
  "train_runtime": 56498.3641,
  "train_samples": 225129,
  "train_samples_per_second": 7.965,
  "train_steps_per_second": 0.996
}
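
The throughput fields above can be cross-checked against each other. The following is a minimal sketch, assuming `train_runtime` is in seconds (the usual Hugging Face Trainer convention, which this file does not state). The `summarize` helper and its key names are hypothetical, introduced only for illustration. Because the reported rates are rounded, the derived values are approximate.

```python
import json

# Metrics copied verbatim from train_results.json.
RESULTS_JSON = """
{
  "total_flos": 3707122466086912.0,
  "train_loss": 0.8589762968989311,
  "train_runtime": 56498.3641,
  "train_samples": 225129,
  "train_samples_per_second": 7.965,
  "train_steps_per_second": 0.996
}
"""


def summarize(results: dict) -> dict:
    """Derive rough quantities from the reported metrics (hypothetical helper)."""
    runtime_s = results["train_runtime"]  # assumed to be seconds
    samples_per_s = results["train_samples_per_second"]
    steps_per_s = results["train_steps_per_second"]
    return {
        # Wall-clock training time in hours.
        "runtime_hours": runtime_s / 3600,
        # Optimizer steps implied by the step rate and the runtime.
        "approx_total_steps": steps_per_s * runtime_s,
        # Samples consumed per step, i.e. the implied effective batch size.
        "approx_samples_per_step": samples_per_s / steps_per_s,
        # Samples processed overall, relative to the dataset size.
        "approx_passes_over_data": samples_per_s * runtime_s / results["train_samples"],
    }


summary = summarize(json.loads(RESULTS_JSON))
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions, the ratio of the sample rate to the step rate suggests an effective batch size, and the last value suggests how many passes over the `train_samples` examples the run made. Neither figure is recorded in the file itself, so both should be treated as inferences.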