SWE-UT-4B-Qwen3-Coder-Distill / all_results.json
mts666
Upload model files
3dd24ce verified
205 Bytes
{
    "epoch": 4.0,
    "total_flos": 273231709208576.0,
    "train_loss": 0.20862414125496379,
    "train_runtime": 10922.1006,
    "train_samples_per_second": 1.206,
    "train_steps_per_second": 0.019
}
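
The throughput fields above can be combined into rough totals. The following is a minimal sketch, assuming the file follows the usual Hugging Face Trainer conventions (train_runtime in seconds, rates averaged over the whole run, steps counted as optimizer steps); the summarize helper and its output keys are hypothetical names introduced only for illustration.

```python
import json

# Metrics copied verbatim from all_results.json.
RESULTS_JSON = """
{
    "epoch": 4.0,
    "total_flos": 273231709208576.0,
    "train_loss": 0.20862414125496379,
    "train_runtime": 10922.1006,
    "train_samples_per_second": 1.206,
    "train_steps_per_second": 0.019
}
"""


def summarize(results: dict) -> dict:
    """Derive approximate totals from the reported rates (hypothetical helper).

    Assumes train_runtime is in seconds and both rates are run-wide averages.
    The rates are rounded to three decimals in the file, so every value
    derived from them is an estimate, not an exact count.
    """
    runtime = results["train_runtime"]
    samples = runtime * results["train_samples_per_second"]
    steps = runtime * results["train_steps_per_second"]
    return {
        "runtime_hours": runtime / 3600,
        "approx_samples_seen": samples,
        "approx_samples_per_epoch": samples / results["epoch"],
        "approx_steps": steps,
        "approx_samples_per_step": samples / steps,
    }


summary = summarize(json.loads(RESULTS_JSON))
for name, value in summary.items():
    print(f"{name}: {value:.4g}")
```

With the values in this file, the run lasted about 3.03 hours and processed roughly 13,170 samples over 4 epochs, in roughly 208 steps. The step-derived figures are loose: a reported 0.019 steps per second could be anything from 0.0185 to 0.0195 before rounding, so the samples-per-step estimate of about 63 is only indicative of the effective batch size.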