Add model card: val_loss=4.446 ppl=85.3, scaling law data point 1cd3d0b verified LisaMegaWatts committed 2 days ago
Best checkpoint: val_loss=4.4460 ppl=85.3 at step 4932 fd5039b verified LisaMegaWatts committed 2 days ago
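
The two figures in these commit messages look consistent if ppl is the exponential of val_loss. This is a minimal sketch of that check. It assumes val_loss is a mean per-token cross-entropy in nats, which the log does not state. The function name `perplexity` is illustrative and not taken from the repository.

```python
import math


def perplexity(val_loss: float) -> float:
    """Return exp(val_loss), assuming the loss is mean cross-entropy in nats per token."""
    return math.exp(val_loss)


# Value reported in the "Best checkpoint" commit message.
ppl = perplexity(4.4460)
print(f"{ppl:.1f}")  # prints 85.3
```

If the loss had been measured in bits per token, the base would be 2 instead of e, and the reported numbers would not match.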