moe-topk-xsum-gqa / best_model
862 MB
ishro
Epoch 3: train_loss=3.5977, val_loss=3.5497
0b042f1 verified