Model card updated after epoch 1
README.md CHANGED

@@ -17,8 +17,8 @@ metrics: [loss, lm_loss, ponder_loss, perplexity_lm]
 Note: This model uses the T5 SentencePiece tokenizer. Perplexity numbers on WT103
 reported here are not directly comparable to GPT-2 BPE-based PPLs.
 
-### Latest Performance (Epoch
-- **Validation Loss**: 4.
-- **Validation LM Loss**: 4.
-- **Validation Ponder Loss**: 1.
-- **Validation Perplexity (LM-only)**:
+### Latest Performance (Epoch 1)
+- **Validation Loss**: 4.8248
+- **Validation LM Loss**: 4.8149
+- **Validation Ponder Loss**: 1.0064
+- **Validation Perplexity (LM-only)**: 123.34
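As a sanity check on the new numbers, the reported LM-only perplexity is consistent with the exponential of the reported LM loss — the standard relation, assuming the loss is mean token-level cross-entropy in nats (an assumption; the README does not state the loss units):

```python
import math

# Values from the updated model card.
lm_loss = 4.8149        # Validation LM Loss (assumed: mean cross-entropy, nats/token)
reported_ppl = 123.34   # Validation Perplexity (LM-only)

# Perplexity = exp(cross-entropy loss).
computed_ppl = math.exp(lm_loss)
print(f"{computed_ppl:.2f}")  # ≈ 123.34, matching the reported value
```

As the card's note warns, this perplexity is computed over T5 SentencePiece tokens, so it is not directly comparable to WT103 perplexities reported over GPT-2 BPE tokens: the two tokenizers segment the same text into different numbers of tokens, which changes the per-token average.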