BiBo-checkpoint / README.md
metadata
license: mit
library_name: transformers

Global batch size: 384, seq_len: 2048

Checkpoint every 500 steps

i.e. every 393,216,000 tokens (384 × 2048 × 500), or roughly 400M tokens
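The token counts follow directly from the batch size, sequence length, and step count. A minimal sketch of that arithmetic, assuming every sequence is packed to the full 2048 tokens (the helper names `tokens_seen` and `human` are illustrative, not part of any training code):

```python
GLOBAL_BATCH_SIZE = 384    # sequences per optimizer step (from this README)
SEQ_LEN = 2048             # tokens per sequence (from this README)
CHECKPOINT_INTERVAL = 500  # steps between checkpoints (from this README)


def tokens_seen(step: int) -> int:
    """Tokens processed after `step` optimizer steps, assuming full sequences."""
    return GLOBAL_BATCH_SIZE * SEQ_LEN * step


def human(n: int) -> str:
    """Format a token count with an M or B suffix, rounded."""
    if n >= 1_000_000_000:
        return f"{n / 1e9:.2f}B"
    return f"{n / 1e6:.0f}M"


if __name__ == "__main__":
    # Print the token count at each checkpoint up to step 2500.
    for step in range(CHECKPOINT_INTERVAL, 2501, CHECKPOINT_INTERVAL):
        n = tokens_seen(step)
        print(f"checkpoint-{step}: {n:,} tokens (~{human(n)})")
```

Each step consumes 384 × 2048 = 786,432 tokens, so any step number can be converted to a token count with `tokens_seen(step)`.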

Current revisions available (with tokens seen at each):

  • checkpoint-500 393M
  • checkpoint-1000 786M
  • checkpoint-1500 1.18B
  • checkpoint-2000 1.57B
  • checkpoint-2500 1.97B
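A specific checkpoint can presumably be pulled by passing its revision name to `from_pretrained`. The sketch below is an assumption-laden example rather than a confirmed recipe: the repository id `fhai50032/BiBo-checkpoint` is inferred from the page header, the helper names are hypothetical, and it is not confirmed that the model loads through `AutoModelForCausalLM` (a custom architecture may also need `trust_remote_code=True`).

```python
REPO_ID = "fhai50032/BiBo-checkpoint"  # assumption: inferred from the page header


def revision_for_step(step: int, interval: int = 500) -> str:
    """Return the revision name for a training step, e.g. 'checkpoint-1000'."""
    if step <= 0 or step % interval != 0:
        raise ValueError(f"step must be a positive multiple of {interval}, got {step}")
    return f"checkpoint-{step}"


def load_checkpoint(step: int):
    """Load the weights stored at the given checkpoint revision.

    Assumes the architecture is loadable via AutoModelForCausalLM; this is
    not confirmed by the README.
    """
    # Imported lazily so the helper above works without transformers installed.
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        REPO_ID, revision=revision_for_step(step)
    )
```

For example, `load_checkpoint(1500)` would request the `checkpoint-1500` revision. Only the revisions listed above are expected to exist.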

max_lr: 7e-5