forgetting_gate_2_4_256/metrics/jsonlines/train_data_info.jsonl
{"step": 0, "train_data_info/vocab_size": 50277, "train_data_info/global_tokens_per_batch": 2097152, "train_data_info/local_tokens_per_batch": 2097152, "train_data_info/batch_len": 2048, "train_data_info/seq_len": 2048, "train_data_info/total_tokens": 2055208960, "train_data_info/global_batch_size": 1024, "train_data_info/local_batch_size": 1024}