12d-4096v-40data / README.md

python -m scripts.base_train \
    --depth=12 \
    --max_seq_len=1024 \
    --device_batch_size=128 \
    --target_param_data_ratio=40
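
A rough sketch of what these flags imply for data throughput. It assumes, from the flag names only, that `--device_batch_size` counts sequences per device per forward pass and that `--target_param_data_ratio` is the number of training tokens per model parameter. The helper names and the `num_params` argument are hypothetical; the actual parameter count is not stated here.

```python
def tokens_per_device_step(device_batch_size: int, max_seq_len: int) -> int:
    """Tokens one device processes in a single forward/backward pass."""
    return device_batch_size * max_seq_len


def target_training_tokens(num_params: int, param_data_ratio: int) -> int:
    """Total training tokens, assuming the ratio means tokens per parameter."""
    return num_params * param_data_ratio


# Values taken from the command above.
print(tokens_per_device_step(device_batch_size=128, max_seq_len=1024))  # 131072
```

Under the same assumption, `target_training_tokens(num_params, 40)` gives the token budget once the parameter count of the depth-12 model is known.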