CoVT-Phase2-3expert-Full/scripts/zero2_pro6000.json
training scripts (sft_phase2.sh, deepspeed config, env)
{
  "zero_optimization": {
    "stage": 2,
    "allgather_partitions": true,
    "reduce_scatter": true,
    "overlap_comm": false,
    "contiguous_gradients": true,
    "reduce_bucket_size": 500000000.0,
    "allgather_bucket_size": 500000000.0
  },
  "bf16": {
    "enabled": true
  },
  "gradient_clipping": 1.0,
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "steps_per_print": 50,
  "wall_clock_breakdown": false,
  "train_batch_size": "auto"
}
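
The three "auto" values in this config are placeholders that the launcher fills in at run time; with the Hugging Face Transformers DeepSpeed integration they are derived from the training arguments. Whether sft_phase2.sh uses that integration is an assumption here. The sketch below shows the relationship DeepSpeed expects between the three fields once they are resolved: train_batch_size = train_micro_batch_size_per_gpu × gradient_accumulation_steps × number of GPUs. The function name `resolve_auto_batch_sizes` and the example values (4, 8, 2) are hypothetical and not taken from the repository.

```python
import copy
import json

# Batch-related subset of the config above; the other keys are not needed here.
CONFIG_TEXT = """
{
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "train_batch_size": "auto"
}
"""


def resolve_auto_batch_sizes(config, micro_batch, grad_accum, world_size):
    """Return a copy of config with "auto" batch fields replaced by numbers.

    Hypothetical helper for illustration only; it mimics what a launcher
    does, it is not part of DeepSpeed or Transformers.
    """
    resolved = copy.deepcopy(config)
    derived = {
        "train_micro_batch_size_per_gpu": micro_batch,
        "gradient_accumulation_steps": grad_accum,
        # Global batch size = per-GPU micro batch * accumulation steps * GPUs.
        "train_batch_size": micro_batch * grad_accum * world_size,
    }
    for key, value in derived.items():
        # Only overwrite fields that were left as "auto".
        if resolved.get(key) == "auto":
            resolved[key] = value
    return resolved


config = json.loads(CONFIG_TEXT)
# Assumed example values: 4 samples per GPU, 8 accumulation steps, 2 GPUs.
resolved = resolve_auto_batch_sizes(config, micro_batch=4, grad_accum=8, world_size=2)
print(json.dumps(resolved, indent=2))
```

If the fields are set by hand instead of left as "auto", the three numbers have to satisfy the same product, otherwise DeepSpeed rejects the config at initialization.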