miyuki2026/OpenMiniMind
miyuki2026 committed 16 days ago

Commit 44f10dc (1 parent: ce3fdcb)

update

Files changed (1)

examples/tutorials/dpo/ultrafeedback-dpo/step_2_train_dpo_model_unsloth_ddp_qlora.py

@@ -235,8 +235,8 @@ def main():
     dpo_config = DPOConfig(
         output_dir=args.output_model_dir,
         num_train_epochs=args.num_train_epochs,
-        per_device_train_batch_size=1 if debug_mode else 2,
-        per_device_eval_batch_size=1 if debug_mode else 2,
+        per_device_train_batch_size=1 if debug_mode else 3,
+        per_device_eval_batch_size=1 if debug_mode else 3,
         gradient_accumulation_steps=1 if debug_mode else 8,
         # gradient_checkpointing=True,
         # gradient_checkpointing_kwargs={"use_reentrant": False},
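
Note on the change: raising the per-device batch size from 2 to 3 while gradient accumulation stays at 8 also raises the number of samples per optimizer step. A minimal sketch of that arithmetic follows. The helper `effective_batch_size` and the `world_size` values are hypothetical, since the commit does not say how many DDP processes are launched; only the values 2, 3, and 8 come from the diff.

```python
def effective_batch_size(per_device: int, grad_accum_steps: int, world_size: int = 1) -> int:
    """Samples contributing to one optimizer step under data parallelism."""
    if min(per_device, grad_accum_steps, world_size) < 1:
        raise ValueError("all arguments must be positive integers")
    return per_device * grad_accum_steps * world_size


GRAD_ACCUM = 8  # non-debug value from the diff

# world_size is an assumption; the commit does not state the process count.
for world_size in (1, 2):
    old = effective_batch_size(2, GRAD_ACCUM, world_size)
    new = effective_batch_size(3, GRAD_ACCUM, world_size)
    print(f"world_size={world_size}: {old} -> {new} samples per optimizer step")
```

Because the effective batch grows by a factor of 1.5, the learning rate and warmup settings may deserve a second look, and per-step activation memory will likely rise as well, which matters with gradient checkpointing left commented out.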