abshetty committed on
Commit
df44985
·
verified ·
1 Parent(s): 0c33889

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +38 -1
README.md CHANGED
@@ -65,4 +65,41 @@ Cite TRL as:
65
  publisher = {GitHub},
66
  howpublished = {\url{https://github.com/huggingface/trl}}
67
  }
68
- ```
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
65
  publisher = {GitHub},
66
  howpublished = {\url{https://github.com/huggingface/trl}}
67
  }
68
+ ```
69
+ #Train the model
70
+ training_args = DPOConfig(
71
+ output_dir="llava-lora-12-05-dropout",
72
+ bf16=True,
73
+ gradient_checkpointing=True,
74
+ per_device_train_batch_size=8,
75
+ per_device_eval_batch_size=4,
76
+ gradient_accumulation_steps=32,
77
+ evaluation_strategy="steps",
78
+ eval_steps=1,
79
+ learning_rate=2e-6,
80
+ beta=0.1,
81
+ warmup_ratio=0.1,
82
+ lr_scheduler_type="cosine",
83
+ num_train_epochs=3,
84
+ dataset_num_proc=32, # tokenization will use 32 processes
85
+ dataloader_num_workers=32, # data loading will use 32 workers
86
+ logging_steps=1,
87
+ )
88
+
89
+ #Define LoRA configuration with specified rank
90
+ lora_config = LoraConfig(
91
+ r=64, # Set rank to 64
92
+ lora_alpha=128, # Set scaling factor to 128
93
+ target_modules="all-linear", # Target all linear layers
94
+ lora_dropout=0.1,
95
+ )
96
+
97
+ trainer = DPOTrainer(
98
+ model,
99
+ ref_model=None, # not needed when using peft
100
+ args=training_args,
101
+ train_dataset=train_dataset,
102
+ eval_dataset=eval_dataset,
103
+ tokenizer=processor,
104
+ peft_config=lora_config,
105
+ )