gofilipa/LoveIsBlind_Pods


gofilipa committed on Nov 26, 2025

Commit f71dad9 · verified · 1 Parent(s): 647c053

Adding specs for training

Files changed (1)

README.md +57 -3

@@ -1,3 +1,57 @@
----
-license: gpl-3.0
----

+---
+license: gpl-3.0
+base_model:
+- openai-community/gpt2
+---
+Fine-tuning specs:
+```python
+training_params = SFTConfig(
+    output_dir="checkpoints",
+    per_device_train_batch_size=1,
+    per_device_eval_batch_size=1,
+    gradient_accumulation_steps=2,
+    num_train_epochs=3,
+    learning_rate=1e-4, # lowered from 2e-4 to 1e-4
+    weight_decay=0.001,
+    dataset_text_field="text",
+    report_to="none",
+    bf16=False,
+    fp16=False,
+    dataloader_pin_memory=False,
+    remove_unused_columns=False,
+    max_length=512,
+    gradient_checkpointing=True,
+    dataloader_num_workers=0,
+    save_strategy="epoch",
+    logging_steps=100,
+    average_tokens_across_devices=False  # Fix for single device training
+    # loss_type is intentionally left unset to avoid the warning;
+    # the trainer then automatically uses ForCausalLMLoss, which is correct
+)
+# Configure model for gradient checkpointing compatibility
+model.config.use_cache = False
+trainer = SFTTrainer(
+    model=model,
+    train_dataset=ds['train'],
+    processing_class=tokenizer,
+    args=training_params
+)
+```
+Training outputs
+```python
+TrainOutput(
+    global_step=16773,
+    training_loss=2.056998251788356,
+    metrics={
+        'train_runtime': 3255.1858,
+        'train_samples_per_second': 10.305,
+        'train_steps_per_second': 5.153,
+        'total_flos': 164188359936000.0,
+        'train_loss': 2.056998251788356})
+```
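
As a sanity check on the numbers above, the reported throughput can be re-derived from the configuration and the step count. This is a minimal sketch, not part of the commit: it assumes a single training device (as the `average_tokens_across_devices` comment suggests), so the effective batch size is `per_device_train_batch_size * gradient_accumulation_steps`. The variable names are illustrative.

```python
# Values copied from the SFTConfig and TrainOutput shown in the diff.
per_device_train_batch_size = 1
gradient_accumulation_steps = 2
num_train_epochs = 3
global_step = 16773
train_runtime = 3255.1858  # seconds

# Assumption: one device, so no extra multiplier for data parallelism.
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps

# Optimizer steps per epoch, and the throughput figures implied by the run.
steps_per_epoch = global_step // num_train_epochs
steps_per_second = global_step / train_runtime
samples_per_second = global_step * effective_batch_size / train_runtime

print(effective_batch_size, steps_per_epoch)
print(round(steps_per_second, 3), round(samples_per_second, 3))
```

Under that assumption the derived rates round to the reported `train_steps_per_second` and `train_samples_per_second`, which suggests the logged metrics are internally consistent with a batch of 2 samples per optimizer step.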