Update README.md

The model was trained on an A100.
- **Precision:** 4-bit quantization (NF4) with double quantization, compute in bfloat16
- **Optimizer:** `paged_adamw_8bit`
- **Scheduler:** Cosine learning rate decay with 3% warmup
- **Batching:** Effective batch size of 24 (`per_device_train_batch_size=6`, `gradient_accumulation_steps=4`)
- **Epochs:** 1–2 (best checkpoint after 1 epoch, ~1600 steps)
- **Dropout:** 0.05 (LoRA)
- **LoRA rank:** 16 (`r=16`), scaling factor `alpha=64`
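The hyperparameters above correspond to a standard QLoRA setup. A minimal configuration sketch, assuming the Hugging Face `transformers`/`peft`/`bitsandbytes` stack (the README does not name the framework; the output directory is a placeholder):

```python
# Sketch of the training configuration described above.
# Assumption: Hugging Face transformers + peft + bitsandbytes.
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# Precision: 4-bit NF4 with double quantization, compute in bfloat16
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA: rank 16, alpha 64, dropout 0.05
lora_config = LoraConfig(
    r=16,
    lora_alpha=64,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# Optimizer, scheduler, batching: effective batch size 6 * 4 = 24
training_args = TrainingArguments(
    output_dir="out",                # placeholder
    per_device_train_batch_size=6,
    gradient_accumulation_steps=4,
    optim="paged_adamw_8bit",
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    num_train_epochs=1,              # best checkpoint after 1 epoch
    bf16=True,
)
```

These objects would then be passed to a `Trainer` (or `SFTTrainer`) together with the quantized base model; that wiring is omitted here.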