nadiva1243 commited on
Commit
21d45d4
·
verified ·
1 Parent(s): 76e9cef

Upload README.md with huggingface_hub

Browse files
Files changed (1) hide show
  1. README.md +1 -1
README.md CHANGED
@@ -68,7 +68,7 @@ The **full MonkeyGrab application repository is not public yet** (defense / publ
68
  - **Sequence limits:** `max_length` 4,096 tokens; context truncated to **2,048** tokens; generation up to **2,048** new tokens.
69
  - **Optimizer / schedule:** AdamW 8-bit, **lr** 5e-5, **cosine** decay with **warmup_ratio** 0.05, **weight_decay** 0.01, **max_grad_norm** 1.0.
70
  - **Batching:** `per_device_train_batch_size` 1, **gradient_accumulation_steps** 16 → **effective batch 16**; **bf16** + **TF32**; gradient checkpointing enabled.
71
- - **Epochs:** 3; checkpoints saved every **300** steps (keep last 3); eval every **150** steps; **load_best_model_at_end** on `eval_loss`; **early stopping** patience **5** evaluations.
72
 
73
  ### Evaluation protocol
74
 
 
68
  - **Sequence limits:** `max_length` 4,096 tokens; context truncated to **2,048** tokens; generation up to **2,048** new tokens.
69
  - **Optimizer / schedule:** AdamW 8-bit, **lr** 5e-5, **cosine** decay with **warmup_ratio** 0.05, **weight_decay** 0.01, **max_grad_norm** 1.0.
70
  - **Batching:** `per_device_train_batch_size` 1, **gradient_accumulation_steps** 16 → **effective batch 16**; **bf16** + **TF32**; gradient checkpointing enabled.
71
+ - **Epochs:** 3; checkpoints saved every **300** steps (keep last 3); eval every **150** steps; **load_best_model_at_end** on `eval_loss`; **early stopping** patience **3** evaluations.
72
 
73
  ### Evaluation protocol
74