agentlans committed on
Commit 96e69fb · verified · 1 Parent(s): 34bc7a9

Update README.md

Files changed (1)
  1. README.md +11 -0
README.md CHANGED
@@ -61,6 +61,17 @@ Venus orbits the Sun in just 183 days (about 243 Earth days), but it's the large
  - Wild variations between runs
  - Partly censored/uncensored

+ ## Key Training Settings
+
+ - bf16 precision, cutoff length 2048
+ - LoRA fine-tuning: rank 16, alpha 32, dropout 0
+ - Learning rate: 5e-5 with cosine scheduler
+ - Gradient accumulation steps: 8, batch size: 1
+ - Flash attention (fa2), neat packing enabled
+ - One epoch over 100K samples
+ - Optimizer: AdamW (PyTorch)
+ - Warmup steps: 0
+
  ## Licence

  Apache 2.0
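
The training settings added in this commit imply a few derived quantities that are easy to miss when reading the list. The following is a minimal sketch, not the author's training script: it assumes a single device (`NUM_DEVICES` is hypothetical, the README does not state it), the standard LoRA scaling of alpha divided by rank, and a plain cosine decay to zero. All variable and function names are illustrative.

```python
import math

# Values taken from the "Key Training Settings" list.
LORA_RANK = 16
LORA_ALPHA = 32
PEAK_LR = 5e-5
GRAD_ACCUM_STEPS = 8
BATCH_SIZE = 1
NUM_SAMPLES = 100_000
WARMUP_STEPS = 0

# Assumption: the README does not say how many devices were used.
NUM_DEVICES = 1

# Standard LoRA scales the low-rank update by alpha / rank.
lora_scale = LORA_ALPHA / LORA_RANK

# Sequences contributing to each optimizer step.
effective_batch = BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES

# Upper bound on optimizer steps for one epoch. With packing enabled,
# several samples share one 2048-token sequence, so the real count is lower.
max_steps = math.ceil(NUM_SAMPLES / effective_batch)


def cosine_lr(step: int, total_steps: int = max_steps) -> float:
    """Learning rate at `step` for linear warmup followed by cosine decay to zero."""
    if WARMUP_STEPS > 0 and step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / max(1, total_steps - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print(f"LoRA scale: {lora_scale}")
    print(f"Effective batch size: {effective_batch}")
    print(f"Max optimizer steps (no packing): {max_steps}")
    for s in (0, max_steps // 2, max_steps):
        print(f"step {s}: lr = {cosine_lr(s):.2e}")
```

Because warmup is 0, the schedule starts at the peak rate of 5e-5 on the first step and decays from there. The step count is only an upper bound, since the actual number depends on how many samples fit into each packed sequence.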