AbstractPhil committed
Commit bad8383 · verified · 1 Parent(s): 002f673

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +8 -8
README.md CHANGED
@@ -23,30 +23,30 @@ This repository contains multiple training runs using Cantor fusion architecture
 ## Training Strategy: AdamW + Warm Restarts
 
 This model uses **AdamW with Cosine Annealing Warm Restarts** (SGDR):
- - **Drop phase**: LR decays from 0.0003 → 1e-07 over 40 epochs
- - **Restart phase**: LR jumps back to 0.0003 to explore new regions
 - **Cycle multiplier**: Each cycle is 1.5x longer than previous
 - **Benefits**: Automatic exploration + exploitation, finds better minima, robust training
 
 ### 🚀 LR Boost at Restarts (NEW!)
 This run uses **restart_lr_mult = 1.25x**:
 - Normal restart: 3e-4 → 1e-7 → restart at 3e-4
- - **Boosted restart**: 3e-4 → 1e-7 → restart at 3.75e-04 (1.25x!)
 - Creates **wider exploration curves** to escape solidified local minima
 - Each restart provides progressively stronger exploration boost
 
 
 ### Restart Schedule
 ```
- Epochs 0-40: LR: 0.0003 → 1e-07 (first cycle)
- Epoch 40: LR: RESTART to 0.00037499999999999995 🔄
- Epochs 40-100.0: LR: 0.00037499999999999995 → 1e-07 (longer cycle)
 ...
 ```
 
 ## Current Run
 
- **Latest**: `cifar100_consciousness_ADAMW_WarmRestart_boost1.25x_20251122_024834`
 - **Dataset**: CIFAR100
 - **Fusion Mode**: consciousness
 - **Optimizer**: AdamW (adaptive moments)
@@ -92,4 +92,4 @@ model.load_state_dict(state_dict)
 
 **Repository maintained by**: [@AbstractPhil](https://huggingface.co/AbstractPhil)
 
- **Latest update**: 2025-11-22 02:48:37
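The schedule the diff describes (cosine drop to a floor, restart at a 1.25x-boosted peak, each cycle 1.5x longer) can be sketched as a standalone LR function. This is a minimal illustration rather than the repository's code: `sgdr_lr` and its defaults are assumptions based on the post-commit values (base LR 1e-4), and a custom schedule is implied here because PyTorch's stock `CosineAnnealingWarmRestarts` supports neither fractional cycle multipliers nor a peak-LR boost at restart.

```python
import math

def sgdr_lr(epoch, base_lr=1e-4, min_lr=1e-7, t0=40,
            cycle_mult=1.5, restart_lr_mult=1.25):
    """Cosine-annealed LR with warm restarts (SGDR-style).

    Each restart multiplies the peak LR by restart_lr_mult and
    stretches the next cycle length by cycle_mult.
    """
    cycle_len, peak, t = float(t0), base_lr, float(epoch)
    while t >= cycle_len:           # skip past completed cycles
        t -= cycle_len
        cycle_len *= cycle_mult     # next cycle is 1.5x longer
        peak *= restart_lr_mult     # boosted restart peak (1.25x)
    # cosine decay from the current peak down to min_lr within the cycle
    return min_lr + 0.5 * (peak - min_lr) * (1 + math.cos(math.pi * t / cycle_len))
```

With these assumed defaults the function reproduces the documented behavior: the peak at epoch 0 is 1e-4, the LR has decayed to near 1e-7 by the end of the first 40-epoch cycle, and epoch 40 restarts at 1.25e-4.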
 
 ## Training Strategy: AdamW + Warm Restarts
 
 This model uses **AdamW with Cosine Annealing Warm Restarts** (SGDR):
+ - **Drop phase**: LR decays from 0.0001 → 1e-07 over 40 epochs
+ - **Restart phase**: LR jumps back to 0.0001 to explore new regions
 - **Cycle multiplier**: Each cycle is 1.5x longer than previous
 - **Benefits**: Automatic exploration + exploitation, finds better minima, robust training
 
 ### 🚀 LR Boost at Restarts (NEW!)
 This run uses **restart_lr_mult = 1.25x**:
 - Normal restart: 3e-4 → 1e-7 → restart at 3e-4
+ - **Boosted restart**: 3e-4 → 1e-7 → restart at 1.25e-04 (1.25x!)
 - Creates **wider exploration curves** to escape solidified local minima
 - Each restart provides progressively stronger exploration boost
 
 
 ### Restart Schedule
 ```
+ Epochs 0-40: LR: 0.0001 → 1e-07 (first cycle)
+ Epoch 40: LR: RESTART to 0.000125 🔄
+ Epochs 40-100.0: LR: 0.000125 → 1e-07 (longer cycle)
 ...
 ```
 
 ## Current Run
 
+ **Latest**: `cifar100_consciousness_ADAMW_WarmRestart_boost1.25x_20251122_024915`
 - **Dataset**: CIFAR100
 - **Fusion Mode**: consciousness
 - **Optimizer**: AdamW (adaptive moments)
 
 
 **Repository maintained by**: [@AbstractPhil](https://huggingface.co/AbstractPhil)
 
+ **Latest update**: 2025-11-22 02:49:18
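As a quick arithmetic check of the post-commit numbers above (base LR 1e-4, 1.25x restart boost, 1.5x cycle stretch), the cycle boundaries and peak LRs can be tabulated. This is a sketch under those assumed parameters; the variable names are illustrative, not taken from the repository:

```python
base_lr, restart_mult, cycle_mult = 1e-4, 1.25, 1.5

start, length, peak = 0, 40, base_lr   # first cycle: epochs 0-40
for _ in range(3):
    print(f"epochs {start}-{start + length}: peak LR {peak:g}")
    start += length           # next cycle begins where this one ends
    length *= cycle_mult      # each cycle is 1.5x longer
    peak *= restart_mult      # boosted restart: peak LR grows 1.25x
# epochs 0-40: peak LR 0.0001
# epochs 40-100.0: peak LR 0.000125
# epochs 100.0-190.0: peak LR 0.00015625
```

This agrees with the schedule block in the README: the second cycle spans epochs 40-100.0 and restarts at 0.000125.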