Upload README.md with huggingface_hub
README.md
CHANGED
@@ -23,30 +23,30 @@ This repository contains multiple training runs using Cantor fusion architecture
 ## Training Strategy: AdamW + Warm Restarts
 
 This model uses **AdamW with Cosine Annealing Warm Restarts** (SGDR):
-- **Drop phase**: LR decays from 0.
-- **Restart phase**: LR jumps back to 0.
+- **Drop phase**: LR decays from 0.0001 → 1e-07 over 40 epochs
+- **Restart phase**: LR jumps back to 0.0001 to explore new regions
 - **Cycle multiplier**: Each cycle is 1.5x longer than the previous one
 - **Benefits**: Automatic exploration + exploitation; finds better minima; robust training
 
 ### LR Boost at Restarts (NEW!)
 This run uses **restart_lr_mult = 1.25x**:
 - Normal restart: 3e-4 → 1e-7 → restart at 3e-4
-- **Boosted restart**: 3e-4 → 1e-7 → restart at
+- **Boosted restart**: 3e-4 → 1e-7 → restart at 1.25e-04 (1.25x!)
 - Creates **wider exploration curves** to escape solidified local minima
 - Each restart provides a progressively stronger exploration boost
 
 ### Restart Schedule
 ```
-Epochs 0-40: LR: 0.
-Epoch 40: LR: RESTART to 0.
-Epochs 40-100.0: LR: 0.
+Epochs 0-40:     LR: 0.0001 → 1e-07 (first cycle)
+Epoch 40:        LR: RESTART to 0.000125
+Epochs 40-100.0: LR: 0.000125 → 1e-07 (longer cycle)
 ...
 ```
 
 ## Current Run
 
-**Latest**: `cifar100_consciousness_ADAMW_WarmRestart_boost1.
+**Latest**: `cifar100_consciousness_ADAMW_WarmRestart_boost1.25x_20251122_024915`
 - **Dataset**: CIFAR100
 - **Fusion Mode**: consciousness
 - **Optimizer**: AdamW (adaptive moments)
@@ -92,4 +92,4 @@ model.load_state_dict(state_dict)
 
 **Repository maintained by**: [@AbstractPhil](https://huggingface.co/AbstractPhil)
 
-**Latest update**: 2025-11-22 02:
+**Latest update**: 2025-11-22 02:49:18
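The schedule added in this diff (base LR 0.0001, floor 1e-07, first cycle of 40 epochs, 1.5x cycle multiplier, 1.25x boost at each restart) can be sketched as a small pure-Python function. This is a minimal illustration, not the repository's training code: the name `sgdr_lr` and the closed-form cosine expression are my own, and it assumes the 1.25x boost keeps compounding at every restart, which the README implies ("progressively stronger exploration boost") but only shows for the first restart.

```python
import math

def sgdr_lr(epoch, base_lr=1e-4, eta_min=1e-7, t0=40.0,
            cycle_mult=1.5, restart_lr_mult=1.25):
    """Cosine annealing with warm restarts (SGDR), where each new
    cycle is `cycle_mult` longer and its peak LR is boosted by
    `restart_lr_mult`. Values are taken from this README."""
    cycle_start, cycle_len, peak = 0.0, t0, base_lr
    # Walk forward to the cycle containing `epoch`.
    while epoch >= cycle_start + cycle_len:
        cycle_start += cycle_len
        cycle_len *= cycle_mult        # each cycle is 1.5x longer
        peak *= restart_lr_mult        # boosted peak at each restart
    # Position within the current cycle, in [0, 1).
    t = (epoch - cycle_start) / cycle_len
    return eta_min + 0.5 * (peak - eta_min) * (1.0 + math.cos(math.pi * t))
```

With these defaults the function reproduces the numbers in the schedule block: LR 0.0001 at epoch 0, decay toward 1e-07 by epoch 40, then a restart to 0.000125 for the 60-epoch second cycle ending at epoch 100.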
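As a sanity check on the "1.5x longer cycles, progressively boosted peaks" claim, the restart epochs and peak LRs can be enumerated directly. The helper `restart_points` is hypothetical (not part of this repository); it only compounds the two multipliers quoted in the README.

```python
def restart_points(t0=40.0, cycle_mult=1.5, base_lr=1e-4,
                   restart_lr_mult=1.25, n=4):
    """Return (epoch, peak LR) for the first n warm restarts,
    using the cycle and boost factors quoted in this README."""
    points, boundary, length, peak = [], 0.0, t0, base_lr
    for _ in range(n):
        boundary += length           # restart fires when the cycle ends
        peak *= restart_lr_mult      # 1.25x boost applied at the restart
        points.append((boundary, peak))
        length *= cycle_mult         # next cycle is 1.5x longer
    return points
```

Under these assumptions the restarts land at epochs 40, 100, 190, and 325, with peaks 0.000125, 0.00015625, and so on, matching the epoch-40 restart to 0.000125 and the 40-100 second cycle shown in the schedule.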