Upload README.md with huggingface_hub
README.md
CHANGED
@@ -23,30 +23,30 @@ This repository contains multiple training runs using Cantor fusion architecture
 ## Training Strategy: AdamW + Warm Restarts
 
 This model uses **AdamW with Cosine Annealing Warm Restarts** (SGDR):
-- **Drop phase**: LR decays from 0.
-- **Restart phase**: LR jumps back to 0.
+- **Drop phase**: LR decays from 0.0001 → 1e-07 over 40 epochs
+- **Restart phase**: LR jumps back to 0.0001 to explore new regions
 - **Cycle multiplier**: Each cycle is 1.5x longer than the previous one
 - **Benefits**: Automatic exploration + exploitation; finds better minima; robust training
 
 ### LR Boost at Restarts (NEW!)
 This run uses **restart_lr_mult = 1.25x**:
 - Normal restart: 3e-4 → 1e-7 → restart at 3e-4
-- **Boosted restart**: 3e-4 → 1e-7 → restart at
+- **Boosted restart**: 3e-4 → 1e-7 → restart at 1.25e-04 (1.25x!)
 - Creates **wider exploration curves** to escape solidified local minima
 - Each restart provides a progressively stronger exploration boost
 
 ### Restart Schedule
 ```
-Epochs 0-40: LR: 0.
-Epoch 40: LR: RESTART to 0.
-Epochs 40-100.0: LR: 0.
+Epochs 0-40:     LR: 0.0001 → 1e-07 (first cycle)
+Epoch 40:        LR: RESTART to 0.000125
+Epochs 40-100.0: LR: 0.000125 → 1e-07 (longer cycle)
 ...
 ```
 
 ## Current Run
 
-**Latest**: `cifar100_consciousness_ADAMW_WarmRestart_boost1.
+**Latest**: `cifar100_consciousness_ADAMW_WarmRestart_boost1.25x_20251122_024915`
 - **Dataset**: CIFAR100
 - **Fusion Mode**: consciousness
 - **Optimizer**: AdamW (adaptive moments)
@@ -92,4 +92,4 @@ model.load_state_dict(state_dict)
 
 **Repository maintained by**: [@AbstractPhil](https://huggingface.co/AbstractPhil)
 
-**Latest update**: 2025-11-22 02:
+**Latest update**: 2025-11-22 02:49:18
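The schedule added in this diff (base LR 0.0001, floor 1e-07, first cycle of 40 epochs, 1.5x cycle multiplier, 1.25x boost at each restart) can be sketched as a small pure-Python function. This is a minimal illustration, not the repository's training code: the name `sgdr_lr` and the closed-form cosine expression are my own, and it assumes the 1.25x boost keeps compounding at every restart, which the README implies ("progressively stronger exploration boost") but only shows for the first restart.

```python
import math

def sgdr_lr(epoch, base_lr=1e-4, eta_min=1e-7, t0=40.0,
            cycle_mult=1.5, restart_lr_mult=1.25):
    """Cosine annealing with warm restarts (SGDR), where each new
    cycle is `cycle_mult` longer and its peak LR is boosted by
    `restart_lr_mult`. Values are taken from this README."""
    cycle_start, cycle_len, peak = 0.0, t0, base_lr
    # Walk forward to the cycle containing `epoch`.
    while epoch >= cycle_start + cycle_len:
        cycle_start += cycle_len
        cycle_len *= cycle_mult        # each cycle is 1.5x longer
        peak *= restart_lr_mult        # boosted peak at each restart
    # Position within the current cycle, in [0, 1).
    t = (epoch - cycle_start) / cycle_len
    return eta_min + 0.5 * (peak - eta_min) * (1.0 + math.cos(math.pi * t))
```

With these defaults the function reproduces the numbers in the schedule block: LR 0.0001 at epoch 0, decay toward 1e-07 by epoch 40, then a restart to 0.000125 for the 60-epoch second cycle ending at epoch 100.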
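As a sanity check on the "1.5x longer cycles, progressively boosted peaks" claim, the restart epochs and peak LRs can be enumerated directly. The helper `restart_points` is hypothetical (not part of this repository); it only compounds the two multipliers quoted in the README.

```python
def restart_points(t0=40.0, cycle_mult=1.5, base_lr=1e-4,
                   restart_lr_mult=1.25, n=4):
    """Return (epoch, peak LR) for the first n warm restarts,
    using the cycle and boost factors quoted in this README."""
    points, boundary, length, peak = [], 0.0, t0, base_lr
    for _ in range(n):
        boundary += length           # restart fires when the cycle ends
        peak *= restart_lr_mult      # 1.25x boost applied at the restart
        points.append((boundary, peak))
        length *= cycle_mult         # next cycle is 1.5x longer
    return points
```

Under these assumptions the restarts land at epochs 40, 100, 190, and 325, with peaks 0.000125, 0.00015625, and so on, matching the epoch-40 restart to 0.000125 and the 40-100 second cycle shown in the schedule.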