AbstractPhil committed · Commit c7f5e93 · verified · 1 Parent(s): ba8e071

Upload runs/cifar100_consciousness_ADAMW_WarmRestart_boost1.2x_20251123_063737/README.md with huggingface_hub

runs/cifar100_consciousness_ADAMW_WarmRestart_boost1.2x_20251123_063737/README.md ADDED
@@ -0,0 +1,102 @@
+ # Run: cifar100_consciousness_ADAMW_WarmRestart_boost1.2x_20251123_063737
+
+ ## Configuration
+ - **Dataset**: CIFAR100
+ - **Fusion Mode**: consciousness
+ - **Parameters**: 7,207,237
+ - **Simplex**: 8-simplex (9 vertices)
+
+ ## Performance
+ - **Best Validation Accuracy**: 64.34%
+ - **Training Time**: 3.5 hours
+ - **Final Epoch**: 200
+
+ ## Training Setup: AdamW + Warm Restarts
+ - **Optimizer**: AdamW (lr=0.0003, wd=0.05)
+ - **Scheduler**: CosineAnnealingWarmRestarts
+ - **Restart Period (T_0)**: 12 epochs
+ - **Cycle Multiplier (T_mult)**: 1.75x
+ - **Restart LR Mult**: 1.2x 🚀
+ - **Min LR**: 1e-07
+ - **Batch Size**: 512
+ - **Mixed Precision**: False
+
+ ### 🚀 LR Boost Feature
+
+ This run uses **restart_lr_mult = 1.2x** for aggressive exploration:
+
+ **How it works:**
+ ```
+ Cycle 1: 3.00e-04 → 1.00e-07 (standard convergence)
+ Restart: → 3.60e-04 (BOOSTED!)
+ Cycle 2: 3.60e-04 → 1.00e-07 (wider exploration)
+ Restart: → 4.32e-04 (EVEN MORE BOOSTED!)
+ Cycle 3: 4.32e-04 → 1.00e-07
+ ...
+ ```
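
The peak values in the listing above are consistent with the peak LR being multiplied by `restart_lr_mult` at every restart. A minimal sketch of that arithmetic, assuming simple compounding; `peak_lrs` is a hypothetical helper written for illustration, not a function from the training code:

```python
def peak_lrs(base_lr: float, restart_lr_mult: float, num_cycles: int) -> list[float]:
    """Peak LR at the start of each cycle, assuming the peak is
    multiplied by restart_lr_mult at every restart (compounding)."""
    return [base_lr * restart_lr_mult ** k for k in range(num_cycles)]


# Values taken from this run's configuration: lr=0.0003, Restart LR Mult=1.2x.
for cycle, lr in enumerate(peak_lrs(3e-4, 1.2, 3), start=1):
    print(f"Cycle {cycle}: peak LR = {lr:.2e}")
```

Because the boost compounds, the peak grows geometrically with the number of restarts, so long runs with many cycles may reach peaks well above the initial learning rate.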
+
+ **Benefits:**
+ - 🔓 **Escape solidified local minima** with aggressive LR spikes
+ - 🌊 **Wider exploration curves** after each restart
+ - 💪 **Progressively stronger exploration** as training proceeds
+ - 🎯 **Combat training plateaus** that plague long runs
+
+
+ ### Learning Rate Schedule
+ ```
+ Cycle 1: Epochs 0-12
+   LR: 0.0003 → 1e-07 (drop)
+   Expected: Convergence to local minimum
+
+ Epoch 12: RESTART 🔄
+   LR: 1e-07 → 0.00036 (jump!)
+   Expected: Escape local minimum, explore new regions
+
+ Cycle 2: Epochs 12-33
+   LR: 0.00036 → 1e-07 (longer cycle)
+   Expected: Deeper convergence
+
+ ... and so on
+ ```
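
The schedule above can be reproduced with a small closed-form sketch. PyTorch's built-in `torch.optim.lr_scheduler.CosineAnnealingWarmRestarts` accepts only an integer `T_mult` and has no restart multiplier, so this run presumably relies on a custom scheduler; the function below is a hypothetical illustration of the described behavior, not the code that was used. It assumes standard cosine annealing within each cycle, unrounded cycle lengths (12, 21, 36.75, ...), and a peak that compounds by `restart_lr_mult` at each restart.

```python
import math


def boosted_restart_lr(
    epoch: float,
    base_lr: float = 3e-4,
    min_lr: float = 1e-7,
    t_0: float = 12.0,
    t_mult: float = 1.75,
    restart_lr_mult: float = 1.2,
) -> float:
    """LR at a (possibly fractional) epoch for cosine annealing with
    warm restarts, growing cycle lengths, and a boosted peak per restart."""
    cycle_start, cycle_len, peak = 0.0, t_0, base_lr
    # Walk forward through completed cycles to find the one containing `epoch`.
    while epoch >= cycle_start + cycle_len:
        cycle_start += cycle_len
        cycle_len *= t_mult        # each cycle is t_mult times longer
        peak *= restart_lr_mult    # each restart raises the peak LR
    progress = (epoch - cycle_start) / cycle_len  # 0.0 at restart, approaches 1.0
    return min_lr + 0.5 * (peak - min_lr) * (1.0 + math.cos(math.pi * progress))


# Epochs 12 and 33 are the first two restart points for T_0=12, T_mult=1.75.
for e in (0, 6, 12, 33):
    print(f"epoch {e}: lr = {boosted_restart_lr(e):.3e}")
```

Whether the actual scheduler steps per epoch or per batch is not stated in this README; the sketch takes a fractional epoch so either granularity can be evaluated.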
+
+ ## Files
+ - `runs/cifar100_consciousness_ADAMW_WarmRestart_boost1.2x_20251123_063737/checkpoints/best_model.safetensors` - Model weights
+ - `runs/cifar100_consciousness_ADAMW_WarmRestart_boost1.2x_20251123_063737/checkpoints/best_training_state.pt` - Optimizer state
+ - `runs/cifar100_consciousness_ADAMW_WarmRestart_boost1.2x_20251123_063737/config.yaml` - Full configuration
+ - `runs/cifar100_consciousness_ADAMW_WarmRestart_boost1.2x_20251123_063737/tensorboard/` - TensorBoard logs (LR tracking!)
+
+ ## Usage
+ ```python
+ from safetensors.torch import load_file
+ from huggingface_hub import hf_hub_download
+
+ # Download the best checkpoint from the Hub (cached locally after the first call).
+ model_path = hf_hub_download(
+     repo_id="AbstractPhil/vit-beans-v3",
+     filename="runs/cifar100_consciousness_ADAMW_WarmRestart_boost1.2x_20251123_063737/checkpoints/best_model.safetensors",
+ )
+
+ # `model` is not defined in this snippet: instantiate the matching architecture first.
+ state_dict = load_file(model_path)
+ model.load_state_dict(state_dict)
+ ```
+
+ ## Training Notes
+
+ **Warm Restarts Benefits:**
+ - 🔄 **Exploration**: Periodic LR jumps escape local minima
+ - 📉 **Exploitation**: Long drop phases converge deeply
+ - 🎯 **Robustness**: Multiple restarts find better solutions
+ - 📊 **Monitoring**: Watch TensorBoard for restart effects!
+
+ **Expected Behavior:**
+ - Accuracy improves during each drop phase
+ - Brief accuracy dips after restarts (exploration)
+ - Overall upward trend across cycles
+ - Best models often found late in long cycles
+
+ ---
+
+ Built with geometric consciousness-aware routing using the Devil's Staircase (Beatrix) and pentachoron parameterization.
+
+ **Training completed**: 2025-11-23 10:05:21
+
+ [← Back to main repository](https://huggingface.co/AbstractPhil/vit-beans-v3)