AbstractPhil committed
Commit 2482a2f · verified · 1 Parent(s): f5e3e4c

Upload runs/cifar100_consciousness_ADAMW_WarmRestart_20251120_030614/README.md with huggingface_hub

runs/cifar100_consciousness_ADAMW_WarmRestart_20251120_030614/README.md ADDED
@@ -0,0 +1,80 @@
# Run: cifar100_consciousness_ADAMW_WarmRestart_20251120_030614

## Configuration
- **Dataset**: CIFAR100
- **Fusion Mode**: consciousness
- **Parameters**: 42,341,608
- **Simplex**: 4-simplex (5 vertices)
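
For readers unfamiliar with the term, a 4-simplex (also called a pentachoron) is the 4-dimensional analogue of a tetrahedron. The sketch below is not taken from this repository's code. It only applies the standard combinatorial fact that a k-simplex has k + 1 vertices and C(k + 1, m + 1) faces of dimension m. The helper name `simplex_face_counts` is hypothetical.

```python
from math import comb


def simplex_face_counts(k: int) -> dict[int, int]:
    """Return {m: number of m-dimensional faces} for a k-simplex."""
    # Every choice of m + 1 of the k + 1 vertices spans exactly one m-face.
    return {m: comb(k + 1, m + 1) for m in range(k + 1)}


# Keys are face dimensions: 0 = vertices, 1 = edges, 2 = triangles,
# 3 = tetrahedral cells, 4 = the 4-simplex itself.
counts = simplex_face_counts(4)
print(counts)
```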

## Performance
- **Best Validation Accuracy**: 45.96%
- **Training Time**: 6.1 hours
- **Final Epoch**: 200

## Training Setup: AdamW + Warm Restarts
- **Optimizer**: AdamW (lr=0.0001, wd=0.05)
- **Scheduler**: CosineAnnealingWarmRestarts
- **Restart Period (T_0)**: 20 epochs
- **Cycle Multiplier (T_mult)**: 1x
- **Min LR**: 1e-07
- **Batch Size**: 256
- **Mixed Precision**: False
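
The settings above map onto PyTorch's optimizer and scheduler classes roughly as follows. This is a minimal sketch, not the training script used for this run. The `torch.nn.Linear` stand-in model, the omitted training loop, and the once-per-epoch `scheduler.step()` call are all assumptions, since this README does not say how often the scheduler is stepped.

```python
import torch

# Stand-in model (assumption); the real architecture is described by config.yaml.
model = torch.nn.Linear(8, 100)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=0.05)
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(
    optimizer, T_0=20, T_mult=1, eta_min=1e-7
)

lrs = []
for epoch in range(40):  # two full cycles, for illustration only
    lrs.append(scheduler.get_last_lr()[0])  # LR in effect during this epoch
    # ... one epoch of training would go here (not shown) ...
    optimizer.step()  # placeholder so the scheduler is stepped after the optimizer
    scheduler.step()  # assumed: one scheduler step per epoch

print(f"epoch 0: {lrs[0]:.2e}, epoch 19: {lrs[19]:.2e}, epoch 20: {lrs[20]:.2e}")
```

With `T_mult=1` every cycle keeps the same 20-epoch length, so the LR returns to its maximum at epochs 20, 40, and so on.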

### Learning Rate Schedule
```
Cycle 1: Epochs 0-19
  LR: 0.0001 → 1e-07 (drop)
  Expected: Convergence to local minimum

Epoch 20: RESTART 🔄
  LR: 1e-07 → 0.0001 (jump!)
  Expected: Escape local minimum, explore new regions

Cycle 2: Epochs 20-39
  LR: 0.0001 → 1e-07 (same 20-epoch length, since T_mult = 1)
  Expected: Deeper convergence

... and so on, with a restart every 20 epochs
```
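
The shape of each cycle follows the cosine annealing rule `lr = min_lr + 0.5 * (base_lr - min_lr) * (1 + cos(pi * T_cur / T_0))`, where `T_cur` counts epochs since the last restart. The dependency-free sketch below evaluates that rule with this run's values. `lr_at` is a hypothetical helper, and the code assumes one scheduler step per epoch and equal-length cycles (T_mult = 1).

```python
import math

BASE_LR, MIN_LR, T_0 = 1e-4, 1e-7, 20  # values from the training setup


def lr_at(epoch: int) -> float:
    """Learning rate in effect at `epoch`, for fixed-length cycles (T_mult = 1)."""
    t_cur = epoch % T_0  # epochs elapsed since the last restart
    cosine = 0.5 * (1.0 + math.cos(math.pi * t_cur / T_0))
    return MIN_LR + (BASE_LR - MIN_LR) * cosine


for epoch in (0, 10, 19, 20):
    print(epoch, f"{lr_at(epoch):.3e}")
```

Under the per-epoch assumption, the LR at the last epoch of a cycle stays slightly above the 1e-07 floor, because the cosine term only reaches zero at the restart itself.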

## Files
- `runs/cifar100_consciousness_ADAMW_WarmRestart_20251120_030614/checkpoints/best_model.safetensors` - Model weights
- `runs/cifar100_consciousness_ADAMW_WarmRestart_20251120_030614/checkpoints/best_training_state.pt` - Optimizer state
- `runs/cifar100_consciousness_ADAMW_WarmRestart_20251120_030614/config.yaml` - Full configuration
- `runs/cifar100_consciousness_ADAMW_WarmRestart_20251120_030614/tensorboard/` - TensorBoard logs (LR tracking!)
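
## Usage
```python
from safetensors.torch import load_file
from huggingface_hub import hf_hub_download

# Download the best checkpoint from the Hub (cached locally after the first call).
model_path = hf_hub_download(
    repo_id="AbstractPhil/vit-beans-v3",
    filename="runs/cifar100_consciousness_ADAMW_WarmRestart_20251120_030614/checkpoints/best_model.safetensors",
)

# Load the weights as a dict mapping parameter names to tensors.
state_dict = load_file(model_path)

# `model` is not defined in this snippet: it must already be an instance of the
# architecture used for this run, built from the settings in config.yaml.
model.load_state_dict(state_dict)
```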

## Training Notes

**Warm Restarts Benefits:**
- 🔄 **Exploration**: Periodic LR jumps can help the optimizer escape local minima
- 📉 **Exploitation**: Each 20-epoch drop phase lets the model converge at a low LR
- 🎯 **Robustness**: Multiple restarts give several chances to find a better solution
- 📊 **Monitoring**: Watch TensorBoard for restart effects!

**Expected Behavior:**
- Accuracy improves during each drop phase
- Brief accuracy dips after restarts (exploration)
- Overall upward trend across cycles
- Best models are often found late in a cycle, when the LR is lowest
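
To line these expectations up with the TensorBoard curves, it helps to know where the restarts fall. The sketch below assumes 200 epochs numbered from 0 and one scheduler step per epoch, neither of which this README states explicitly. `restart_epochs` is a hypothetical helper.

```python
TOTAL_EPOCHS, T_0 = 200, 20  # final epoch and restart period from this run


def restart_epochs(total_epochs: int, t_0: int) -> list[int]:
    """Epochs at which the LR jumps back to its maximum (T_mult = 1)."""
    return list(range(t_0, total_epochs, t_0))


restarts = restart_epochs(TOTAL_EPOCHS, T_0)
# Last epoch of each cycle, where the LR is lowest and dips have had time to recover.
cycle_ends = [epoch - 1 for epoch in restarts] + [TOTAL_EPOCHS - 1]

print("restarts:  ", restarts)
print("cycle ends:", cycle_ends)
```

Accuracy dips would be expected just after the epochs in `restarts`, and the strongest checkpoints near the epochs in `cycle_ends`.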

---

Built with geometric consciousness-aware routing using the Devil's Staircase (Beatrix) and pentachoron parameterization.

**Training completed**: 2025-11-20 09:15:30

[← Back to main repository](https://huggingface.co/AbstractPhil/vit-beans-v3)