AbstractPhil committed (verified)
Commit c87a2c3
1 Parent(s): 5840165

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +26 -64
README.md CHANGED
@@ -5,6 +5,7 @@ tags:
  - geometric-deep-learning
  - safetensors
  - vision-transformer
  library_name: pytorch
  datasets:
  - cifar10
@@ -15,110 +16,71 @@ metrics:

  # vit-beans-v3

- **Geometric Deep Learning with Cantor Multihead Fusion**

- This repository contains multiple training runs using Cantor fusion architecture with pentachoron structures and geometric routing. All models use SafeTensors format for security.

- ## Repository Structure
  ```
- vit-beans-v3/
- ├── runs/
- │   ├── cifar10_weighted_SGD_TIMESTAMP/
- │   │   ├── checkpoints/
- │   │   │   ├── best_model.safetensors
- │   │   │   ├── best_training_state.pt
- │   │   │   └── best_metadata.json
- │   │   ├── tensorboard/
- │   │   ├── config.yaml
- │   │   └── README.md
- │   └── ...
- └── README.md (this file)
  ```
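The tree above shows a `best_metadata.json` alongside each checkpoint. Assuming it is a plain JSON object (its exact keys are not documented in this README), it can be inspected with the Python standard library alone; a minimal sketch:

```python
import json

def read_run_metadata(path):
    """Load a run's best_metadata.json, assumed to be a plain JSON
    object (the exact keys are not documented in this README)."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)

# Hypothetical usage against a downloaded run directory:
# meta = read_run_metadata("runs/.../checkpoints/best_metadata.json")
# print(sorted(meta))
```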

  ## Current Run

- **Latest**: `cifar100_consciousness_SGD_20251119_182136`
  - **Dataset**: CIFAR100
- - **Fusion Mode**: consciousness
- - **Optimizer**: SGD (momentum=0.9)
- - **Scheduler**: MultiStepLR [60, 80]
- - **Architecture**: 6 blocks, 16 heads
- - **Simplex**: 7-simplex (8 vertices)

  ## Architecture

  The Cantor Fusion architecture uses:
  - **Geometric Routing**: Pentachoron (4-simplex) structures for token routing
  - **Cantor Multihead Fusion**: Multiple fusion heads with geometric attention
- - **Beatrix Consciousness Routing**: Optional consciousness-aware token fusion using the Devil's Staircase
- - **SafeTensors Format**: All model weights use SafeTensors (not pickle) for security
-
- ## Training Strategy
-
- This model uses the proven **SGD + milestone LR drops** strategy from WideResNet:
- - Initial LR: 0.01
- - Milestones: [60, 80]
- - Decay factor: 0.2 (LR *= 0.2 at each milestone)
- - This causes the dramatic accuracy jumps seen in deep networks!
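The milestone arithmetic in the (now-removed) strategy above amounts to multiplying the LR by 0.2 once for every milestone already passed. A plain-Python sketch mirroring PyTorch's `MultiStepLR` semantics with the listed values (illustration only, not the training code):

```python
def multistep_lr(epoch, base_lr=0.01, milestones=(60, 80), gamma=0.2):
    """LR after `epoch` epochs under milestone decay: multiply base_lr
    by gamma once per milestone already passed (MultiStepLR semantics,
    using the values listed above)."""
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** passed

# The LR stays at 0.01 until epoch 60, drops to 0.002 there,
# and to 0.0004 at epoch 80 -- the source of the accuracy jumps.
for e in (0, 59, 60, 79, 80, 90):
    print(e, multistep_lr(e))
```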

  ## Usage
-
- ### Download a Model
  ```python
  from huggingface_hub import hf_hub_download
  from safetensors.torch import load_file
- import torch

- # Download model weights
  model_path = hf_hub_download(
      repo_id="AbstractPhil/vit-beans-v3",
      filename="runs/YOUR_RUN_NAME/checkpoints/best_model.safetensors"
  )

- # Load weights (SafeTensors - no pickle!)
  state_dict = load_file(model_path)
  model.load_state_dict(state_dict)
  ```

- ### Browse Runs
-
- Each run directory contains:
- - `checkpoints/` - Model weights (safetensors), training state, metadata
- - `tensorboard/` - TensorBoard logs for visualization
- - `config.yaml` - Complete training configuration
- - `README.md` - Run-specific details and results
-
- ## Model Variants
-
- - **Weighted Fusion**: Standard geometric fusion with learned weights
- - **Consciousness Fusion**: Uses Beatrix routing with consciousness emergence
-
  ## Citation
  ```bibtex
  @misc{vit_beans_v3,
    author = {AbstractPhil},
-   title = {vit-beans-v3: Geometric Deep Learning with Cantor Fusion},
    year = {2025},
    publisher = {HuggingFace},
    url = {https://huggingface.co/AbstractPhil/vit-beans-v3}
  }
  ```

- ## Training Details
-
- Optimizer options:
- - **SGD**: High momentum (0.9), Nesterov, milestone-based LR drops
- - **AdamW**: Weight decay, cosine annealing with warmup
-
- All models trained with:
- - Mixed Precision: Available on A100
- - Augmentation: AutoAugment (CIFAR10 policy)
- - Format: SafeTensors (ClamAV safe)
-
- Built with geometric consciousness-aware routing using the Devil's Staircase (Beatrix) and pentachoron parameterization.
-
  ---

  **Repository maintained by**: [@AbstractPhil](https://huggingface.co/AbstractPhil)

- **Latest update**: 2025-11-19 18:21:39
 
  - geometric-deep-learning
  - safetensors
  - vision-transformer
+ - warm-restarts
  library_name: pytorch
  datasets:
  - cifar10

  # vit-beans-v3

+ **Geometric Deep Learning with Cantor Multihead Fusion + AdamW Warm Restarts**

+ This repository contains multiple training runs using the Cantor fusion architecture with pentachoron structures, geometric routing, and **CosineAnnealingWarmRestarts** for automatic exploration cycles.

+ ## Training Strategy: AdamW + Warm Restarts
+
+ This model uses **AdamW with Cosine Annealing Warm Restarts** (SGDR):
+ - **Drop phase**: LR decays from 0.0003 to 1e-07 over the first 20 epochs
+ - **Restart phase**: LR jumps back to 0.0003 to explore new regions of the loss surface
+ - **Cycle multiplier**: Each cycle is twice as long as the previous one
+ - **Benefits**: Automatic alternation between exploration and exploitation, which can find better minima and makes training more robust
+
+ ### Restart Schedule
  ```
+ Epochs 0-20:  LR 0.0003 → 1e-07 (first cycle)
+ Epoch 20:     LR restarts to 0.0003 🔄
+ Epochs 20-60: LR 0.0003 → 1e-07 (longer cycle)
+ ...
  ```
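The drop/restart arithmetic above can be sketched in plain Python. This mirrors the closed-form cosine schedule of SGDR (the formula behind PyTorch's `CosineAnnealingWarmRestarts`), plugged with the values listed in this README (base LR 0.0003, floor 1e-07, first cycle 20 epochs, each cycle twice as long); it is an illustration, not the training code:

```python
import math

def sgdr_lr(epoch, base_lr=3e-4, eta_min=1e-7, t0=20, t_mult=2):
    """Cosine-annealed LR with warm restarts (SGDR): locate the current
    cycle, then anneal from base_lr down to eta_min within it."""
    t_i, t_cur = t0, epoch
    while t_cur >= t_i:      # step past completed cycles
        t_cur -= t_i
        t_i *= t_mult        # each cycle is t_mult times longer
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * t_cur / t_i))

# LR starts at 0.0003, decays toward 1e-07 by epoch 20, then restarts
# to 0.0003 for a 40-epoch cycle (epochs 20-60), and so on.
for e in (0, 10, 19, 20, 40, 60):
    print(e, sgdr_lr(e))
```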

  ## Current Run

+ **Latest**: `cifar100_weighted_ADAMW_WarmRestart_20251119_200210`
  - **Dataset**: CIFAR100
+ - **Fusion Mode**: weighted
+ - **Optimizer**: AdamW (adaptive moments)
+ - **Scheduler**: CosineAnnealingWarmRestarts
+ - **Architecture**: 6 blocks, 8 heads
+ - **Simplex**: 4-simplex (5 vertices)

  ## Architecture

  The Cantor Fusion architecture uses:
  - **Geometric Routing**: Pentachoron (4-simplex) structures for token routing
  - **Cantor Multihead Fusion**: Multiple fusion heads with geometric attention
+ - **Beatrix Consciousness Routing**: Optional consciousness-aware token fusion
+ - **SafeTensors Format**: All model weights use SafeTensors (not pickle)

  ## Usage
  ```python
  from huggingface_hub import hf_hub_download
  from safetensors.torch import load_file

  model_path = hf_hub_download(
      repo_id="AbstractPhil/vit-beans-v3",
      filename="runs/YOUR_RUN_NAME/checkpoints/best_model.safetensors"
  )

  state_dict = load_file(model_path)
  model.load_state_dict(state_dict)
  ```
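As a quick integrity check that a downloaded checkpoint really is SafeTensors (and not a pickle), the file's header can be parsed with the standard library alone. This sketch relies only on the published safetensors file layout (an 8-byte little-endian header length followed by a JSON table of tensors); `model_path` is the path from the snippet above:

```python
import json
import struct

def read_safetensors_header(path):
    """Parse just the JSON header of a .safetensors file: the format
    starts with an 8-byte little-endian header length, followed by a
    JSON table mapping tensor names to dtype/shape/offsets."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))

# Hypothetical usage: list tensor names without loading any weights.
# header = read_safetensors_header(model_path)
# print([k for k in header if k != "__metadata__"])
```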

  ## Citation
  ```bibtex
  @misc{vit_beans_v3,
    author = {AbstractPhil},
+   title = {vit-beans-v3: Geometric Deep Learning with Warm Restarts},
    year = {2025},
    publisher = {HuggingFace},
    url = {https://huggingface.co/AbstractPhil/vit-beans-v3}
  }
  ```

  ---

  **Repository maintained by**: [@AbstractPhil](https://huggingface.co/AbstractPhil)

+ **Latest update**: 2025-11-19 20:02:13