Commit b245f96 by AbstractPhil · verified · 1 Parent(s): 7abdacd

Epoch 50: 44.56%

Files changed (1)
  1. README.md +239 -0
README.md ADDED
---
license: mit
tags:
- image-classification
- cifar100
- geometric-learning
- fractal-encoding
- in-training
- no-attention
- no-cross-entropy
datasets:
- cifar100
metrics:
- accuracy
library_name: pytorch
pipeline_tag: image-classification
model-index:
- name: geo-beatrix-resnet18
  results:
  - task:
      type: image-classification
      name: Image Classification
    dataset:
      name: CIFAR-100
      type: cifar100
    metrics:
    - type: accuracy
      value: 44.56
      name: Test Accuracy
      verified: false
---

# geo-beatrix-resnet18

**Geometric Basin Classification for CIFAR-100**

🚧 **Training in Progress** 🚧

Current Status: Epoch 50/200

---

## Current Performance

| Metric | Value |
|--------|-------|
| **Best Test Accuracy** | **44.56%** |
| **Best Epoch** | 50 |
| **Current Train Accuracy** | 44.41% |
| **Current Test Accuracy** | 44.56% |
| **Current α (Cantor param)** | 0.4306 |
| **Total Parameters** | 11,952,641 |
| **Training Time** | 0:07:29 |

### All Training Runs

| Timestamp | Status | Best Epoch | Test Acc | Train Acc | α |
|-----------|--------|------------|----------|-----------|---|
| `20251010_185133` | 🔄 | 50 | **44.56%** | 44.41% | 0.4306 |

### Comparison to State-of-the-Art

| Model | Accuracy | Notes |
|-------|----------|-------|
| **geo-beatrix (this model)** | **44.56%** | 🔄 Training |
| vit-beatrix-dualstream | 66.0% | Vision Transformer + Cross-Entropy |

🎯 **Current target**: Beat vit-beatrix-dualstream (66.0%). This model currently trails it by 21.44 percentage points.

---

## Architecture

- **Base**: ResNet18 (torchvision)
- **Pretrained**: No (trained from scratch)
- **Features**: 512-dim from ResNet18
- **Positional Encoding**: Devil's Staircase (Cantor function, 1883)
- **PE Levels**: 18
- **PE Features/Level**: 100
- **Classification**: Geometric Basin Compatibility (NO cross-entropy)
- **Attention Mechanisms**: NONE
- **Mixing**: Fractal (triadic multi-patch)

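The list above names the Devil's Staircase (the Cantor function) as the positional encoding but does not show how it is computed. As a minimal sketch, and not the model's actual encoder, the classical Cantor function can be approximated from the first `levels` ternary digits of the input. The function name `cantor_function` and the default `levels=18` (borrowed from "PE Levels") are illustrative assumptions. How the model expands this into 100 features per level, and where the learned α enters, is not specified in this README.

```python
def cantor_function(x: float, levels: int = 18) -> float:
    """Approximate the classical Cantor function on [0, 1].

    Illustrative sketch only: reads up to `levels` ternary digits of x,
    stops at the first digit equal to 1, maps digit 2 to binary digit 1,
    and interprets the result as a binary fraction.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    if x == 1.0:
        return 1.0

    value = 0.0   # binary fraction accumulated so far
    weight = 0.5  # value of the current binary digit
    for _ in range(levels):
        x *= 3.0
        digit = int(x)  # current ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:
            # x lies in a removed middle third: the function is flat here
            return value + weight
        value += weight * (digit // 2)  # ternary 2 -> binary 1, 0 -> 0
        weight *= 0.5
    return value


if __name__ == "__main__":
    for point in (0.0, 0.25, 0.5, 0.9, 1.0):
        print(point, cantor_function(point))
```

The flat plateaus (every input in a removed middle third maps to the same value) are what give the staircase its name; truncating at `levels` digits bounds the approximation error by `2 ** -levels`.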
---

## Training Configuration

```json
{
  "model_name": "geo-beatrix-resnet18",
  "model_type": "geometric_basin_classifier",
  "num_classes": 100,
  "batch_size": 512,
  "num_epochs": 200,
  "base_learning_rate": 0.002,
  "weight_decay": 0.05,
  "warmup_epochs": 10,
  "pe_levels": 18,
  "pe_features_per_level": 100,
  "dropout": 0.1,
  "pretrained_resnet": false,
  "a100_optimizations": {
    "mixed_precision": true,
    "torch_compile": false,
    "channels_last": true,
    "gradient_checkpointing": false
  },
  "alphamix": {
    "enabled": true,
    "fractal_mode": true,
    "range": [
      0.3,
      0.7
    ],
    "spatial_ratio": 0.25,
    "curriculum_start": 0.0,
    "curriculum_end": 0.5,
    "fractal_steps": [
      1,
      3
    ],
    "fractal_scales": [
      0.3333333333333333,
      0.1111111111111111,
      0.037037037037037035
    ]
  },
  "architecture": "ResNet18 + Devil's Staircase PE",
  "loss_function": "Geometric Basin Compatibility",
  "cross_entropy": false,
  "attention_mechanisms": false,
  "timestamp": "20251010_185133"
}
```

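The `fractal_scales` values in the configuration look like successive powers of one third, which would fit the "triadic" (base-3) wording used elsewhere in this README. The check below is a small sketch run on a hand-copied fragment of the config. The variable names are illustrative, and reading the scales as `3 ** -k` is an observation about the numbers, not a statement about how the training code derives them.

```python
import json

# Fragment copied from the training configuration above.
config_fragment = json.loads("""
{
  "alphamix": {
    "range": [0.3, 0.7],
    "fractal_scales": [
      0.3333333333333333,
      0.1111111111111111,
      0.037037037037037035
    ]
  }
}
""")

scales = config_fragment["alphamix"]["fractal_scales"]

# Powers of 1/3 for k = 1, 2, 3, ... (one per configured scale).
expected = [3.0 ** -k for k in range(1, len(scales) + 1)]

# Compare with a small tolerance to stay robust to float formatting.
is_triadic = all(abs(s - e) < 1e-12 for s, e in zip(scales, expected))
print("scales are powers of 1/3:", is_triadic)
```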
---

## File Structure

```
├── model.pt                      (BEST overall model - easy access!)
├── model.safetensors             (BEST overall model - easy access!)
├── best_model_info.json          (which epoch/run this came from)
├── runs_history.json             (all training runs and their results)
├── README.md
├── weights/geo-beatrix-resnet18/20251010_185133/
│   ├── model.pt                  (best from this training run)
│   ├── model.safetensors         (best from this training run)
│   ├── config.json
│   ├── training_log.txt
│   └── checkpoints/
│       ├── checkpoint_epoch_50.safetensors
│       ├── checkpoint_epoch_100.safetensors
│       └── checkpoint_epoch_150.safetensors
│       (snapshots every 10 epochs)
└── runs/geo-beatrix-resnet18/20251010_185133/
    ├── events.out.tfevents.*     (TensorBoard logs)
    └── metrics.csv               (training metrics)
```

**Note**: The root `model.pt` and `model.safetensors` always contain the best model across all training runs!

---

## Usage

```python
import json

import torch
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

# EASIEST: Download BEST overall model from root (recommended!)
model_path = hf_hub_download(
    repo_id="AbstractPhil/geo-beatrix-resnet",
    filename="model.safetensors"
)
state_dict = load_file(model_path)
# Build the model with the matching architecture first, then:
# model.load_state_dict(state_dict)

# Check which epoch/run the best model came from
info_path = hf_hub_download(
    repo_id="AbstractPhil/geo-beatrix-resnet",
    filename="best_model_info.json"
)
with open(info_path) as f:
    best_info = json.load(f)
print(f"Best model: epoch {best_info['epoch']}, {best_info['test_accuracy']:.2f}%")

# Or download from specific training run
model_path = hf_hub_download(
    repo_id="AbstractPhil/geo-beatrix-resnet",
    filename="weights/geo-beatrix-resnet18/20251010_185133/model.safetensors"
)

# Download specific epoch checkpoint
# (a checkpoint only exists once training has reached that epoch)
epoch_checkpoint = hf_hub_download(
    repo_id="AbstractPhil/geo-beatrix-resnet",
    filename="weights/geo-beatrix-resnet18/20251010_185133/checkpoints/checkpoint_epoch_100.safetensors"
)
```

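Because training is still in progress, the set of available checkpoints changes over time. The sketch below picks the highest-epoch checkpoint from a list of repository file names, assuming the `checkpoints/checkpoint_epoch_<N>.safetensors` naming shown in the file tree. The helper `latest_checkpoint` is hypothetical and not part of this repository. In practice the list could come from `huggingface_hub.list_repo_files`, left as a comment here so the example runs offline.

```python
import re
from typing import List, Optional

# Matches e.g. ".../checkpoints/checkpoint_epoch_100.safetensors"
CHECKPOINT_RE = re.compile(r"checkpoints/checkpoint_epoch_(\d+)\.safetensors$")


def latest_checkpoint(filenames: List[str]) -> Optional[str]:
    """Return the checkpoint path with the highest epoch, or None if absent."""
    best_epoch, best_name = -1, None
    for name in filenames:
        match = CHECKPOINT_RE.search(name)
        if match is None:
            continue
        epoch = int(match.group(1))  # compare numerically, not as strings
        if epoch > best_epoch:
            best_epoch, best_name = epoch, name
    return best_name


# With network access, the list could be fetched like this:
# from huggingface_hub import list_repo_files
# filenames = list_repo_files("AbstractPhil/geo-beatrix-resnet")
run_dir = "weights/geo-beatrix-resnet18/20251010_185133"
filenames = [
    "model.safetensors",
    f"{run_dir}/checkpoints/checkpoint_epoch_50.safetensors",
    f"{run_dir}/checkpoints/checkpoint_epoch_100.safetensors",
]
print(latest_checkpoint(filenames))
```

Comparing epochs as integers matters here: sorted as plain strings, `checkpoint_epoch_50` would rank after `checkpoint_epoch_100`.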
---

## Training History

### Best Checkpoint
- Epoch: 50
- Train Acc: 44.41%
- Test Acc: 44.56%
- Alpha: 0.4306
- Loss: 1.4445

### Latest 5 Epochs

- **Epoch 46**: Train 44.08%, Test 0.00%, α=0.4274, Loss=1.5477
- **Epoch 47**: Train 45.06%, Test 0.00%, α=0.4317, Loss=1.6100
- **Epoch 48**: Train 44.08%, Test 0.00%, α=0.4306, Loss=1.5218
- **Epoch 49**: Train 45.15%, Test 0.00%, α=0.4319, Loss=1.5274
- **Epoch 50**: Train 44.41%, Test 44.56%, α=0.4306, Loss=1.4445

A test value of 0.00% for epochs 46 to 49 most likely means the test set was not evaluated in those epochs, not that accuracy fell to zero.

### Training Milestones
- 📊 **α ≥ 0.40** reached at epoch 10

---

## Innovation

✅ **NO attention mechanisms**
✅ **NO cross-entropy loss**
✅ **Fractal positional encoding** (Cantor function from 1883)
✅ **Geometric compatibility classification**
✅ **ResNet18 backbone** (proven CNN architecture)
✅ **Triadic fractal mixing** (base-3 aligned)

---

**Repository**: https://huggingface.co/AbstractPhil/geo-beatrix-resnet
**Author**: AbstractPhil
**Framework**: PyTorch