AbstractPhil committed on
Commit bf38616 · verified · 1 Parent(s): 56298ef

Upload weights - GeoFractalDavid-Basin-k50 - Run 20251016_011725 - Acc 67.78%
weights/GeoFractalDavid-Basin-k50/20251016_011725/README.md ADDED
---
language: en
license: mit
tags:
- image-classification
- imagenet
- geometric-basin
- cantor-coherence
- multi-scale
- geofractaldavid
datasets:
- imagenet-1k
metrics:
- accuracy
library_name: pytorch
model-index:
- name: GeoFractalDavid-Basin-k50
  results:
  - task:
      type: image-classification
    dataset:
      name: ImageNet-1K
      type: imagenet-1k
    metrics:
    - type: accuracy
      value: 67.78
      name: Validation Accuracy
---

# GeoFractalDavid-Basin-k50: Geometric Basin Classification

**GeoFractalDavid** achieves classification through geometric compatibility rather than cross-entropy over arbitrary linear weights.
Features must "fit" geometric signatures: k-simplex shapes, Cantor positions, and hierarchical structure.

## 🎯 Performance

- **Best Validation Accuracy**: 67.78%
- **Epoch**: 2/10
- **Training Time**: 4m

### Per-Scale Performance

- **Scale 448D**: 65.68%
- **Scale 512D**: 65.72%
- **Scale 576D**: 66.88%
- **Scale 640D**: 65.49%
- **Scale 704D**: 66.07%
- **Scale 768D**: 65.25%

## 🏗️ Architecture

**Model Type**: Multi-scale geometric basin classifier

**Core Components**:
- **Feature Dimension**: 512
- **Number of Classes**: 1000
- **k-Simplex Structure**: k=50 (51 vertices per class)
- **Scales**: [448, 512, 576, 640, 704, 768]
- **Total Simplex Vertices**: 51,000

**Geometric Components**:
1. **Feature Similarity**: Cosine similarity to k-simplex centroids
2. **Cantor Coherence**: Distance to learned Cantor prototypes (alpha-normalized)
3. **Crystal Geometry**: Distance to nearest simplex vertex

Each scale learns to weight these components differently.
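Putting the three components together: a minimal sketch of how a per-scale compatibility score could be formed from these pieces. All names here (`basin_compatibility`, `w_logits`, the tensor layout) are illustrative assumptions, not the library's API.

```python
import torch
import torch.nn.functional as F

# Illustrative sketch only: combines the three geometric components with
# softmax-normalized learned weights. Names and shapes are assumptions.
def basin_compatibility(feats, centroids, cantor_proto, cantor_pos, vertices, w_logits):
    """feats:        [B, D]    features at one scale
    centroids:    [C, D]    per-class simplex centroids
    cantor_proto: [C]       learned scalar Cantor prototype per class
    cantor_pos:   [B]       each feature's Cantor position in [0, 1]
    vertices:     [C, K, D] simplex vertices (K = k + 1)
    w_logits:     [3]       learnable logits for (feature, cantor, crystal)
    """
    # 1) Feature similarity: cosine similarity to each class centroid.
    feat_score = F.normalize(feats, dim=-1) @ F.normalize(centroids, dim=-1).T  # [B, C]
    # 2) Cantor coherence: closeness of the feature's Cantor position to each prototype.
    cantor_score = -torch.abs(cantor_pos[:, None] - cantor_proto[None, :])      # [B, C]
    # 3) Crystal geometry: negative distance to the nearest simplex vertex.
    d = torch.cdist(feats, vertices.reshape(-1, feats.shape[-1]))               # [B, C*K]
    crystal_score = -d.reshape(feats.shape[0], vertices.shape[0], -1).min(-1).values
    # Convex combination: the weights sum to 1, matching how the reported
    # Feature/Cantor/Crystal weights are normalized.
    w = torch.softmax(w_logits, dim=0)
    return w[0] * feat_score + w[1] * cantor_score + w[2] * crystal_score       # [B, C]
```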

## 🔬 Learned Structure

### Alpha Convergence (Global Cantor Stairs)

The alpha parameter controls middle-interval weighting in the Cantor staircase.

- **Initial**: 0.3301
- **Final**: 0.3377
- **Change**: +0.0076
- **Converged to 0.5**: False

The Cantor staircase uses soft triadic decomposition with a learnable alpha to map
features into [0,1] space with fractal structure.
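As an illustration of the idea, a soft triadic map with a learnable middle-interval weight might look like the sketch below. `soft_cantor_position` and its arguments are hypothetical; the actual Cantor-stairs implementation may differ.

```python
import torch

# Hypothetical sketch: a soft Cantor-staircase map. At each level the input is
# soft-assigned to the left/middle/right third of its interval; left adds 0,
# right adds the current scale, and the middle contributes alpha * scale.
def soft_cantor_position(x, alpha, tau=0.25, depth=4):
    pos = torch.zeros_like(x)
    scale = 0.5
    for _ in range(depth):
        t = 3.0 * x  # stretch the current interval across [0, 3)
        centers = torch.tensor([0.5, 1.5, 2.5], device=x.device)
        # Soft third-membership via squared distance to third centers.
        p = torch.softmax(-((t.unsqueeze(-1) - centers) ** 2) / tau, dim=-1)
        pos = pos + scale * (p[..., 2] + alpha * p[..., 1])
        x = t - torch.floor(t)  # recurse into the selected third
        scale *= 0.5
    return pos  # values stay in [0, 1]
```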

### Cantor Prototype Distribution

Each class has a learned scalar Cantor prototype. The model pulls features toward
their class's Cantor position.

**Scale 448D**:
- Mean: 0.3299
- Std: 0.1153
- Range: [0.0698, 0.5232]

**Scale 512D**:
- Mean: 0.3303
- Std: 0.1152
- Range: [0.0707, 0.5232]

**Scale 576D**:
- Mean: 0.3406
- Std: 0.1138
- Range: [0.0846, 0.5392]

**Scale 640D**:
- Mean: 0.3284
- Std: 0.1156
- Range: [0.0675, 0.5210]

**Scale 704D**:
- Mean: 0.3376
- Std: 0.1141
- Range: [0.0799, 0.5346]

**Scale 768D**:
- Mean: 0.3321
- Std: 0.1149
- Range: [0.0728, 0.5256]

Most classes cluster around 0.33 (the lower-middle Cantor region), with a smooth spread across roughly [0.07, 0.54].
This creates a continuous manifold rather than discrete bins.

### Geometric Weight Evolution

Each scale learns optimal weights for combining the geometric components:

- **Scale 448D**: Feature=0.653, Cantor=0.071, Crystal=0.276
- **Scale 512D**: Feature=0.610, Cantor=0.072, Crystal=0.318
- **Scale 576D**: Feature=0.879, Cantor=0.026, Crystal=0.096
- **Scale 640D**: Feature=0.578, Cantor=0.071, Crystal=0.351
- **Scale 704D**: Feature=0.822, Cantor=0.030, Crystal=0.148
- **Scale 768D**: Feature=0.668, Cantor=0.048, Crystal=0.285

**Pattern**: Feature similarity dominates at every scale; 576D and 704D rely on it almost exclusively, while the remaining scales shift roughly a third of their weight to crystal geometry. Cantor coherence stays small throughout.
This per-scale weighting strategy emerges from training.

## 💻 Usage

```python
import torch
from safetensors.torch import load_file
from geovocab2.train.model.core.geo_fractal_david import GeoFractalDavid

# Load model (arguments match train_config.json for this run)
model = GeoFractalDavid(
    feature_dim=512,
    num_classes=1000,
    k=50,
    scales=[448, 512, 576, 640, 704, 768],
    alpha_init=0.25,
    tau=0.25
)

state_dict = load_file("weights/GeoFractalDavid-Basin-k50/20251016_011725/model.safetensors")
model.load_state_dict(state_dict)
model.eval()

# Inference; `features` are precomputed CLIP image features, shape [batch_size, 512]
with torch.no_grad():
    logits = model(features)  # [batch_size, 1000]
    predictions = logits.argmax(dim=-1)

# Inspect learned structure
print(f"Global Alpha: {model.cantor_stairs.alpha.item():.4f}")
geo_weights = model.get_geometric_weights()
cantor_dist = model.get_cantor_interval_distribution(sample_features)
```

## 🎓 Training Details

**Loss Function**: Contrastive Geometric Basin
- Primary: Maximize correct-class compatibility, minimize incorrect-class compatibility
- Regularization: Cantor coherence, separation, discretization

**Optimization**:
- Optimizer: AdamW with separate learning rates
  - Scales: 0.001
  - Fusion weights: 0.0005
  - Cantor stairs: 0.0001
- Weight decay: 1e-05
- Gradient clipping: 5.0
- Scheduler: cosine, 2 warmup epochs

**Data**:
- Dataset: ImageNet-1K CLIP features (clip_vit_b32)
- Batch size: 1024
- Training samples: 1,281,167
- Validation samples: 50,000

**Hub Upload**: Periodic (every 2 epochs)
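The separate learning rates above map naturally onto AdamW parameter groups; a minimal sketch, assuming parameters are grouped as described (the stand-in parameters and group names are illustrative):

```python
import torch

base_lr = 1e-3  # learning_rate from train_config.json

# Stand-in parameters for the three groups described above (illustrative only).
scale_params = [torch.nn.Parameter(torch.randn(4))]       # per-scale heads
fusion_params = [torch.nn.Parameter(torch.randn(4))]      # cross-scale fusion weights
cantor_params = [torch.nn.Parameter(torch.tensor(0.25))]  # Cantor-stairs alpha

optimizer = torch.optim.AdamW(
    [
        {"params": scale_params, "lr": base_lr},         # scales: 1e-3
        {"params": fusion_params, "lr": base_lr * 0.5},  # fusion weights: 5e-4
        {"params": cantor_params, "lr": base_lr * 0.1},  # cantor stairs: 1e-4
    ],
    weight_decay=1e-5,
)
# Cosine decay down to min_lr over the 10-epoch run.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10, eta_min=1e-6)
```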

## 🔑 Key Innovation

**No Cross-Entropy on Arbitrary Weights**

Traditional: `cross_entropy(W @ features + b, labels)`
- W and b are arbitrary learned parameters

**Geometric Basin**: `contrastive_loss(compatibility_scores, labels)`
- Compatibility from geometric structure:
  - Feature ↔ Simplex centroid similarity
  - Feature ↔ Cantor prototype coherence
  - Feature ↔ Simplex vertex distance
- Cross-entropy applied to geometrically meaningful scores
- Structure enforced through geometric regularization

Result: Classification emerges from geometric organization, not arbitrary mappings.
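One way such a contrastive loss over compatibility scores could look, sketched under the assumption that `scores` is a `[batch, classes]` compatibility matrix where higher means a better geometric fit; the margin form and function name are illustrative, not the repo's exact loss.

```python
import torch
import torch.nn.functional as F

# Illustrative sketch: cross-entropy on geometric compatibility scores plus a
# margin term pushing the correct class above the hardest incorrect one.
def contrastive_basin_loss(scores, labels, margin=0.1):
    correct = scores.gather(1, labels[:, None]).squeeze(1)  # [B] correct-class fit
    mask = F.one_hot(labels, scores.shape[1]).bool()
    # Best incorrect-class compatibility per sample.
    incorrect = scores.masked_fill(mask, float("-inf")).max(dim=1).values
    hinge = F.relu(margin + incorrect - correct).mean()
    return F.cross_entropy(scores, labels) + hinge
```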

## 📊 Visualizations

The repository includes visualizations of the learned structure:
- Cantor prototype distributions (histograms per scale)
- Sorted prototype curves (showing the smooth manifold)
- Cross-scale analysis (mean, variance, geometric weights)

See `weights/GeoFractalDavid-Basin-k50/20251016_011725/` for generated plots.

## 📁 Repository Structure

```
weights/GeoFractalDavid-Basin-k50/20251016_011725/
├── best_model_acc67.78.safetensors       # Model weights
├── best_model_acc67.78_metadata.json     # Training metadata
├── train_config.json                     # Training configuration
├── training_history.json                 # Epoch-by-epoch history
├── cantor_prototypes_distribution.png    # Histogram analysis
├── cantor_prototypes_sorted.png          # Sorted manifold view
└── cantor_prototypes_cross_scale.png     # Cross-scale comparison

runs/GeoFractalDavid-Basin-k50/20251016_011725/
└── events.out.tfevents.*                 # TensorBoard logs
```

**Note**: Visualizations (*.png) are generated by running the probe script and should be
copied to the weights directory before uploading to the Hub.

## 🔬 Research

This architecture demonstrates:
1. **Rapid learning** (66% validation accuracy after 1 epoch, comparable to FractalDavid)
2. **Geometric organization** (classes spread smoothly in Cantor space)
3. **Hierarchical strategy** (scales learn different geometric weightings)
4. **Emergent structure** (alpha settles near 0.33, prototypes cluster naturally)

The geometric constraints guide learning toward structured representations
without explicit supervision of the geometric components.

## 📝 Citation

```bibtex
@software{geofractaldavid2025,
  title = {GeoFractalDavid: Geometric Basin Classification},
  author = {AbstractPhil},
  year = {2025},
  url = {https://huggingface.co/AbstractPhil/geofractal-david},
  note = {Multi-scale geometric basin classifier with k-simplex structure}
}
```

## 📄 License

MIT License - See LICENSE file for details.

---

*Model trained on 2025-10-16*
*Run ID: 20251016_011725*
weights/GeoFractalDavid-Basin-k50/20251016_011725/model.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:6cedcff6edca1db61f3fd96a222f1c5a70ebac583e265e8259d41df418e8f797
size 777585564
weights/GeoFractalDavid-Basin-k50/20251016_011725/model_metadata.json ADDED
{
  "epoch": 1,
  "metrics": {
    "val_acc": 67.784,
    "train_acc": 68.7058751903538,
    "scale_accuracies": {
      "448": 65.678,
      "512": 65.72,
      "576": 66.884,
      "640": 65.488,
      "704": 66.068,
      "768": 65.25
    },
    "best_val_acc": 67.784,
    "best_epoch": 1,
    "final_train_acc": 68.7058751903538,
    "training_time": "4m"
  },
  "config": {
    "name": "geofractal_david_basin",
    "run_id": "20251016_011725",
    "dataset_name": "AbstractPhil/imagenet-clip-features-orderly",
    "model_variant": "clip_vit_b32",
    "num_classes": 1000,
    "feature_dim": 512,
    "scales": [448, 512, 576, 640, 704, 768],
    "k": 50,
    "alpha_init": 0.25,
    "tau": 0.25,
    "w_coherence": 0.5,
    "w_separation": 0.3,
    "w_discretization": 0.05,
    "w_geometry": 0.7,
    "w_classification": 5.0,
    "cantor_margin": 0.1,
    "cantor_targets": [0.0, 0.5, 1.0],
    "num_epochs": 10,
    "batch_size": 1024,
    "learning_rate": 0.001,
    "weight_decay": 1e-05,
    "warmup_epochs": 2,
    "gradient_clip": 5.0,
    "scheduler_type": "cosine",
    "min_lr": 1e-06,
    "log_interval": 50,
    "val_interval": 1,
    "save_interval": 5,
    "base_dir": "./geofractal_training",
    "num_workers": 6,
    "pin_memory": true,
    "prefetch_factor": 6,
    "persistent_workers": true,
    "hf_repo": "AbstractPhil/geofractal-david",
    "upload_to_hub": true,
    "private_repo": false,
    "hub_upload_interval": 2
  },
  "diagnostics": {
    "alpha_summary": {
      "global": {
        "initial": 0.3300742506980896,
        "final": 0.33769452571868896,
        "change": 0.007620275020599365,
        "converged_to_0.5": false
      }
    },
    "cantor_prototypes": {
      "448": {"final_mean": 0.3299235999584198, "final_std": 0.11531054228544235, "final_range": [0.06975235044956207, 0.523155927658081]},
      "512": {"final_mean": 0.33029788732528687, "final_std": 0.11516479402780533, "final_range": [0.07068338990211487, 0.5231722593307495]},
      "576": {"final_mean": 0.34062862396240234, "final_std": 0.11377006024122238, "final_range": [0.08460617810487747, 0.5391716957092285]},
      "640": {"final_mean": 0.3284243643283844, "final_std": 0.11555633693933487, "final_range": [0.06751251965761185, 0.5210119485855103]},
      "704": {"final_mean": 0.33759522438049316, "final_std": 0.11413495987653732, "final_range": [0.07985769212245941, 0.5346474051475525]},
      "768": {"final_mean": 0.3321439325809479, "final_std": 0.11485133320093155, "final_range": [0.072843037545681, 0.5255964994430542]}
    },
    "geo_weights": {
      "448": {"feature": 0.6526292562484741, "cantor": 0.07099132984876633, "crystal": 0.27637943625450134},
      "512": {"feature": 0.6095101237297058, "cantor": 0.0720025897026062, "crystal": 0.318487286567688},
      "576": {"feature": 0.8787516355514526, "cantor": 0.02552814781665802, "crystal": 0.09572020173072815},
      "640": {"feature": 0.5784967541694641, "cantor": 0.07067899405956268, "crystal": 0.350824236869812},
      "704": {"feature": 0.822432279586792, "cantor": 0.029528409242630005, "crystal": 0.14803928136825562},
      "768": {"feature": 0.6678752899169922, "cantor": 0.047526054084300995, "crystal": 0.2845986485481262}
    },
    "training_history": {
      "epochs": [1, 2],
      "train_loss": [2.1792692034579693, 1.7109403448363842],
      "train_acc": [61.40464123724698, 68.7058751903538],
      "val_acc": [66.078, 67.784],
      "lr": [0.001, 0.0009755527298894294]
    }
  }
}
weights/GeoFractalDavid-Basin-k50/20251016_011725/train_config.json ADDED
{
  "name": "geofractal_david_basin",
  "run_id": "20251016_011725",
  "dataset_name": "AbstractPhil/imagenet-clip-features-orderly",
  "model_variant": "clip_vit_b32",
  "num_classes": 1000,
  "feature_dim": 512,
  "scales": [448, 512, 576, 640, 704, 768],
  "k": 50,
  "alpha_init": 0.25,
  "tau": 0.25,
  "w_coherence": 0.5,
  "w_separation": 0.3,
  "w_discretization": 0.05,
  "w_geometry": 0.7,
  "w_classification": 5.0,
  "cantor_margin": 0.1,
  "cantor_targets": [0.0, 0.5, 1.0],
  "num_epochs": 10,
  "batch_size": 1024,
  "learning_rate": 0.001,
  "weight_decay": 1e-05,
  "warmup_epochs": 2,
  "gradient_clip": 5.0,
  "scheduler_type": "cosine",
  "min_lr": 1e-06,
  "log_interval": 50,
  "val_interval": 1,
  "save_interval": 5,
  "base_dir": "./geofractal_training",
  "num_workers": 6,
  "pin_memory": true,
  "prefetch_factor": 6,
  "persistent_workers": true,
  "hf_repo": "AbstractPhil/geofractal-david",
  "upload_to_hub": true,
  "private_repo": false,
  "hub_upload_interval": 2
}
weights/GeoFractalDavid-Basin-k50/20251016_011725/training_history.json ADDED
{
  "training_history": {
    "epochs": [1, 2],
    "train_loss": [2.1792692034579693, 1.7109403448363842],
    "train_acc": [61.40464123724698, 68.7058751903538],
    "val_acc": [66.078, 67.784],
    "lr": [0.001, 0.0009755527298894294]
  },
  "loss_components": {
    "contrastive": [2.079581160895741, 1.636730343389054],
    "correct": [0.6767644935522598, 0.5404426808745716],
    "incorrect": [0.45505849252969693, 0.5255934803868635],
    "contrast": [2.3505748449423063, 1.6669818419998828],
    "coherence": [0.17764737147389567, 0.12312120711282118],
    "separation": [0.01620986014311979, 0.02391165155591096],
    "discretization": [0.12002804970588934, 0.10951803907101405],
    "total": [2.1792692034579693, 1.7109403448363842]
  },
  "scale_accuracies": {
    "448": [65.17, 65.678],
    "512": [65.234, 65.72],
    "576": [65.116, 66.884],
    "640": [65.282, 65.488],
    "704": [64.986, 66.068],
    "768": [64.744, 65.25]
  },
  "alpha_history": [0.3300742506980896, 0.33769452571868896]
}