Upload weights - GeoFractalDavid-Basin-k50 - Run 20251016_011725 - Acc 67.78%
- weights/GeoFractalDavid-Basin-k50/20251016_011725/README.md +266 -0
- weights/GeoFractalDavid-Basin-k50/20251016_011725/model.safetensors +3 -0
- weights/GeoFractalDavid-Basin-k50/20251016_011725/model_metadata.json +183 -0
- weights/GeoFractalDavid-Basin-k50/20251016_011725/train_config.json +50 -0
- weights/GeoFractalDavid-Basin-k50/20251016_011725/training_history.json +88 -0
weights/GeoFractalDavid-Basin-k50/20251016_011725/README.md
ADDED
|
@@ -0,0 +1,266 @@
| 1 |
+
---
|
| 2 |
+
language: en
|
| 3 |
+
license: mit
|
| 4 |
+
tags:
|
| 5 |
+
- image-classification
|
| 6 |
+
- imagenet
|
| 7 |
+
- geometric-basin
|
| 8 |
+
- cantor-coherence
|
| 9 |
+
- multi-scale
|
| 10 |
+
- geofractaldavid
|
| 11 |
+
datasets:
|
| 12 |
+
- imagenet-1k
|
| 13 |
+
metrics:
|
| 14 |
+
- accuracy
|
| 15 |
+
library_name: pytorch
|
| 16 |
+
model-index:
|
| 17 |
+
- name: GeoFractalDavid-Basin-k50
|
| 18 |
+
results:
|
| 19 |
+
- task:
|
| 20 |
+
type: image-classification
|
| 21 |
+
dataset:
|
| 22 |
+
name: ImageNet-1K
|
| 23 |
+
type: imagenet-1k
|
| 24 |
+
metrics:
|
| 25 |
+
- type: accuracy
|
| 26 |
+
value: 67.78
|
| 27 |
+
name: Validation Accuracy
|
| 28 |
+
---
|
| 29 |
+
|
| 30 |
+
# GeoFractalDavid-Basin-k50: Geometric Basin Classification
|
| 31 |
+
|
| 32 |
+
**GeoFractalDavid** classifies by geometric compatibility rather than by cross-entropy over an arbitrary linear layer.
|
| 33 |
+
Features must "fit" geometric signatures: k-simplex shapes, Cantor positions, and hierarchical structure.
|
| 34 |
+
|
| 35 |
+
## 🎯 Performance
|
| 36 |
+
|
| 37 |
+
- **Best Validation Accuracy**: 67.78%
|
| 38 |
+
- **Epoch**: 2/10
|
| 39 |
+
- **Training Time**: 4m
|
| 40 |
+
|
| 41 |
+
### Per-Scale Performance
|
| 42 |
+
- **Scale 448D**: 65.68%
|
| 43 |
+
- **Scale 512D**: 65.72%
|
| 44 |
+
- **Scale 576D**: 66.88%
|
| 45 |
+
- **Scale 640D**: 65.49%
|
| 46 |
+
- **Scale 704D**: 66.07%
|
| 47 |
+
- **Scale 768D**: 65.25%
|
| 48 |
+
|
| 49 |
+
|
| 50 |
+
## 🏗️ Architecture
|
| 51 |
+
|
| 52 |
+
**Model Type**: Multi-scale geometric basin classifier
|
| 53 |
+
|
| 54 |
+
**Core Components**:
|
| 55 |
+
- **Feature Dimension**: 512
|
| 56 |
+
- **Number of Classes**: 1000
|
| 57 |
+
- **k-Simplex Structure**: k=50 (51 vertices per class)
|
| 58 |
+
- **Scales**: [448, 512, 576, 640, 704, 768]
|
| 59 |
+
- **Total Simplex Vertices**: 51,000
|
| 60 |
+
|
| 61 |
+
**Geometric Components**:
|
| 62 |
+
1. **Feature Similarity**: Cosine similarity to k-simplex centroids
|
| 63 |
+
2. **Cantor Coherence**: Distance to learned Cantor prototypes (alpha-normalized)
|
| 64 |
+
3. **Crystal Geometry**: Distance to nearest simplex vertex
|
| 65 |
+
|
| 66 |
+
Each scale learns to weight these components differently.
|
| 67 |
+
|
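The three geometric components listed above can be sketched as one weighted per-class score. This is a minimal illustration, not the repository's implementation: the function names, the use of negated distances as scores, and the toy 2-D data are all assumptions, and the alpha normalization mentioned for the Cantor term is not modelled.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def compatibility(feature, cantor_value, vertices, cantor_prototype, weights):
    """Hypothetical per-class score combining the three components.

    vertices: the class's simplex vertices (k + 1 vectors).
    weights:  (feature, cantor, crystal) weights for one scale.
    """
    dim = len(feature)
    centroid = [sum(v[i] for v in vertices) / len(vertices) for i in range(dim)]
    feature_score = cosine_similarity(feature, centroid)
    # Assumption: distances are turned into scores by negation.
    cantor_score = -abs(cantor_value - cantor_prototype)
    crystal_score = -min(euclidean(feature, v) for v in vertices)
    w_feature, w_cantor, w_crystal = weights
    return (w_feature * feature_score
            + w_cantor * cantor_score
            + w_crystal * crystal_score)


# Toy example: two classes in 2-D, each with a 2-simplex (3 vertices).
class_vertices = [
    [(1.0, 0.0), (0.9, 0.1), (0.9, -0.1)],
    [(0.0, 1.0), (0.1, 0.9), (-0.1, 0.9)],
]
class_prototypes = [0.3, 0.5]
scale_weights = (0.653, 0.071, 0.276)  # the 448D weights reported in this card

feature = (1.0, 0.05)
scores = [
    compatibility(feature, 0.3, verts, proto, scale_weights)
    for verts, proto in zip(class_vertices, class_prototypes)
]
predicted = max(range(len(scores)), key=scores.__getitem__)
print(scores, predicted)
```

The predicted class is the one with the highest compatibility score; in the real model each scale would produce its own scores with its own learned weights.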
| 68 |
+
## 🔬 Learned Structure
|
| 69 |
+
|
| 70 |
+
### Alpha Convergence (Global Cantor Stairs)
|
| 71 |
+
|
| 72 |
+
The alpha parameter controls middle-interval weighting in the Cantor staircase.
|
| 73 |
+
|
| 74 |
+
- **Initial**: 0.3301
|
| 75 |
+
- **Final**: 0.3377
|
| 76 |
+
- **Change**: +0.0076
|
| 77 |
+
- **Converged to 0.5**: False
|
| 78 |
+
|
| 79 |
+
The Cantor staircase uses soft triadic decomposition with learnable alpha to map
|
| 80 |
+
features into [0,1] space with fractal structure.
|
| 81 |
+
|
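For orientation, the classical (hard) Cantor function can be computed from the ternary digits of its input. The sketch below is only that textbook function; the model's soft triadic decomposition with a learnable alpha is not reproduced here, and the function name and `depth` parameter are illustrative assumptions.

```python
def cantor_function(x, depth=32):
    """Classical Cantor (devil's staircase) function on [0, 1].

    Reads ternary digits of x: a digit 1 means x lies in a removed
    middle third, where the function is flat; digits 0 and 2 become
    binary digits 0 and 1 of the result.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    if x == 1.0:
        return 1.0
    value, scale = 0.0, 0.5
    for _ in range(depth):
        x *= 3.0
        digit = int(x)  # current ternary digit: 0, 1, or 2
        x -= digit
        if digit == 1:
            # Inside a middle third: the staircase is constant here.
            return value + scale
        value += scale * (digit // 2)
        scale *= 0.5
    return value


for sample in (0.0, 0.25, 0.4, 0.5, 1.0):
    print(sample, cantor_function(sample))
```

Every input in the middle third [1/3, 2/3] maps to 0.5, which is the flat step a soft variant would smooth out.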
| 82 |
+
### Cantor Prototype Distribution
|
| 83 |
+
|
| 84 |
+
Each class has a learned scalar Cantor prototype. The model pulls features toward
|
| 85 |
+
their class's Cantor position.
|
| 86 |
+
|
| 87 |
+
**Scale 448D**:
|
| 88 |
+
- Mean: 0.3299
|
| 89 |
+
- Std: 0.1153
|
| 90 |
+
- Range: [0.0698, 0.5232]
|
| 91 |
+
|
| 92 |
+
**Scale 512D**:
|
| 93 |
+
- Mean: 0.3303
|
| 94 |
+
- Std: 0.1152
|
| 95 |
+
- Range: [0.0707, 0.5232]
|
| 96 |
+
|
| 97 |
+
**Scale 576D**:
|
| 98 |
+
- Mean: 0.3406
|
| 99 |
+
- Std: 0.1138
|
| 100 |
+
- Range: [0.0846, 0.5392]
|
| 101 |
+
|
| 102 |
+
**Scale 640D**:
|
| 103 |
+
- Mean: 0.3284
|
| 104 |
+
- Std: 0.1156
|
| 105 |
+
- Range: [0.0675, 0.5210]
|
| 106 |
+
|
| 107 |
+
**Scale 704D**:
|
| 108 |
+
- Mean: 0.3376
|
| 109 |
+
- Std: 0.1141
|
| 110 |
+
- Range: [0.0799, 0.5346]
|
| 111 |
+
|
| 112 |
+
**Scale 768D**:
|
| 113 |
+
- Mean: 0.3321
|
| 114 |
+
- Std: 0.1149
|
| 115 |
+
- Range: [0.0728, 0.5256]
|
| 116 |
+
|
| 117 |
+
|
| 118 |
+
Prototypes cluster around 0.33 (per-scale means 0.328 to 0.341) and span roughly [0.07, 0.54], so they occupy the lower half of [0,1] rather than the region around 0.5.
|
| 119 |
+
This creates a continuous manifold rather than discrete bins.
|
| 120 |
+
|
| 121 |
+
### Geometric Weight Evolution
|
| 122 |
+
|
| 123 |
+
Each scale learns optimal weights for combining geometric components:
|
| 124 |
+
|
| 125 |
+
**Scale 448D**: Feature=0.653, Cantor=0.071, Crystal=0.276
|
| 126 |
+
**Scale 512D**: Feature=0.610, Cantor=0.072, Crystal=0.318
|
| 127 |
+
**Scale 576D**: Feature=0.879, Cantor=0.026, Crystal=0.096
|
| 128 |
+
**Scale 640D**: Feature=0.578, Cantor=0.071, Crystal=0.351
|
| 129 |
+
**Scale 704D**: Feature=0.822, Cantor=0.030, Crystal=0.148
|
| 130 |
+
**Scale 768D**: Feature=0.668, Cantor=0.048, Crystal=0.285
|
| 131 |
+
|
| 132 |
+
|
| 133 |
+
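**Pattern**: Feature similarity carries the largest weight at every scale (0.578 to 0.879). Crystal geometry peaks at 640D (0.351) and is lowest at 576D (0.096), and the Cantor term stays below 0.08 throughout, so the weights do not vary monotonically with scale.
|
| 134 |
+
These per-scale weightings are learned during training rather than set by hand.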
|
| 135 |
+
|
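The reported geometric weights can be checked directly. The short sketch below copies the rounded values from the list above and reports the dominant component per scale; the dictionary layout and helper name are illustrative assumptions.

```python
# Rounded per-scale weights as reported in this card.
geo_weights = {
    448: {"feature": 0.653, "cantor": 0.071, "crystal": 0.276},
    512: {"feature": 0.610, "cantor": 0.072, "crystal": 0.318},
    576: {"feature": 0.879, "cantor": 0.026, "crystal": 0.096},
    640: {"feature": 0.578, "cantor": 0.071, "crystal": 0.351},
    704: {"feature": 0.822, "cantor": 0.030, "crystal": 0.148},
    768: {"feature": 0.668, "cantor": 0.048, "crystal": 0.285},
}


def dominant_component(weights):
    """Return the name of the component with the largest weight."""
    return max(weights, key=weights.get)


dominant = {scale: dominant_component(w) for scale, w in geo_weights.items()}
weight_sums = {scale: sum(w.values()) for scale, w in geo_weights.items()}
max_crystal_scale = max(geo_weights, key=lambda s: geo_weights[s]["crystal"])
min_crystal_scale = min(geo_weights, key=lambda s: geo_weights[s]["crystal"])

for scale in sorted(geo_weights):
    print(scale, dominant[scale], round(weight_sums[scale], 3))
print("crystal weight is highest at", max_crystal_scale,
      "and lowest at", min_crystal_scale)
```

The weights at each scale sum to about 1, which suggests they are normalized (for example by a softmax), though the card does not state how.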
| 136 |
+
## 💻 Usage
|
| 137 |
+
|
| 138 |
+
```python
|
| 139 |
+
import torch
|
| 140 |
+
from safetensors.torch import load_file
|
| 141 |
+
from geovocab2.train.model.core.geo_fractal_david import GeoFractalDavid
|
| 142 |
+
|
| 143 |
+
# Load model
|
| 144 |
+
model = GeoFractalDavid(
|
| 145 |
+
feature_dim=512,
|
| 146 |
+
num_classes=1000,
|
| 147 |
+
k=50,
|
| 148 |
+
scales=[448, 512, 576, 640, 704, 768],
|
| 149 |
+
alpha_init=0.25,
|
| 150 |
+
tau=0.25
|
| 151 |
+
)
|
| 152 |
+
|
| 153 |
+
state_dict = load_file("weights/.../model.safetensors")
|
| 154 |
+
model.load_state_dict(state_dict)
|
| 155 |
+
model.eval()
|
| 156 |
+
|
| 157 |
+
# Inference
|
| 158 |
+
with torch.no_grad():
|
| 159 |
+
logits = model(features) # features: CLIP embeddings [batch_size, 512] -> logits [batch_size, 1000]
|
| 160 |
+
predictions = logits.argmax(dim=-1)
|
| 161 |
+
|
| 162 |
+
# Inspect learned structure
|
| 163 |
+
print(f"Global Alpha: {model.cantor_stairs.alpha.item():.4f}")
|
| 164 |
+
geo_weights = model.get_geometric_weights()
|
| 165 |
+
cantor_dist = model.get_cantor_interval_distribution(sample_features)
|
| 166 |
+
```
|
| 167 |
+
|
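To keep the constructor arguments in sync with a given run, they can be read from that run's `model_metadata.json` rather than typed by hand. This is a sketch under assumptions: the key names come from the `config` block of the metadata file, the keyword names from the usage snippet above, and the helper `constructor_kwargs` is hypothetical. The excerpt is inlined so the example runs on its own.

```python
import json

# Excerpt of the "config" block from model_metadata.json for this run.
METADATA_EXCERPT = """
{
  "config": {
    "num_classes": 1000,
    "feature_dim": 512,
    "scales": [448, 512, 576, 640, 704, 768],
    "k": 50,
    "alpha_init": 0.25,
    "tau": 0.25
  }
}
"""

# Keyword arguments used by the GeoFractalDavid(...) call in the usage snippet.
CONSTRUCTOR_KEYS = ("feature_dim", "num_classes", "k", "scales", "alpha_init", "tau")


def constructor_kwargs(metadata):
    """Pick the constructor arguments out of a loaded metadata dict."""
    config = metadata["config"]
    return {key: config[key] for key in CONSTRUCTOR_KEYS}


kwargs = constructor_kwargs(json.loads(METADATA_EXCERPT))
print(kwargs)
# With the package installed, the model could then be built as
# GeoFractalDavid(**kwargs), assuming the constructor accepts these names.
```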
| 168 |
+
## 🎓 Training Details
|
| 169 |
+
|
| 170 |
+
**Loss Function**: Contrastive Geometric Basin
|
| 171 |
+
- Primary: Maximize correct class compatibility, minimize incorrect
|
| 172 |
+
- Regularization: Cantor coherence, separation, discretization
|
| 173 |
+
|
| 174 |
+
**Optimization**:
|
| 175 |
+
- Optimizer: AdamW with separate learning rates
|
| 176 |
+
- Scales: 0.001
|
| 177 |
+
- Fusion weights: 0.0005
|
| 178 |
+
- Cantor stairs: 0.0001
|
| 179 |
+
- Weight decay: 1e-05
|
| 180 |
+
- Gradient clipping: 5.0
|
| 181 |
+
- Scheduler: cosine
|
| 182 |
+
|
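The per-group learning rates and the cosine schedule can be sketched from the values in `train_config.json`. Two assumptions are made here: the group names are illustrative, and the schedule is taken to be plain cosine annealing from `learning_rate` to `min_lr` over `num_epochs`, without modelling `warmup_epochs`. Under that assumption the formula reproduces the two learning rates logged in `training_history.json`.

```python
import math

# Values copied from train_config.json for this run.
CONFIG = {"learning_rate": 0.001, "min_lr": 1e-06, "num_epochs": 10}


def group_learning_rates(base_lr):
    """Per-group rates as described in the Optimization list."""
    return {
        "scales": base_lr,
        "fusion_weights": base_lr * 0.5,
        "cantor_stairs": base_lr * 0.1,
    }


def cosine_lr(epoch_index, base_lr, min_lr, total_epochs):
    """Cosine-annealed learning rate for a zero-based epoch index."""
    cosine = 0.5 * (1.0 + math.cos(math.pi * epoch_index / total_epochs))
    return min_lr + (base_lr - min_lr) * cosine


rates = group_learning_rates(CONFIG["learning_rate"])
schedule = [
    cosine_lr(i, CONFIG["learning_rate"], CONFIG["min_lr"], CONFIG["num_epochs"])
    for i in range(2)
]
print(rates)
print(schedule)  # compare with the logged "lr" values for epochs 1 and 2
```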
| 183 |
+
**Data**:
|
| 184 |
+
- Dataset: ImageNet-1K CLIP features (clip_vit_b32)
|
| 185 |
+
- Batch size: 1024
|
| 186 |
+
- Training samples: 1,281,167
|
| 187 |
+
- Validation samples: 50,000
|
| 188 |
+
|
| 189 |
+
**Hub Upload**: Periodic (every 2 epochs)
|
| 190 |
+
|
| 191 |
+
## 🔑 Key Innovation
|
| 192 |
+
|
| 193 |
+
**No Cross-Entropy on Arbitrary Weights**
|
| 194 |
+
|
| 195 |
+
Traditional: `cross_entropy(W @ features + b, labels)`
|
| 196 |
+
- W and b are arbitrary learned parameters
|
| 197 |
+
|
| 198 |
+
**Geometric Basin**: `contrastive_loss(compatibility_scores, labels)`
|
| 199 |
+
- Compatibility from geometric structure:
|
| 200 |
+
- Feature ↔ Simplex centroid similarity
|
| 201 |
+
- Feature ↔ Cantor prototype coherence
|
| 202 |
+
- Feature ↔ Simplex vertex distance
|
| 203 |
+
- Cross-entropy applied to geometrically meaningful scores
|
| 204 |
+
- Structure enforced through geometric regularization
|
| 205 |
+
|
| 206 |
+
Result: Classification emerges from geometric organization, not arbitrary mappings.
|
| 207 |
+
|
| 208 |
+
## 📊 Visualizations
|
| 209 |
+
|
| 210 |
+
The repository includes visualizations of learned structure:
|
| 211 |
+
- Cantor prototype distributions (histograms per scale)
|
| 212 |
+
- Sorted prototype curves (showing smooth manifold)
|
| 213 |
+
- Cross-scale analysis (mean, variance, geometric weights)
|
| 214 |
+
|
| 215 |
+
See `weights/GeoFractalDavid-Basin-k50/20251016_011725/` for generated plots.
|
| 216 |
+
|
| 217 |
+
## 📁 Repository Structure
|
| 218 |
+
|
| 219 |
+
```
|
| 220 |
+
weights/GeoFractalDavid-Basin-k50/20251016_011725/
|
| 221 |
+
├── model.safetensors # Model weights
|
| 222 |
+
├── model_metadata.json # Training metadata
|
| 223 |
+
├── train_config.json # Training configuration
|
| 224 |
+
├── training_history.json # Epoch-by-epoch history
|
| 225 |
+
├── cantor_prototypes_distribution.png # Histogram analysis
|
| 226 |
+
├── cantor_prototypes_sorted.png # Sorted manifold view
|
| 227 |
+
└── cantor_prototypes_cross_scale.png # Cross-scale comparison
|
| 228 |
+
|
| 229 |
+
runs/GeoFractalDavid-Basin-k50/20251016_011725/
|
| 230 |
+
└── events.out.tfevents.* # TensorBoard logs
|
| 231 |
+
```
|
| 232 |
+
|
| 233 |
+
**Note**: Visualizations (*.png) are generated by running the probe script and should be
|
| 234 |
+
copied to the weights directory before uploading to Hub.
|
| 235 |
+
|
| 236 |
+
## 🔬 Research
|
| 237 |
+
|
| 238 |
+
This architecture demonstrates:
|
| 239 |
+
1. **Rapid learning** (66.08% validation accuracy after 1 epoch, comparable to FractalDavid)
|
| 240 |
+
2. **Geometric organization** (classes spread smoothly in Cantor space)
|
| 241 |
+
3. **Hierarchical strategy** (scales learn different geometric weightings)
|
| 242 |
+
4. **Emergent structure** (alpha settles near 0.34 in this run rather than 0.5; prototypes cluster naturally)
|
| 243 |
+
|
| 244 |
+
The geometric constraints guide learning toward structured representations
|
| 245 |
+
without explicit supervision of the geometric components.
|
| 246 |
+
|
| 247 |
+
## 📝 Citation
|
| 248 |
+
|
| 249 |
+
```bibtex
|
| 250 |
+
@software{geofractaldavid2025,
|
| 251 |
+
title = {GeoFractalDavid: Geometric Basin Classification},
|
| 252 |
+
author = {AbstractPhil},
|
| 253 |
+
year = {2025},
|
| 254 |
+
url = {https://huggingface.co/AbstractPhil/geofractal-david},
|
| 255 |
+
note = {Multi-scale geometric basin classifier with k-simplex structure}
|
| 256 |
+
}
|
| 257 |
+
```
|
| 258 |
+
|
| 259 |
+
## 📄 License
|
| 260 |
+
|
| 261 |
+
MIT License - See LICENSE file for details.
|
| 262 |
+
|
| 263 |
+
---
|
| 264 |
+
|
| 265 |
+
*Model trained on 2025-10-16*
|
| 266 |
+
*Run ID: 20251016_011725*
|
weights/GeoFractalDavid-Basin-k50/20251016_011725/model.safetensors
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:6cedcff6edca1db61f3fd96a222f1c5a70ebac583e265e8259d41df418e8f797
|
| 3 |
+
size 777585564
|
weights/GeoFractalDavid-Basin-k50/20251016_011725/model_metadata.json
ADDED
|
@@ -0,0 +1,183 @@
| 1 |
+
{
|
| 2 |
+
"epoch": 1,
|
| 3 |
+
"metrics": {
|
| 4 |
+
"val_acc": 67.784,
|
| 5 |
+
"train_acc": 68.7058751903538,
|
| 6 |
+
"scale_accuracies": {
|
| 7 |
+
"448": 65.678,
|
| 8 |
+
"512": 65.72,
|
| 9 |
+
"576": 66.884,
|
| 10 |
+
"640": 65.488,
|
| 11 |
+
"704": 66.068,
|
| 12 |
+
"768": 65.25
|
| 13 |
+
},
|
| 14 |
+
"best_val_acc": 67.784,
|
| 15 |
+
"best_epoch": 1,
|
| 16 |
+
"final_train_acc": 68.7058751903538,
|
| 17 |
+
"training_time": "4m"
|
| 18 |
+
},
|
| 19 |
+
"config": {
|
| 20 |
+
"name": "geofractal_david_basin",
|
| 21 |
+
"run_id": "20251016_011725",
|
| 22 |
+
"dataset_name": "AbstractPhil/imagenet-clip-features-orderly",
|
| 23 |
+
"model_variant": "clip_vit_b32",
|
| 24 |
+
"num_classes": 1000,
|
| 25 |
+
"feature_dim": 512,
|
| 26 |
+
"scales": [
|
| 27 |
+
448,
|
| 28 |
+
512,
|
| 29 |
+
576,
|
| 30 |
+
640,
|
| 31 |
+
704,
|
| 32 |
+
768
|
| 33 |
+
],
|
| 34 |
+
"k": 50,
|
| 35 |
+
"alpha_init": 0.25,
|
| 36 |
+
"tau": 0.25,
|
| 37 |
+
"w_coherence": 0.5,
|
| 38 |
+
"w_separation": 0.3,
|
| 39 |
+
"w_discretization": 0.05,
|
| 40 |
+
"w_geometry": 0.7,
|
| 41 |
+
"w_classification": 5.0,
|
| 42 |
+
"cantor_margin": 0.1,
|
| 43 |
+
"cantor_targets": [
|
| 44 |
+
0.0,
|
| 45 |
+
0.5,
|
| 46 |
+
1.0
|
| 47 |
+
],
|
| 48 |
+
"num_epochs": 10,
|
| 49 |
+
"batch_size": 1024,
|
| 50 |
+
"learning_rate": 0.001,
|
| 51 |
+
"weight_decay": 1e-05,
|
| 52 |
+
"warmup_epochs": 2,
|
| 53 |
+
"gradient_clip": 5.0,
|
| 54 |
+
"scheduler_type": "cosine",
|
| 55 |
+
"min_lr": 1e-06,
|
| 56 |
+
"log_interval": 50,
|
| 57 |
+
"val_interval": 1,
|
| 58 |
+
"save_interval": 5,
|
| 59 |
+
"base_dir": "./geofractal_training",
|
| 60 |
+
"num_workers": 6,
|
| 61 |
+
"pin_memory": true,
|
| 62 |
+
"prefetch_factor": 6,
|
| 63 |
+
"persistent_workers": true,
|
| 64 |
+
"hf_repo": "AbstractPhil/geofractal-david",
|
| 65 |
+
"upload_to_hub": true,
|
| 66 |
+
"private_repo": false,
|
| 67 |
+
"hub_upload_interval": 2
|
| 68 |
+
},
|
| 69 |
+
"diagnostics": {
|
| 70 |
+
"alpha_summary": {
|
| 71 |
+
"global": {
|
| 72 |
+
"initial": 0.3300742506980896,
|
| 73 |
+
"final": 0.33769452571868896,
|
| 74 |
+
"change": 0.007620275020599365,
|
| 75 |
+
"converged_to_0.5": false
|
| 76 |
+
}
|
| 77 |
+
},
|
| 78 |
+
"cantor_prototypes": {
|
| 79 |
+
"448": {
|
| 80 |
+
"final_mean": 0.3299235999584198,
|
| 81 |
+
"final_std": 0.11531054228544235,
|
| 82 |
+
"final_range": [
|
| 83 |
+
0.06975235044956207,
|
| 84 |
+
0.523155927658081
|
| 85 |
+
]
|
| 86 |
+
},
|
| 87 |
+
"512": {
|
| 88 |
+
"final_mean": 0.33029788732528687,
|
| 89 |
+
"final_std": 0.11516479402780533,
|
| 90 |
+
"final_range": [
|
| 91 |
+
0.07068338990211487,
|
| 92 |
+
0.5231722593307495
|
| 93 |
+
]
|
| 94 |
+
},
|
| 95 |
+
"576": {
|
| 96 |
+
"final_mean": 0.34062862396240234,
|
| 97 |
+
"final_std": 0.11377006024122238,
|
| 98 |
+
"final_range": [
|
| 99 |
+
0.08460617810487747,
|
| 100 |
+
0.5391716957092285
|
| 101 |
+
]
|
| 102 |
+
},
|
| 103 |
+
"640": {
|
| 104 |
+
"final_mean": 0.3284243643283844,
|
| 105 |
+
"final_std": 0.11555633693933487,
|
| 106 |
+
"final_range": [
|
| 107 |
+
0.06751251965761185,
|
| 108 |
+
0.5210119485855103
|
| 109 |
+
]
|
| 110 |
+
},
|
| 111 |
+
"704": {
|
| 112 |
+
"final_mean": 0.33759522438049316,
|
| 113 |
+
"final_std": 0.11413495987653732,
|
| 114 |
+
"final_range": [
|
| 115 |
+
0.07985769212245941,
|
| 116 |
+
0.5346474051475525
|
| 117 |
+
]
|
| 118 |
+
},
|
| 119 |
+
"768": {
|
| 120 |
+
"final_mean": 0.3321439325809479,
|
| 121 |
+
"final_std": 0.11485133320093155,
|
| 122 |
+
"final_range": [
|
| 123 |
+
0.072843037545681,
|
| 124 |
+
0.5255964994430542
|
| 125 |
+
]
|
| 126 |
+
}
|
| 127 |
+
},
|
| 128 |
+
"geo_weights": {
|
| 129 |
+
"448": {
|
| 130 |
+
"feature": 0.6526292562484741,
|
| 131 |
+
"cantor": 0.07099132984876633,
|
| 132 |
+
"crystal": 0.27637943625450134
|
| 133 |
+
},
|
| 134 |
+
"512": {
|
| 135 |
+
"feature": 0.6095101237297058,
|
| 136 |
+
"cantor": 0.0720025897026062,
|
| 137 |
+
"crystal": 0.318487286567688
|
| 138 |
+
},
|
| 139 |
+
"576": {
|
| 140 |
+
"feature": 0.8787516355514526,
|
| 141 |
+
"cantor": 0.02552814781665802,
|
| 142 |
+
"crystal": 0.09572020173072815
|
| 143 |
+
},
|
| 144 |
+
"640": {
|
| 145 |
+
"feature": 0.5784967541694641,
|
| 146 |
+
"cantor": 0.07067899405956268,
|
| 147 |
+
"crystal": 0.350824236869812
|
| 148 |
+
},
|
| 149 |
+
"704": {
|
| 150 |
+
"feature": 0.822432279586792,
|
| 151 |
+
"cantor": 0.029528409242630005,
|
| 152 |
+
"crystal": 0.14803928136825562
|
| 153 |
+
},
|
| 154 |
+
"768": {
|
| 155 |
+
"feature": 0.6678752899169922,
|
| 156 |
+
"cantor": 0.047526054084300995,
|
| 157 |
+
"crystal": 0.2845986485481262
|
| 158 |
+
}
|
| 159 |
+
},
|
| 160 |
+
"training_history": {
|
| 161 |
+
"epochs": [
|
| 162 |
+
1,
|
| 163 |
+
2
|
| 164 |
+
],
|
| 165 |
+
"train_loss": [
|
| 166 |
+
2.1792692034579693,
|
| 167 |
+
1.7109403448363842
|
| 168 |
+
],
|
| 169 |
+
"train_acc": [
|
| 170 |
+
61.40464123724698,
|
| 171 |
+
68.7058751903538
|
| 172 |
+
],
|
| 173 |
+
"val_acc": [
|
| 174 |
+
66.078,
|
| 175 |
+
67.784
|
| 176 |
+
],
|
| 177 |
+
"lr": [
|
| 178 |
+
0.001,
|
| 179 |
+
0.0009755527298894294
|
| 180 |
+
]
|
| 181 |
+
}
|
| 182 |
+
}
|
| 183 |
+
}
|
weights/GeoFractalDavid-Basin-k50/20251016_011725/train_config.json
ADDED
|
@@ -0,0 +1,50 @@
| 1 |
+
{
|
| 2 |
+
"name": "geofractal_david_basin",
|
| 3 |
+
"run_id": "20251016_011725",
|
| 4 |
+
"dataset_name": "AbstractPhil/imagenet-clip-features-orderly",
|
| 5 |
+
"model_variant": "clip_vit_b32",
|
| 6 |
+
"num_classes": 1000,
|
| 7 |
+
"feature_dim": 512,
|
| 8 |
+
"scales": [
|
| 9 |
+
448,
|
| 10 |
+
512,
|
| 11 |
+
576,
|
| 12 |
+
640,
|
| 13 |
+
704,
|
| 14 |
+
768
|
| 15 |
+
],
|
| 16 |
+
"k": 50,
|
| 17 |
+
"alpha_init": 0.25,
|
| 18 |
+
"tau": 0.25,
|
| 19 |
+
"w_coherence": 0.5,
|
| 20 |
+
"w_separation": 0.3,
|
| 21 |
+
"w_discretization": 0.05,
|
| 22 |
+
"w_geometry": 0.7,
|
| 23 |
+
"w_classification": 5.0,
|
| 24 |
+
"cantor_margin": 0.1,
|
| 25 |
+
"cantor_targets": [
|
| 26 |
+
0.0,
|
| 27 |
+
0.5,
|
| 28 |
+
1.0
|
| 29 |
+
],
|
| 30 |
+
"num_epochs": 10,
|
| 31 |
+
"batch_size": 1024,
|
| 32 |
+
"learning_rate": 0.001,
|
| 33 |
+
"weight_decay": 1e-05,
|
| 34 |
+
"warmup_epochs": 2,
|
| 35 |
+
"gradient_clip": 5.0,
|
| 36 |
+
"scheduler_type": "cosine",
|
| 37 |
+
"min_lr": 1e-06,
|
| 38 |
+
"log_interval": 50,
|
| 39 |
+
"val_interval": 1,
|
| 40 |
+
"save_interval": 5,
|
| 41 |
+
"base_dir": "./geofractal_training",
|
| 42 |
+
"num_workers": 6,
|
| 43 |
+
"pin_memory": true,
|
| 44 |
+
"prefetch_factor": 6,
|
| 45 |
+
"persistent_workers": true,
|
| 46 |
+
"hf_repo": "AbstractPhil/geofractal-david",
|
| 47 |
+
"upload_to_hub": true,
|
| 48 |
+
"private_repo": false,
|
| 49 |
+
"hub_upload_interval": 2
|
| 50 |
+
}
|
weights/GeoFractalDavid-Basin-k50/20251016_011725/training_history.json
ADDED
|
@@ -0,0 +1,88 @@
| 1 |
+
{
|
| 2 |
+
"training_history": {
|
| 3 |
+
"epochs": [
|
| 4 |
+
1,
|
| 5 |
+
2
|
| 6 |
+
],
|
| 7 |
+
"train_loss": [
|
| 8 |
+
2.1792692034579693,
|
| 9 |
+
1.7109403448363842
|
| 10 |
+
],
|
| 11 |
+
"train_acc": [
|
| 12 |
+
61.40464123724698,
|
| 13 |
+
68.7058751903538
|
| 14 |
+
],
|
| 15 |
+
"val_acc": [
|
| 16 |
+
66.078,
|
| 17 |
+
67.784
|
| 18 |
+
],
|
| 19 |
+
"lr": [
|
| 20 |
+
0.001,
|
| 21 |
+
0.0009755527298894294
|
| 22 |
+
]
|
| 23 |
+
},
|
| 24 |
+
"loss_components": {
|
| 25 |
+
"contrastive": [
|
| 26 |
+
2.079581160895741,
|
| 27 |
+
1.636730343389054
|
| 28 |
+
],
|
| 29 |
+
"correct": [
|
| 30 |
+
0.6767644935522598,
|
| 31 |
+
0.5404426808745716
|
| 32 |
+
],
|
| 33 |
+
"incorrect": [
|
| 34 |
+
0.45505849252969693,
|
| 35 |
+
0.5255934803868635
|
| 36 |
+
],
|
| 37 |
+
"contrast": [
|
| 38 |
+
2.3505748449423063,
|
| 39 |
+
1.6669818419998828
|
| 40 |
+
],
|
| 41 |
+
"coherence": [
|
| 42 |
+
0.17764737147389567,
|
| 43 |
+
0.12312120711282118
|
| 44 |
+
],
|
| 45 |
+
"separation": [
|
| 46 |
+
0.01620986014311979,
|
| 47 |
+
0.02391165155591096
|
| 48 |
+
],
|
| 49 |
+
"discretization": [
|
| 50 |
+
0.12002804970588934,
|
| 51 |
+
0.10951803907101405
|
| 52 |
+
],
|
| 53 |
+
"total": [
|
| 54 |
+
2.1792692034579693,
|
| 55 |
+
1.7109403448363842
|
| 56 |
+
]
|
| 57 |
+
},
|
| 58 |
+
"scale_accuracies": {
|
| 59 |
+
"448": [
|
| 60 |
+
65.17,
|
| 61 |
+
65.678
|
| 62 |
+
],
|
| 63 |
+
"512": [
|
| 64 |
+
65.234,
|
| 65 |
+
65.72
|
| 66 |
+
],
|
| 67 |
+
"576": [
|
| 68 |
+
65.116,
|
| 69 |
+
66.884
|
| 70 |
+
],
|
| 71 |
+
"640": [
|
| 72 |
+
65.282,
|
| 73 |
+
65.488
|
| 74 |
+
],
|
| 75 |
+
"704": [
|
| 76 |
+
64.986,
|
| 77 |
+
66.068
|
| 78 |
+
],
|
| 79 |
+
"768": [
|
| 80 |
+
64.744,
|
| 81 |
+
65.25
|
| 82 |
+
]
|
| 83 |
+
},
|
| 84 |
+
"alpha_history": [
|
| 85 |
+
0.3300742506980896,
|
| 86 |
+
0.33769452571868896
|
| 87 |
+
]
|
| 88 |
+
}
|