Upload weights - GeoFractalDavid-Basin-k12 - Run 20251016_020120 - Acc 67.69%
- weights/GeoFractalDavid-Basin-k12/20251016_020120/README.md +259 -0
- weights/GeoFractalDavid-Basin-k12/20251016_020120/model.safetensors +3 -0
- weights/GeoFractalDavid-Basin-k12/20251016_020120/model_metadata.json +168 -0
- weights/GeoFractalDavid-Basin-k12/20251016_020120/train_config.json +49 -0
- weights/GeoFractalDavid-Basin-k12/20251016_020120/training_history.json +84 -0
weights/GeoFractalDavid-Basin-k12/20251016_020120/README.md
ADDED
@@ -0,0 +1,259 @@
---
language: en
license: mit
tags:
- image-classification
- imagenet
- geometric-basin
- cantor-coherence
- multi-scale
- geofractaldavid
datasets:
- imagenet-1k
metrics:
- accuracy
library_name: pytorch
model-index:
- name: GeoFractalDavid-Basin-k12
  results:
  - task:
      type: image-classification
    dataset:
      name: ImageNet-1K
      type: imagenet-1k
    metrics:
    - type: accuracy
      value: 67.69
      name: Validation Accuracy
---

# GeoFractalDavid-Basin-k12: Geometric Basin Classification

**GeoFractalDavid** classifies through geometric compatibility rather than through cross-entropy over arbitrary learned weights.
Features must "fit" geometric signatures: k-simplex shapes, Cantor positions, and hierarchical structure.

## 🎯 Performance

- **Best Validation Accuracy**: 67.69%
- **Epoch**: 2/10
- **Training Time**: 3m

### Per-Scale Performance
- **Scale 384D**: 66.16%
- **Scale 512D**: 66.40%
- **Scale 768D**: 67.01%
- **Scale 1024D**: 65.70%
- **Scale 1280D**: 61.63%

## 🏗️ Architecture

**Model Type**: Multi-scale geometric basin classifier

**Core Components**:
- **Feature Dimension**: 512
- **Number of Classes**: 1000
- **k-Simplex Structure**: k=12 (13 vertices per class)
- **Scales**: [384, 512, 768, 1024, 1280]
- **Total Simplex Vertices**: 13,000

**Geometric Components**:
1. **Feature Similarity**: Cosine similarity to k-simplex centroids
2. **Cantor Coherence**: Distance to learned Cantor prototypes (alpha-normalized)
3. **Crystal Geometry**: Distance to nearest simplex vertex

Each scale learns to weight these components differently.
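
The per-scale weighting described above can be pictured with a minimal sketch. It assumes the three component scores are blended with softmax-normalized weights; the function names, the toy scores, and the weight logits are all hypothetical and are not taken from the GeoFractalDavid source.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def combine_components(feature_sim, cantor_coh, crystal_geo, weight_logits):
    """Blend three per-class score lists with softmax-normalized weights.

    Assumption: a convex combination of the components; the real model
    may combine them differently.
    """
    w_feature, w_cantor, w_crystal = softmax(weight_logits)
    return [
        w_feature * f + w_cantor * c + w_crystal * g
        for f, c, g in zip(feature_sim, cantor_coh, crystal_geo)
    ]

# Toy scores for three classes (illustrative values, not model outputs).
feature_sim = [0.9, 0.2, 0.1]
cantor_coh = [0.5, 0.6, 0.4]
crystal_geo = [0.7, 0.3, 0.2]

scores = combine_components(feature_sim, cantor_coh, crystal_geo, [1.5, -1.0, 0.0])
predicted_class = max(range(len(scores)), key=scores.__getitem__)
print(predicted_class, [round(s, 3) for s in scores])
```

Because the weights are normalized, each scale can shift emphasis between components without changing the overall scale of the compatibility score.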

## 🔬 Learned Structure

### Alpha Convergence (Global Cantor Stairs)

The alpha parameter controls middle-interval weighting in the Cantor staircase.

- **Initial**: 0.3290 (first logged value; `alpha_init` in the config is 0.25)
- **Final**: 0.3158
- **Change**: -0.0132
- **Converged to 0.5**: False

The Cantor staircase uses soft triadic decomposition with learnable alpha to map
features into [0,1] space with fractal structure.
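
To make "soft triadic decomposition" concrete, here is a minimal sketch of one possible soft Cantor staircase. It is an illustration under stated assumptions, not the model's implementation: the function name, the Gaussian-style soft assignment to thirds, the number of levels, and the use of `tau` as a softness temperature are all hypothetical.

```python
import math

def soft_cantor_staircase(x, alpha=0.3158, tau=0.25, levels=5):
    """Map x in [0, 1] to [0, 1] with a soft triadic (base-3) decomposition.

    At each level the position is softly assigned to the left, middle, or
    right third. Left contributes 0, right contributes 1, and the middle
    third contributes alpha (the learnable weight in the model card).
    """
    value = 0.0
    x = min(max(x, 0.0), 1.0)
    for level in range(1, levels + 1):
        y = x * 3.0                      # position across the three thirds
        centers = (0.5, 1.5, 2.5)        # centers of left, middle, right
        logits = [-((y - c) ** 2) / tau for c in centers]
        peak = max(logits)
        exps = [math.exp(v - peak) for v in logits]
        total = sum(exps)
        p_left, p_mid, p_right = (e / total for e in exps)
        value += (p_right + alpha * p_mid) * 0.5 ** level
        # Descend into the third that contains the point.
        x = y - math.floor(y) if y < 3.0 else 1.0
    return value

for sample in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(sample, round(soft_cantor_staircase(sample), 4))
```

In this sketch a larger alpha raises the output for points in the middle third, which is one way to read "middle-interval weighting".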

### Cantor Prototype Distribution

Each class has a learned scalar Cantor prototype. The model pulls features toward
their class's Cantor position.

**Scale 384D**:
- Mean: 0.2949
- Std: 0.1159
- Range: [0.0695, 0.4995]

**Scale 512D**:
- Mean: 0.2942
- Std: 0.1160
- Range: [0.0690, 0.4994]

**Scale 768D**:
- Mean: 0.3039
- Std: 0.1147
- Range: [0.0746, 0.5010]

**Scale 1024D**:
- Mean: 0.2993
- Std: 0.1153
- Range: [0.0727, 0.4998]

**Scale 1280D**:
- Mean: 0.2973
- Std: 0.1156
- Range: [0.0710, 0.4997]

At every scale the prototypes center near 0.30 (std about 0.12) and span roughly [0.07, 0.50], so in this run they occupy the lower half of [0,1] rather than clustering at 0.5.
The spread within that range forms a continuous manifold rather than discrete bins.

### Geometric Weight Evolution

Each scale learns optimal weights for combining geometric components:

**Scale 384D**: Feature=0.765, Cantor=0.070, Crystal=0.165
**Scale 512D**: Feature=0.717, Cantor=0.072, Crystal=0.211
**Scale 768D**: Feature=0.866, Cantor=0.030, Crystal=0.104
**Scale 1024D**: Feature=0.744, Cantor=0.041, Crystal=0.215
**Scale 1280D**: Feature=0.661, Cantor=0.042, Crystal=0.298

**Pattern**: Feature similarity dominates at every scale (0.66 to 0.87). Crystal geometry receives its largest weight at 1280D (0.298) and its smallest at 768D (0.104); the Cantor weight stays below 0.08 throughout.
This weighting emerges from training.
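
As a quick sanity check on the weights above, the snippet below uses the unrounded `geo_weights` values from `model_metadata.json` in this commit. It only inspects the logged numbers; the observation that each scale's weights sum to 1 suggests, but does not prove, that the model normalizes them (for example with a softmax).

```python
# Values copied from diagnostics.geo_weights in model_metadata.json.
geo_weights = {
    "384": {"feature": 0.7648592591285706, "cantor": 0.07024633139371872, "crystal": 0.16489434242248535},
    "512": {"feature": 0.7165910005569458, "cantor": 0.07196629792451859, "crystal": 0.21144266426563263},
    "768": {"feature": 0.86624675989151, "cantor": 0.02973315492272377, "crystal": 0.10402002185583115},
    "1024": {"feature": 0.744249165058136, "cantor": 0.04059043526649475, "crystal": 0.21516045928001404},
    "1280": {"feature": 0.6605481505393982, "cantor": 0.0416710264980793, "crystal": 0.297780841588974},
}

# Sum of the three weights per scale.
sums = {scale: sum(w.values()) for scale, w in geo_weights.items()}

# Component with the largest weight at each scale.
dominant = {scale: max(w, key=w.get) for scale, w in geo_weights.items()}

# Scale that leans most on crystal geometry.
largest_crystal_scale = max(geo_weights, key=lambda s: geo_weights[s]["crystal"])

for scale in geo_weights:
    print(scale, round(sums[scale], 6), dominant[scale])
print("largest crystal weight at scale", largest_crystal_scale)
```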

## 💻 Usage

```python
import torch
from safetensors.torch import load_file
from geovocab2.train.model.core.geo_fractal_david import GeoFractalDavid

# Load model (hyperparameters match train_config.json for this run)
model = GeoFractalDavid(
    feature_dim=512,
    num_classes=1000,
    k=12,
    scales=[384, 512, 768, 1024, 1280],
    alpha_init=0.25,
    tau=0.25
)

# Replace with the actual checkpoint path; this commit stores the weights as model.safetensors
state_dict = load_file("weights/.../best_model_acc{best_acc:.2f}.safetensors")
model.load_state_dict(state_dict)
model.eval()

# Inference
# features: precomputed CLIP embeddings, shape [batch_size, 512]
with torch.no_grad():
    logits = model(features)  # [batch_size, 1000]
    predictions = logits.argmax(dim=-1)

# Inspect learned structure
print(f"Global Alpha: {model.cantor_stairs.alpha.item():.4f}")
geo_weights = model.get_geometric_weights()
cantor_dist = model.get_cantor_interval_distribution(sample_features)
```

## 🎓 Training Details

**Loss Function**: Contrastive Geometric Basin
- Primary: Maximize correct class compatibility, minimize incorrect
- Regularization: Cantor coherence, separation, discretization
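
The logged loss components in `training_history.json` are consistent with a simple decomposition: total ≈ contrastive + `w_coherence`·coherence + `w_separation`·separation + `w_discretization`·discretization, using the weights from `train_config.json`. This is an observation from the logged numbers, not a statement about the training code; the snippet below only checks the arithmetic.

```python
# Regularization weights from train_config.json.
reg_weights = {"coherence": 0.6, "separation": 0.35, "discretization": 0.05}

# Per-epoch values from loss_components in training_history.json.
history = {
    "contrastive": [2.078349736647103, 1.6415592758609845],
    "coherence": [0.17484173516210294, 0.11351838842534233],
    "separation": [0.01637800338190081, 0.023601053461741926],
    "discretization": [0.1208030286117103, 0.1209111282417474],
    "total": [2.19502723750215, 1.7239762367531895],
}

reconstructed = []
for epoch in range(len(history["total"])):
    regularization = sum(
        weight * history[name][epoch] for name, weight in reg_weights.items()
    )
    reconstructed.append(history["contrastive"][epoch] + regularization)

for epoch, (logged, rebuilt) in enumerate(zip(history["total"], reconstructed), start=1):
    print(f"epoch {epoch}: logged={logged:.6f} rebuilt={rebuilt:.6f}")
```

The small residual (on the order of 1e-8) is plausibly due to averaging over batches before logging.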

**Optimization**:
- Optimizer: AdamW with separate learning rates
  - Scales: 0.001
  - Fusion weights: 0.0005
  - Cantor stairs: 0.0001
- Weight decay: 1e-05
- Gradient clipping: 5.0
- Scheduler: cosine
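
The learning rates logged for this run (0.001, then 0.0009755527...) match plain cosine annealing from `learning_rate` = 0.001 down to `min_lr` = 1e-06 over `num_epochs` = 10, with values taken from `train_config.json`. The sketch below reproduces that schedule in pure Python; the function name and group names are hypothetical, and the 0.5 and 0.1 multipliers follow the per-group factors listed in the model card. No warmup is modeled, since none is visible in the two logged values.

```python
import math

base_lr = 0.001      # learning_rate in train_config.json
min_lr = 1e-06       # min_lr in train_config.json
num_epochs = 10      # num_epochs in train_config.json

def cosine_lr(epoch_index, peak_lr=base_lr):
    """Cosine-annealed learning rate at a zero-based epoch index."""
    cosine = 0.5 * (1.0 + math.cos(math.pi * epoch_index / num_epochs))
    return min_lr + (peak_lr - min_lr) * cosine

# Starting learning rate per parameter group (hypothetical group names).
group_lrs = {
    "scales": base_lr,
    "fusion_weights": base_lr * 0.5,
    "cantor_stairs": base_lr * 0.1,
}

schedule = [cosine_lr(epoch) for epoch in range(num_epochs)]
for epoch, lr in enumerate(schedule, start=1):
    print(f"epoch {epoch}: lr={lr:.10f}")
```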

**Data**:
- Dataset: ImageNet-1K CLIP features (clip_vit_b32)
- Batch size: 1024
- Training samples: 1,281,167
- Validation samples: 50,000

**Hub Upload**: Periodic (every 2 epochs)

## 🔑 Key Innovation

**No Cross-Entropy on Arbitrary Weights**

Traditional: `cross_entropy(W @ features + b, labels)`
- W and b are arbitrary learned parameters

**Geometric Basin**: `contrastive_loss(compatibility_scores, labels)`
- Compatibility from geometric structure:
  - Feature ↔ Simplex centroid similarity
  - Feature ↔ Cantor prototype coherence
  - Feature ↔ Simplex vertex distance
- Cross-entropy applied to geometrically meaningful scores
- Structure enforced through geometric regularization

Result: Classification emerges from geometric organization, not arbitrary mappings.

## 📊 Visualizations

The repository includes visualizations of learned structure:
- Cantor prototype distributions (histograms per scale)
- Sorted prototype curves (showing smooth manifold)
- Cross-scale analysis (mean, variance, geometric weights)

See `weights/GeoFractalDavid-Basin-k12/20251016_020120/` for generated plots.

## 📁 Repository Structure

```
weights/GeoFractalDavid-Basin-k12/20251016_020120/
├── model.safetensors                      # Model weights
├── model_metadata.json                    # Training metadata
├── train_config.json                      # Training configuration
├── training_history.json                  # Epoch-by-epoch history
├── cantor_prototypes_distribution.png     # Histogram analysis
├── cantor_prototypes_sorted.png           # Sorted manifold view
└── cantor_prototypes_cross_scale.png      # Cross-scale comparison

runs/GeoFractalDavid-Basin-k12/20251016_020120/
└── events.out.tfevents.*                  # TensorBoard logs
```

**Note**: Visualizations (*.png) are generated by running the probe script and should be
copied to the weights directory before uploading to Hub.

## 🔬 Research

This architecture demonstrates:
1. **Rapid learning** (66.01% validation accuracy after 1 epoch and 67.69% after 2 in this run)
2. **Geometric organization** (classes spread smoothly in Cantor space)
3. **Hierarchical strategy** (scales learn different geometric weightings)
4. **Emergent structure** (alpha settles near 0.32 in this run rather than 0.5; prototypes cluster naturally)

The geometric constraints guide learning toward structured representations
without explicit supervision of the geometric components.

## 📝 Citation

```bibtex
@software{geofractaldavid2025,
  title = {GeoFractalDavid: Geometric Basin Classification},
  author = {AbstractPhil},
  year = {2025},
  url = {https://huggingface.co/AbstractPhil/geofractal-david},
  note = {Multi-scale geometric basin classifier with k-simplex structure}
}
```

## 📄 License

MIT License - See LICENSE file for details.

---

*Model trained on 2025-10-16*
*Run ID: 20251016_020120*
weights/GeoFractalDavid-Basin-k12/20251016_020120/model.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:19ea4513768845651cd40abf613e51133125843b7f714e3c05f96436481609fc
size 252218244
weights/GeoFractalDavid-Basin-k12/20251016_020120/model_metadata.json
ADDED
@@ -0,0 +1,168 @@
{
  "epoch": 1,
  "metrics": {
    "val_acc": 67.692,
    "train_acc": 68.72242260376672,
    "scale_accuracies": {
      "384": 66.158,
      "512": 66.398,
      "768": 67.006,
      "1024": 65.698,
      "1280": 61.634
    },
    "best_val_acc": 67.692,
    "best_epoch": 1,
    "final_train_acc": 68.72242260376672,
    "training_time": "3m"
  },
  "config": {
    "name": "geofractal_david_basin",
    "run_id": "20251016_020120",
    "dataset_name": "AbstractPhil/imagenet-clip-features-orderly",
    "model_variant": "clip_vit_b32",
    "num_classes": 1000,
    "feature_dim": 512,
    "scales": [
      384,
      512,
      768,
      1024,
      1280
    ],
    "k": 12,
    "alpha_init": 0.25,
    "tau": 0.25,
    "w_coherence": 0.6,
    "w_separation": 0.35,
    "w_discretization": 0.05,
    "w_geometry": 0.7,
    "w_classification": 5.0,
    "cantor_margin": 0.1,
    "cantor_targets": [
      0.0,
      0.5,
      1.0
    ],
    "num_epochs": 10,
    "batch_size": 1024,
    "learning_rate": 0.001,
    "weight_decay": 1e-05,
    "warmup_epochs": 2,
    "gradient_clip": 5.0,
    "scheduler_type": "cosine",
    "min_lr": 1e-06,
    "log_interval": 50,
    "val_interval": 1,
    "save_interval": 5,
    "base_dir": "./geofractal_training",
    "num_workers": 6,
    "pin_memory": true,
    "prefetch_factor": 6,
    "persistent_workers": true,
    "hf_repo": "AbstractPhil/geofractal-david",
    "upload_to_hub": true,
    "private_repo": false,
    "hub_upload_interval": 2
  },
  "diagnostics": {
    "alpha_summary": {
      "global": {
        "initial": 0.329033762216568,
        "final": 0.31584006547927856,
        "change": -0.013193696737289429,
        "converged_to_0.5": false
      }
    },
    "cantor_prototypes": {
      "384": {
        "final_mean": 0.2949255406856537,
        "final_std": 0.11593744903802872,
        "final_range": [
          0.06953522562980652,
          0.4994584918022156
        ]
      },
      "512": {
        "final_mean": 0.2941948473453522,
        "final_std": 0.11603859812021255,
        "final_range": [
          0.068989597260952,
          0.4994165003299713
        ]
      },
      "768": {
        "final_mean": 0.3039235472679138,
        "final_std": 0.11473686248064041,
        "final_range": [
          0.07456477731466293,
          0.5010157823562622
        ]
      },
      "1024": {
        "final_mean": 0.2993130087852478,
        "final_std": 0.11533719301223755,
        "final_range": [
          0.07265479117631912,
          0.49978458881378174
        ]
      },
      "1280": {
        "final_mean": 0.2972503900527954,
        "final_std": 0.11560141295194626,
        "final_range": [
          0.07099252939224243,
          0.49970734119415283
        ]
      }
    },
    "geo_weights": {
      "384": {
        "feature": 0.7648592591285706,
        "cantor": 0.07024633139371872,
        "crystal": 0.16489434242248535
      },
      "512": {
        "feature": 0.7165910005569458,
        "cantor": 0.07196629792451859,
        "crystal": 0.21144266426563263
      },
      "768": {
        "feature": 0.86624675989151,
        "cantor": 0.02973315492272377,
        "crystal": 0.10402002185583115
      },
      "1024": {
        "feature": 0.744249165058136,
        "cantor": 0.04059043526649475,
        "crystal": 0.21516045928001404
      },
      "1280": {
        "feature": 0.6605481505393982,
        "cantor": 0.0416710264980793,
        "crystal": 0.297780841588974
      }
    },
    "training_history": {
      "epochs": [
        1,
        2
      ],
      "train_loss": [
        2.19502723750215,
        1.7239762367531895
      ],
      "train_acc": [
        61.41049527501099,
        68.72242260376672
      ],
      "val_acc": [
        66.01,
        67.692
      ],
      "lr": [
        0.001,
        0.0009755527298894294
      ]
    }
  }
}
weights/GeoFractalDavid-Basin-k12/20251016_020120/train_config.json
ADDED
@@ -0,0 +1,49 @@
{
  "name": "geofractal_david_basin",
  "run_id": "20251016_020120",
  "dataset_name": "AbstractPhil/imagenet-clip-features-orderly",
  "model_variant": "clip_vit_b32",
  "num_classes": 1000,
  "feature_dim": 512,
  "scales": [
    384,
    512,
    768,
    1024,
    1280
  ],
  "k": 12,
  "alpha_init": 0.25,
  "tau": 0.25,
  "w_coherence": 0.6,
  "w_separation": 0.35,
  "w_discretization": 0.05,
  "w_geometry": 0.7,
  "w_classification": 5.0,
  "cantor_margin": 0.1,
  "cantor_targets": [
    0.0,
    0.5,
    1.0
  ],
  "num_epochs": 10,
  "batch_size": 1024,
  "learning_rate": 0.001,
  "weight_decay": 1e-05,
  "warmup_epochs": 2,
  "gradient_clip": 5.0,
  "scheduler_type": "cosine",
  "min_lr": 1e-06,
  "log_interval": 50,
  "val_interval": 1,
  "save_interval": 5,
  "base_dir": "./geofractal_training",
  "num_workers": 6,
  "pin_memory": true,
  "prefetch_factor": 6,
  "persistent_workers": true,
  "hf_repo": "AbstractPhil/geofractal-david",
  "upload_to_hub": true,
  "private_repo": false,
  "hub_upload_interval": 2
}
weights/GeoFractalDavid-Basin-k12/20251016_020120/training_history.json
ADDED
@@ -0,0 +1,84 @@
{
  "training_history": {
    "epochs": [
      1,
      2
    ],
    "train_loss": [
      2.19502723750215,
      1.7239762367531895
    ],
    "train_acc": [
      61.41049527501099,
      68.72242260376672
    ],
    "val_acc": [
      66.01,
      67.692
    ],
    "lr": [
      0.001,
      0.0009755527298894294
    ]
  },
  "loss_components": {
    "contrastive": [
      2.078349736647103,
      1.6415592758609845
    ],
    "correct": [
      0.672378426352248,
      0.5540952974329361
    ],
    "incorrect": [
      0.4581509334639238,
      0.5116512704485903
    ],
    "contrast": [
      2.3537916830553414,
      1.663276688359416
    ],
    "coherence": [
      0.17484173516210294,
      0.11351838842534233
    ],
    "separation": [
      0.01637800338190081,
      0.023601053461741926
    ],
    "discretization": [
      0.1208030286117103,
      0.1209111282417474
    ],
    "total": [
      2.19502723750215,
      1.7239762367531895
    ]
  },
  "scale_accuracies": {
    "384": [
      65.416,
      66.158
    ],
    "512": [
      65.604,
      66.398
    ],
    "768": [
      65.222,
      67.006
    ],
    "1024": [
      64.774,
      65.698
    ],
    "1280": [
      63.768,
      61.634
    ]
  },
  "alpha_history": [
    0.329033762216568,
    0.31584006547927856
  ]
}