Upload weights - GeoFractalDavid-Basin-k12 - Run 20251016_020120 - Acc 67.69%

Files changed (5)

weights/GeoFractalDavid-Basin-k12/20251016_020120/README.md +259 -0
weights/GeoFractalDavid-Basin-k12/20251016_020120/model.safetensors +3 -0
weights/GeoFractalDavid-Basin-k12/20251016_020120/model_metadata.json +168 -0
weights/GeoFractalDavid-Basin-k12/20251016_020120/train_config.json +49 -0
weights/GeoFractalDavid-Basin-k12/20251016_020120/training_history.json +84 -0

weights/GeoFractalDavid-Basin-k12/20251016_020120/README.md ADDED

	@@ -0,0 +1,259 @@

+---
+language: en
+license: mit
+tags:
+- image-classification
+- imagenet
+- geometric-basin
+- cantor-coherence
+- multi-scale
+- geofractaldavid
+datasets:
+- imagenet-1k
+metrics:
+- accuracy
+library_name: pytorch
+model-index:
+- name: GeoFractalDavid-Basin-k12
+  results:
+  - task:
+      type: image-classification
+    dataset:
+      name: ImageNet-1K
+      type: imagenet-1k
+    metrics:
+    - type: accuracy
+      value: 67.69
+      name: Validation Accuracy
+---
+# GeoFractalDavid-Basin-k12: Geometric Basin Classification
+**GeoFractalDavid** classifies by geometric compatibility rather than by cross-entropy over an arbitrary linear layer.
+Features must "fit" each class's geometric signature: its k-simplex shape, Cantor position, and hierarchical structure.
+## 🎯 Performance
+- **Best Validation Accuracy**: 67.69%
+- **Epoch**: 2/10
+- **Training Time**: 3m
+### Per-Scale Performance
+- **Scale 384D**: 66.16%
+- **Scale 512D**: 66.40%
+- **Scale 768D**: 67.01%
+- **Scale 1024D**: 65.70%
+- **Scale 1280D**: 61.63%
+## 🏗️ Architecture
+**Model Type**: Multi-scale geometric basin classifier
+**Core Components**:
+- **Feature Dimension**: 512
+- **Number of Classes**: 1000
+- **k-Simplex Structure**: k=12 (13 vertices per class)
+- **Scales**: [384, 512, 768, 1024, 1280]
+- **Total Simplex Vertices**: 13,000
+**Geometric Components**:
+1. **Feature Similarity**: Cosine similarity to k-simplex centroids
+2. **Cantor Coherence**: Distance to learned Cantor prototypes (alpha-normalized)
+3. **Crystal Geometry**: Distance to nearest simplex vertex
+Each scale learns to weight these components differently.
+## 🔬 Learned Structure
+### Alpha Convergence (Global Cantor Stairs)
+The alpha parameter controls middle-interval weighting in the Cantor staircase.
+- **Initial**: 0.3290
+- **Final**: 0.3158
+- **Change**: -0.0132
+- **Converged to 0.5**: False
+The Cantor staircase uses soft triadic decomposition with learnable alpha to map
+features into [0,1] space with fractal structure.
+### Cantor Prototype Distribution
+Each class has a learned scalar Cantor prototype. The model pulls features toward
+their class's Cantor position.
+**Scale 384D**:
+- Mean: 0.2949
+- Std: 0.1159
+- Range: [0.0695, 0.4995]
+**Scale 512D**:
+- Mean: 0.2942
+- Std: 0.1160
+- Range: [0.0690, 0.4994]
+**Scale 768D**:
+- Mean: 0.3039
+- Std: 0.1147
+- Range: [0.0746, 0.5010]
+**Scale 1024D**:
+- Mean: 0.2993
+- Std: 0.1153
+- Range: [0.0727, 0.4998]
+**Scale 1280D**:
+- Mean: 0.2973
+- Std: 0.1156
+- Range: [0.0710, 0.4997]
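+At every scale the prototypes lie in roughly [0.07, 0.50], with mean ≈ 0.30 and std ≈ 0.12, so classes occupy the lower half of [0,1] rather than clustering around 0.5.
+Within that range the prototypes are spread continuously rather than collapsing into discrete bins.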
+### Geometric Weight Evolution
+Each scale learns its own weights for combining the geometric components:
+- **Scale 384D**: Feature=0.765, Cantor=0.070, Crystal=0.165
+- **Scale 512D**: Feature=0.717, Cantor=0.072, Crystal=0.211
+- **Scale 768D**: Feature=0.866, Cantor=0.030, Crystal=0.104
+- **Scale 1024D**: Feature=0.744, Cantor=0.041, Crystal=0.215
+- **Scale 1280D**: Feature=0.661, Cantor=0.042, Crystal=0.298
+**Pattern**: Feature similarity dominates at every scale (0.66 to 0.87). The crystal weight is largest at 1280D (0.298) and smallest at 768D (0.104), while the Cantor weight stays below 0.08 throughout.
+The trend is not monotonic in scale; these per-scale weightings emerge from training.
+## 💻 Usage
+```python
+import torch
+from safetensors.torch import load_file
+from geovocab2.train.model.core.geo_fractal_david import GeoFractalDavid
+# Build the model with this run's hyperparameters (see train_config.json)
+model = GeoFractalDavid(
+    feature_dim=512,
+    num_classes=1000,
+    k=12,
+    scales=[384, 512, 768, 1024, 1280],
+    alpha_init=0.25,
+    tau=0.25
+)
+# Load the weights uploaded with this run
+state_dict = load_file("weights/.../model.safetensors")
+model.load_state_dict(state_dict)
+model.eval()
+# Placeholder input: replace with real clip_vit_b32 features of shape [batch_size, 512]
+features = torch.randn(8, 512)
+# Inference
+with torch.no_grad():
+    logits = model(features)  # [batch_size, 1000]
+    predictions = logits.argmax(dim=-1)
+# Inspect learned structure
+print(f"Global Alpha: {model.cantor_stairs.alpha.item():.4f}")
+geo_weights = model.get_geometric_weights()
+cantor_dist = model.get_cantor_interval_distribution(features)
+```
+## 🎓 Training Details
+**Loss Function**: Contrastive Geometric Basin
+- Primary: Maximize correct class compatibility, minimize incorrect
+- Regularization: Cantor coherence, separation, discretization
+**Optimization**:
+- Optimizer: AdamW with separate learning rates
+  - Scales: 0.001
+  - Fusion weights: 0.0005
+  - Cantor stairs: 0.0001
+- Weight decay: 1e-05
+- Gradient clipping: 5.0
+- Scheduler: cosine
+**Data**:
+- Dataset: ImageNet-1K CLIP features (clip_vit_b32)
+- Batch size: 1024
+- Training samples: 1,281,167
+- Validation samples: 50,000
+**Hub Upload**: Periodic (every 2 epochs)
+## 🔑 Key Innovation
+**No Cross-Entropy on Arbitrary Weights**
+Traditional: `cross_entropy(W @ features + b, labels)`
+- W and b are arbitrary learned parameters
+**Geometric Basin**: `contrastive_loss(compatibility_scores, labels)`
+- Compatibility from geometric structure:
+  - Feature ↔ Simplex centroid similarity
+  - Feature ↔ Cantor prototype coherence
+  - Feature ↔ Simplex vertex distance
+- Cross-entropy applied to geometrically meaningful scores
+- Structure enforced through geometric regularization
+Result: Classification emerges from geometric organization, not arbitrary mappings.
+## 📊 Visualizations
+The repository includes visualizations of learned structure:
+- Cantor prototype distributions (histograms per scale)
+- Sorted prototype curves (showing smooth manifold)
+- Cross-scale analysis (mean, variance, geometric weights)
+See `weights/GeoFractalDavid-Basin-k12/20251016_020120/` for generated plots.
+## 📁 Repository Structure
+```
+weights/GeoFractalDavid-Basin-k12/20251016_020120/
+  ├── model.safetensors                          # Model weights
+  ├── model_metadata.json                        # Training metadata
+  ├── train_config.json                          # Training configuration
+  ├── training_history.json                      # Epoch-by-epoch history
+  ├── cantor_prototypes_distribution.png         # Histogram analysis
+  ├── cantor_prototypes_sorted.png               # Sorted manifold view
+  └── cantor_prototypes_cross_scale.png          # Cross-scale comparison
+runs/GeoFractalDavid-Basin-k12/20251016_020120/
+  └── events.out.tfevents.*                      # TensorBoard logs
+```
+**Note**: Visualizations (`*.png`) are generated by running the probe script and must be
+copied into the weights directory before uploading to the Hub.
+## 🔬 Research
+This architecture demonstrates:
+1. **Rapid learning** (66.0% validation accuracy after 1 epoch, 67.7% after 2)
+2. **Geometric organization** (classes spread smoothly in Cantor space)
+3. **Hierarchical strategy** (scales learn different geometric weightings)
+4. **Emergent structure** (alpha settles near 0.32 rather than 0.5; prototypes cluster around 0.30)
+The geometric constraints guide learning toward structured representations
+without explicit supervision of the geometric components.
+## 📝 Citation
+```bibtex
+@software{geofractaldavid2025,
+  title = {GeoFractalDavid: Geometric Basin Classification},
+  author = {AbstractPhil},
+  year = {2025},
+  url = {https://huggingface.co/AbstractPhil/geofractal-david},
+  note = {Multi-scale geometric basin classifier with k-simplex structure}
+}
+```
+## 📄 License
+MIT License - See LICENSE file for details.
+---
+*Model trained on 2025-10-16*
+*Run ID: 20251016_020120*
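The README above says each scale weights three geometric components (feature similarity, Cantor coherence, crystal geometry) differently. A minimal sketch of how such weights could combine into one compatibility score, assuming a plain weighted sum: the `combine` helper and the example component scores are hypothetical, and only the weights come from the README.

```python
# Per-scale weights as reported (rounded) in the README.
GEO_WEIGHTS = {
    384: {"feature": 0.765, "cantor": 0.070, "crystal": 0.165},
    512: {"feature": 0.717, "cantor": 0.072, "crystal": 0.211},
    768: {"feature": 0.866, "cantor": 0.030, "crystal": 0.104},
    1024: {"feature": 0.744, "cantor": 0.041, "crystal": 0.215},
    1280: {"feature": 0.661, "cantor": 0.042, "crystal": 0.298},
}


def combine(components, weights):
    """Weighted sum of component scores (assumed combination rule)."""
    return sum(weights[name] * components[name] for name in weights)


# Hypothetical component scores for one (sample, class) pair; higher = more compatible.
example = {"feature": 0.82, "cantor": 0.40, "crystal": 0.55}

scores = {scale: combine(example, w) for scale, w in GEO_WEIGHTS.items()}
for scale, score in scores.items():
    print(f"{scale}D: weight sum={sum(GEO_WEIGHTS[scale].values()):.3f}, score={score:.4f}")
```

The reported weights sum to roughly 1 at every scale, which is consistent with a normalized (for example softmax) parameterization, though the README does not say so.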

weights/GeoFractalDavid-Basin-k12/20251016_020120/model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:19ea4513768845651cd40abf613e51133125843b7f714e3c05f96436481609fc
+size 252218244
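The three lines above are a Git LFS pointer, not the weights themselves: they record the spec version, the SHA-256 of the real file, and its size in bytes. A small sketch of reading those fields, where `parse_lfs_pointer` is a hypothetical helper written for illustration and not part of any library:

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])  # size is given in bytes
    return fields


pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:19ea4513768845651cd40abf613e51133125843b7f714e3c05f96436481609fc
size 252218244
"""

pointer = parse_lfs_pointer(pointer_text)
algo, _, digest = pointer["oid"].partition(":")
size_mib = pointer["size"] / (1024 ** 2)
print(f"{algo} digest {digest[:12]}..., {size_mib:.1f} MiB")
```

The digest can be compared against a locally computed SHA-256 of the downloaded `model.safetensors` to confirm the file is intact.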

weights/GeoFractalDavid-Basin-k12/20251016_020120/model_metadata.json ADDED

	@@ -0,0 +1,168 @@

+{
+  "epoch": 1,
+  "metrics": {
+    "val_acc": 67.692,
+    "train_acc": 68.72242260376672,
+    "scale_accuracies": {
+      "384": 66.158,
+      "512": 66.398,
+      "768": 67.006,
+      "1024": 65.698,
+      "1280": 61.634
+    },
+    "best_val_acc": 67.692,
+    "best_epoch": 1,
+    "final_train_acc": 68.72242260376672,
+    "training_time": "3m"
+  },
+  "config": {
+    "name": "geofractal_david_basin",
+    "run_id": "20251016_020120",
+    "dataset_name": "AbstractPhil/imagenet-clip-features-orderly",
+    "model_variant": "clip_vit_b32",
+    "num_classes": 1000,
+    "feature_dim": 512,
+    "scales": [
+      384,
+      512,
+      768,
+      1024,
+      1280
+    ],
+    "k": 12,
+    "alpha_init": 0.25,
+    "tau": 0.25,
+    "w_coherence": 0.6,
+    "w_separation": 0.35,
+    "w_discretization": 0.05,
+    "w_geometry": 0.7,
+    "w_classification": 5.0,
+    "cantor_margin": 0.1,
+    "cantor_targets": [
+      0.0,
+      0.5,
+      1.0
+    ],
+    "num_epochs": 10,
+    "batch_size": 1024,
+    "learning_rate": 0.001,
+    "weight_decay": 1e-05,
+    "warmup_epochs": 2,
+    "gradient_clip": 5.0,
+    "scheduler_type": "cosine",
+    "min_lr": 1e-06,
+    "log_interval": 50,
+    "val_interval": 1,
+    "save_interval": 5,
+    "base_dir": "./geofractal_training",
+    "num_workers": 6,
+    "pin_memory": true,
+    "prefetch_factor": 6,
+    "persistent_workers": true,
+    "hf_repo": "AbstractPhil/geofractal-david",
+    "upload_to_hub": true,
+    "private_repo": false,
+    "hub_upload_interval": 2
+  },
+  "diagnostics": {
+    "alpha_summary": {
+      "global": {
+        "initial": 0.329033762216568,
+        "final": 0.31584006547927856,
+        "change": -0.013193696737289429,
+        "converged_to_0.5": false
+      }
+    },
+    "cantor_prototypes": {
+      "384": {
+        "final_mean": 0.2949255406856537,
+        "final_std": 0.11593744903802872,
+        "final_range": [
+          0.06953522562980652,
+          0.4994584918022156
+        ]
+      },
+      "512": {
+        "final_mean": 0.2941948473453522,
+        "final_std": 0.11603859812021255,
+        "final_range": [
+          0.068989597260952,
+          0.4994165003299713
+        ]
+      },
+      "768": {
+        "final_mean": 0.3039235472679138,
+        "final_std": 0.11473686248064041,
+        "final_range": [
+          0.07456477731466293,
+          0.5010157823562622
+        ]
+      },
+      "1024": {
+        "final_mean": 0.2993130087852478,
+        "final_std": 0.11533719301223755,
+        "final_range": [
+          0.07265479117631912,
+          0.49978458881378174
+        ]
+      },
+      "1280": {
+        "final_mean": 0.2972503900527954,
+        "final_std": 0.11560141295194626,
+        "final_range": [
+          0.07099252939224243,
+          0.49970734119415283
+        ]
+      }
+    },
+    "geo_weights": {
+      "384": {
+        "feature": 0.7648592591285706,
+        "cantor": 0.07024633139371872,
+        "crystal": 0.16489434242248535
+      },
+      "512": {
+        "feature": 0.7165910005569458,
+        "cantor": 0.07196629792451859,
+        "crystal": 0.21144266426563263
+      },
+      "768": {
+        "feature": 0.86624675989151,
+        "cantor": 0.02973315492272377,
+        "crystal": 0.10402002185583115
+      },
+      "1024": {
+        "feature": 0.744249165058136,
+        "cantor": 0.04059043526649475,
+        "crystal": 0.21516045928001404
+      },
+      "1280": {
+        "feature": 0.6605481505393982,
+        "cantor": 0.0416710264980793,
+        "crystal": 0.297780841588974
+      }
+    },
+    "training_history": {
+      "epochs": [
+        1,
+        2
+      ],
+      "train_loss": [
+        2.19502723750215,
+        1.7239762367531895
+      ],
+      "train_acc": [
+        61.41049527501099,
+        68.72242260376672
+      ],
+      "val_acc": [
+        66.01,
+        67.692
+      ],
+      "lr": [
+        0.001,
+        0.0009755527298894294
+      ]
+    }
+  }
+}
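The `metrics` block above reports both the combined `val_acc` and the accuracy of each scale on its own. A short sketch comparing the two follows; the values are copied inline so it runs standalone, whereas in practice they would be read from `model_metadata.json` with `json.load`.

```python
# Values copied from the "metrics" block of model_metadata.json.
val_acc = 67.692
scale_accuracies = {
    "384": 66.158,
    "512": 66.398,
    "768": 67.006,
    "1024": 65.698,
    "1280": 61.634,
}

# Best and worst individual scales.
best_scale = max(scale_accuracies, key=scale_accuracies.get)
worst_scale = min(scale_accuracies, key=scale_accuracies.get)

# How much the combined prediction improves on the best single scale.
gain = val_acc - scale_accuracies[best_scale]

print(f"best scale:  {best_scale}D ({scale_accuracies[best_scale]:.3f}%)")
print(f"worst scale: {worst_scale}D ({scale_accuracies[worst_scale]:.3f}%)")
print(f"combined gain over best scale: {gain:.3f} points")
```

In this run the combined accuracy is higher than that of any single scale, which suggests the scales contribute complementary information.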

weights/GeoFractalDavid-Basin-k12/20251016_020120/train_config.json ADDED

	@@ -0,0 +1,49 @@

+{
+  "name": "geofractal_david_basin",
+  "run_id": "20251016_020120",
+  "dataset_name": "AbstractPhil/imagenet-clip-features-orderly",
+  "model_variant": "clip_vit_b32",
+  "num_classes": 1000,
+  "feature_dim": 512,
+  "scales": [
+    384,
+    512,
+    768,
+    1024,
+    1280
+  ],
+  "k": 12,
+  "alpha_init": 0.25,
+  "tau": 0.25,
+  "w_coherence": 0.6,
+  "w_separation": 0.35,
+  "w_discretization": 0.05,
+  "w_geometry": 0.7,
+  "w_classification": 5.0,
+  "cantor_margin": 0.1,
+  "cantor_targets": [
+    0.0,
+    0.5,
+    1.0
+  ],
+  "num_epochs": 10,
+  "batch_size": 1024,
+  "learning_rate": 0.001,
+  "weight_decay": 1e-05,
+  "warmup_epochs": 2,
+  "gradient_clip": 5.0,
+  "scheduler_type": "cosine",
+  "min_lr": 1e-06,
+  "log_interval": 50,
+  "val_interval": 1,
+  "save_interval": 5,
+  "base_dir": "./geofractal_training",
+  "num_workers": 6,
+  "pin_memory": true,
+  "prefetch_factor": 6,
+  "persistent_workers": true,
+  "hf_repo": "AbstractPhil/geofractal-david",
+  "upload_to_hub": true,
+  "private_repo": false,
+  "hub_upload_interval": 2
+}
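The config above sets `scheduler_type` to `cosine` with `learning_rate` 0.001, `min_lr` 1e-06 and `num_epochs` 10. The sketch below applies the standard closed-form cosine annealing formula, assuming the period equals `num_epochs`. That assumption is not confirmed by the source, but it reproduces the second logged `lr` value (0.0009755527298894294). `cosine_lr` is an illustrative helper, not the training code.

```python
import math


def cosine_lr(epoch, base_lr, min_lr, total_epochs):
    """Closed-form cosine annealing from base_lr down to min_lr."""
    cosine = 0.5 * (1.0 + math.cos(math.pi * epoch / total_epochs))
    return min_lr + (base_lr - min_lr) * cosine


# Hyperparameters taken from train_config.json.
base_lr, min_lr, num_epochs = 0.001, 1e-06, 10

schedule = [cosine_lr(e, base_lr, min_lr, num_epochs) for e in range(num_epochs)]
for epoch, lr in enumerate(schedule, start=1):
    print(f"epoch {epoch:2d}: lr={lr:.10f}")
```

The logged values show no ramp-up over the first two epochs even though `warmup_epochs` is 2, so warmup may be applied at a finer granularity or not at all; the source does not say which.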

weights/GeoFractalDavid-Basin-k12/20251016_020120/training_history.json ADDED

	@@ -0,0 +1,84 @@

+{
+  "training_history": {
+    "epochs": [
+      1,
+      2
+    ],
+    "train_loss": [
+      2.19502723750215,
+      1.7239762367531895
+    ],
+    "train_acc": [
+      61.41049527501099,
+      68.72242260376672
+    ],
+    "val_acc": [
+      66.01,
+      67.692
+    ],
+    "lr": [
+      0.001,
+      0.0009755527298894294
+    ]
+  },
+  "loss_components": {
+    "contrastive": [
+      2.078349736647103,
+      1.6415592758609845
+    ],
+    "correct": [
+      0.672378426352248,
+      0.5540952974329361
+    ],
+    "incorrect": [
+      0.4581509334639238,
+      0.5116512704485903
+    ],
+    "contrast": [
+      2.3537916830553414,
+      1.663276688359416
+    ],
+    "coherence": [
+      0.17484173516210294,
+      0.11351838842534233
+    ],
+    "separation": [
+      0.01637800338190081,
+      0.023601053461741926
+    ],
+    "discretization": [
+      0.1208030286117103,
+      0.1209111282417474
+    ],
+    "total": [
+      2.19502723750215,
+      1.7239762367531895
+    ]
+  },
+  "scale_accuracies": {
+    "384": [
+      65.416,
+      66.158
+    ],
+    "512": [
+      65.604,
+      66.398
+    ],
+    "768": [
+      65.222,
+      67.006
+    ],
+    "1024": [
+      64.774,
+      65.698
+    ],
+    "1280": [
+      63.768,
+      61.634
+    ]
+  },
+  "alpha_history": [
+    0.329033762216568,
+    0.31584006547927856
+  ]
+}
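The `loss_components` above can be checked against the loss weights in `train_config.json`. Numerically, the logged `total` matches `contrastive + w_coherence * coherence + w_separation * separation + w_discretization * discretization` to within about 1e-8 for both epochs. This is an observation about the logged numbers, not a formula confirmed from the training code, and `w_geometry` and `w_classification` do not appear to enter this sum. A sketch of the check:

```python
# Regularization weights from train_config.json.
weights = {"coherence": 0.6, "separation": 0.35, "discretization": 0.05}

# Per-epoch values copied from training_history.json.
loss_components = {
    "contrastive": [2.078349736647103, 1.6415592758609845],
    "coherence": [0.17484173516210294, 0.11351838842534233],
    "separation": [0.01637800338190081, 0.023601053461741926],
    "discretization": [0.1208030286117103, 0.1209111282417474],
    "total": [2.19502723750215, 1.7239762367531895],
}


def reconstruct_total(components, weights, epoch_index):
    """Contrastive term plus weighted regularizers (assumed combination)."""
    total = components["contrastive"][epoch_index]
    for name, weight in weights.items():
        total += weight * components[name][epoch_index]
    return total


for i, logged in enumerate(loss_components["total"]):
    rebuilt = reconstruct_total(loss_components, weights, i)
    print(f"epoch {i + 1}: logged={logged:.6f} rebuilt={rebuilt:.6f} "
          f"diff={abs(logged - rebuilt):.1e}")
```

The small residual is plausibly due to per-batch averaging or float32 accumulation during logging.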