Upload weights - GeoFractalDavid-Basin-k9 - Run 20251016_000149 - Acc 78.89%

Files changed (5)

weights/GeoFractalDavid-Basin-k9/20251016_000149/README.md +287 -0
weights/GeoFractalDavid-Basin-k9/20251016_000149/model.safetensors +3 -0
weights/GeoFractalDavid-Basin-k9/20251016_000149/model_metadata.json +228 -0
weights/GeoFractalDavid-Basin-k9/20251016_000149/train_config.json +53 -0
weights/GeoFractalDavid-Basin-k9/20251016_000149/training_history.json +100 -0

weights/GeoFractalDavid-Basin-k9/20251016_000149/README.md ADDED

	@@ -0,0 +1,287 @@

+---
+language: en
+license: mit
+tags:
+- image-classification
+- imagenet
+- geometric-basin
+- cantor-coherence
+- multi-scale
+- geofractaldavid
+datasets:
+- imagenet-1k
+metrics:
+- accuracy
+library_name: pytorch
+model-index:
+- name: GeoFractalDavid-Basin-k9
+  results:
+  - task:
+      type: image-classification
+    dataset:
+      name: ImageNet-1K
+      type: imagenet-1k
+    metrics:
+    - type: accuracy
+      value: 78.89
+      name: Validation Accuracy
+---
+# GeoFractalDavid-Basin-k9: Geometric Basin Classification
+**GeoFractalDavid** classifies by geometric compatibility rather than by cross-entropy over an arbitrary linear head.
+Features must "fit" geometric signatures: k-simplex shapes, Cantor positions, and hierarchical structure.
+## 🎯 Performance
+- **Best Validation Accuracy**: 78.89%
+- **Epoch**: 2/10
+- **Training Time**: 5m
+### Per-Scale Performance
+- **Scale 512D**: 78.11%
+- **Scale 576D**: 78.22%
+- **Scale 640D**: 78.29%
+- **Scale 704D**: 78.22%
+- **Scale 768D**: 78.10%
+- **Scale 832D**: 78.11%
+- **Scale 896D**: 77.50%
+- **Scale 960D**: 77.48%
+- **Scale 1024D**: 76.10%
+## 🏗️ Architecture
+**Model Type**: Multi-scale geometric basin classifier
+**Core Components**:
+- **Feature Dimension**: 768
+- **Number of Classes**: 1000
+- **k-Simplex Structure**: k=9 (10 vertices per class)
+- **Scales**: [512, 576, 640, 704, 768, 832, 896, 960, 1024]
+- **Total Simplex Vertices**: 10,000
+**Geometric Components**:
+1. **Feature Similarity**: Cosine similarity to k-simplex centroids
+2. **Cantor Coherence**: Distance to learned Cantor prototypes (alpha-normalized)
+3. **Crystal Geometry**: Distance to nearest simplex vertex
+Each scale learns to weight these components differently.
+## 🔬 Learned Structure
+### Alpha Convergence (Global Cantor Stairs)
+The alpha parameter controls middle-interval weighting in the Cantor staircase.
+- **Initial**: 0.5078
+- **Final**: 0.5387
+- **Change**: +0.0309
+- **Converged to 0.5**: True (per the run diagnostics flag; the final value sits 0.0387 above 0.5)
+The Cantor staircase uses soft triadic decomposition with learnable alpha to map
+features into [0,1] space with fractal structure.
+### Cantor Prototype Distribution
+Each class has a learned scalar Cantor prototype. The model pulls features toward
+their class's Cantor position.
+**Scale 512D**:
+- Mean: 0.5389
+- Std: 0.1279
+- Range: [0.2747, 0.7762]
+**Scale 576D**:
+- Mean: 0.5391
+- Std: 0.1278
+- Range: [0.2753, 0.7758]
+**Scale 640D**:
+- Mean: 0.5391
+- Std: 0.1277
+- Range: [0.2756, 0.7758]
+**Scale 704D**:
+- Mean: 0.5384
+- Std: 0.1274
+- Range: [0.2757, 0.7740]
+**Scale 768D**:
+- Mean: 0.5344
+- Std: 0.1296
+- Range: [0.2642, 0.7727]
+**Scale 832D**:
+- Mean: 0.5376
+- Std: 0.1279
+- Range: [0.2729, 0.7738]
+**Scale 896D**:
+- Mean: 0.5361
+- Std: 0.1295
+- Range: [0.2662, 0.7758]
+**Scale 960D**:
+- Mean: 0.5367
+- Std: 0.1287
+- Range: [0.2695, 0.7749]
+**Scale 1024D**:
+- Mean: 0.5375
+- Std: 0.1283
+- Range: [0.2718, 0.7747]
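+Most classes cluster slightly above 0.5 (middle Cantor region): per-scale means are 0.534 to 0.539 with a standard deviation near 0.13, and prototypes span roughly [0.26, 0.78] rather than the full [0,1] interval.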
+This creates a continuous manifold rather than discrete bins.
+### Geometric Weight Evolution
+Each scale learns optimal weights for combining geometric components:
+**Scale 512D**: Feature=0.692, Cantor=0.072, Crystal=0.235
+**Scale 576D**: Feature=0.651, Cantor=0.073, Crystal=0.276
+**Scale 640D**: Feature=0.619, Cantor=0.074, Crystal=0.307
+**Scale 704D**: Feature=0.626, Cantor=0.066, Crystal=0.308
+**Scale 768D**: Feature=0.829, Cantor=0.031, Crystal=0.140
+**Scale 832D**: Feature=0.694, Cantor=0.048, Crystal=0.258
+**Scale 896D**: Feature=0.802, Cantor=0.032, Crystal=0.166
+**Scale 960D**: Feature=0.735, Cantor=0.038, Crystal=0.227
+**Scale 1024D**: Feature=0.663, Cantor=0.042, Crystal=0.295
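+**Pattern**: Feature similarity dominates at every scale (0.62 to 0.83), crystal geometry contributes 0.14 to 0.31, and Cantor coherence stays below 0.08. The weights do not vary monotonically with scale in this run: the 768D and 896D scales lean most heavily on feature similarity.
+These per-scale weightings are learned during training rather than set by hand.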
+## 💻 Usage
+```python
+import torch
+from safetensors.torch import load_file
+from geovocab2.train.model.core.geo_fractal_david import GeoFractalDavid
+# Build the model with the hyperparameters from train_config.json
+model = GeoFractalDavid(
+    feature_dim=768,
+    num_classes=1000,
+    k=9,
+    scales=[512, 576, 640, 704, 768, 832, 896, 960, 1024],
+    alpha_init=0.5,
+    tau=0.25
+)
+state_dict = load_file("weights/.../model.safetensors")
+model.load_state_dict(state_dict)
+model.eval()
+# Placeholder input: replace with real CLIP ViT-L/14 features of shape [batch_size, 768]
+features = torch.randn(4, 768)
+# Inference
+with torch.no_grad():
+    logits = model(features)  # [batch_size, 1000]
+    predictions = logits.argmax(dim=-1)
+# Inspect learned structure
+print(f"Global Alpha: {model.cantor_stairs.alpha.item():.4f}")
+geo_weights = model.get_geometric_weights()
+cantor_dist = model.get_cantor_interval_distribution(features)
+```
+## 🎓 Training Details
+**Loss Function**: Contrastive Geometric Basin
+- Primary: Maximize correct class compatibility, minimize incorrect
+- Regularization: Cantor coherence, separation, discretization (the discretization weight is 0.0 in this run)
+**Optimization**:
+- Optimizer: AdamW with separate learning rates
+  - Scales: 0.001
+  - Fusion weights: 0.0005
+  - Cantor stairs: 0.0001
+- Weight decay: 1e-05
+- Gradient clipping: 5.0
+- Scheduler: cosine
+**Data**:
+- Dataset: ImageNet-1K CLIP features (clip_vit_l14)
+- Batch size: 1024
+- Training samples: 1,281,167
+- Validation samples: 50,000
+**Hub Upload**: Periodic (every 2 epochs)
+## 🔑 Key Innovation
+**No Cross-Entropy on Arbitrary Weights**
+Traditional: `cross_entropy(W @ features + b, labels)`
+- W and b are arbitrary learned parameters
+**Geometric Basin**: `contrastive_loss(compatibility_scores, labels)`
+- Compatibility from geometric structure:
+  - Feature ↔ Simplex centroid similarity
+  - Feature ↔ Cantor prototype coherence
+  - Feature ↔ Simplex vertex distance
+- Cross-entropy applied to geometrically meaningful scores
+- Structure enforced through geometric regularization
+Result: Classification emerges from geometric organization, not arbitrary mappings.
+## 📊 Visualizations
+The repository includes visualizations of learned structure:
+- Cantor prototype distributions (histograms per scale)
+- Sorted prototype curves (showing smooth manifold)
+- Cross-scale analysis (mean, variance, geometric weights)
+See `weights/GeoFractalDavid-Basin-k9/20251016_000149/` for generated plots.
+## 📁 Repository Structure
+```
+weights/GeoFractalDavid-Basin-k9/20251016_000149/
+  ├── README.md                                  # This model card
+  ├── model.safetensors                          # Model weights
+  ├── model_metadata.json                        # Training metadata
+  ├── train_config.json                          # Training configuration
+  ├── training_history.json                      # Epoch-by-epoch history
+  ├── cantor_prototypes_distribution.png         # Histogram analysis
+  ├── cantor_prototypes_sorted.png               # Sorted manifold view
+  └── cantor_prototypes_cross_scale.png          # Cross-scale comparison
+runs/GeoFractalDavid-Basin-k9/20251016_000149/
+  └── events.out.tfevents.*                      # TensorBoard logs
+```
+**Note**: Visualizations (*.png) are generated by running the probe script and should be
+copied to the weights directory before uploading to Hub.
+## 🔬 Research
+This architecture demonstrates:
+1. **Rapid learning** (78.0% validation accuracy after 1 epoch, comparable to FractalDavid)
+2. **Geometric organization** (classes spread smoothly in Cantor space)
+3. **Hierarchical strategy** (scales learn different geometric weightings)
+4. **Emergent structure** (alpha stays near 0.5, prototypes cluster naturally)
+The geometric constraints guide learning toward structured representations
+without explicit supervision of the geometric components.
+## 📝 Citation
+```bibtex
+@software{geofractaldavid2025,
+  title = {GeoFractalDavid: Geometric Basin Classification},
+  author = {AbstractPhil},
+  year = {2025},
+  url = {https://huggingface.co/AbstractPhil/geofractal-david},
+  note = {Multi-scale geometric basin classifier with k-simplex structure}
+}
+```
+## 📄 License
+MIT License - See LICENSE file for details.
+---
+*Model trained on 2025-10-16*
+*Run ID: 20251016_000149*
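
The README lists three geometric components (cosine similarity to a simplex centroid, Cantor coherence, and distance to the nearest simplex vertex) and says each scale learns weights for combining them. The sketch below shows one way such a per-class score could be assembled. It is an illustration only: the function `compatibility`, the `exp(-distance)` mapping, and the toy two-class data are assumptions and are not taken from the GeoFractalDavid source. Only the Scale 512D weights come from the README.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def compatibility(feature, cantor_value, vertices, cantor_prototype, weights):
    """Hypothetical per-class score: weighted sum of three geometric terms."""
    dim = len(feature)
    # Centroid of the class simplex (mean of its vertices).
    centroid = [sum(v[i] for v in vertices) / len(vertices) for i in range(dim)]
    feature_sim = cosine(feature, centroid)
    # Assumed mapping: exp(-distance) turns a distance into a score in (0, 1].
    cantor_coherence = math.exp(-abs(cantor_value - cantor_prototype))
    nearest_vertex = min(math.dist(feature, v) for v in vertices)
    crystal = math.exp(-nearest_vertex)
    w_feature, w_cantor, w_crystal = weights
    return w_feature * feature_sim + w_cantor * cantor_coherence + w_crystal * crystal

# Scale 512D weights as reported in the README (Feature, Cantor, Crystal).
weights_512 = (0.692, 0.072, 0.235)

# Toy data: 2 classes with 3 vertices each in 2D (the real model uses 10 vertices for k=9).
classes = {
    "a": {"vertices": [[1.0, 0.0], [0.9, 0.1], [1.1, -0.1]], "prototype": 0.3},
    "b": {"vertices": [[0.0, 1.0], [0.1, 0.9], [-0.1, 1.1]], "prototype": 0.7},
}
feature, cantor_value = [0.95, 0.05], 0.35

scores = {
    name: compatibility(feature, cantor_value, c["vertices"], c["prototype"], weights_512)
    for name, c in classes.items()
}
predicted = max(scores, key=scores.get)
print(scores, predicted)
```

With weights like these, the feature-similarity term decides most predictions, which matches the weight table in the README.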

weights/GeoFractalDavid-Basin-k9/20251016_000149/model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:bff67a7ee12ff64dd391d067010635a04c9858b0a781b3f1c35a13a06575bb88
+size 363621916

weights/GeoFractalDavid-Basin-k9/20251016_000149/model_metadata.json ADDED

	@@ -0,0 +1,228 @@

+{
+  "epoch": 1,
+  "metrics": {
+    "val_acc": 78.888,
+    "train_acc": 80.14966042678277,
+    "scale_accuracies": {
+      "512": 78.108,
+      "576": 78.224,
+      "640": 78.294,
+      "704": 78.222,
+      "768": 78.096,
+      "832": 78.112,
+      "896": 77.504,
+      "960": 77.484,
+      "1024": 76.1
+    },
+    "best_val_acc": 78.888,
+    "best_epoch": 1,
+    "final_train_acc": 80.14966042678277,
+    "training_time": "5m"
+  },
+  "config": {
+    "name": "geofractal_david_basin",
+    "run_id": "20251016_000149",
+    "dataset_name": "AbstractPhil/imagenet-clip-features-orderly",
+    "model_variant": "clip_vit_l14",
+    "num_classes": 1000,
+    "feature_dim": 768,
+    "scales": [
+      512,
+      576,
+      640,
+      704,
+      768,
+      832,
+      896,
+      960,
+      1024
+    ],
+    "k": 9,
+    "alpha_init": 0.5,
+    "tau": 0.25,
+    "w_coherence": 0.5,
+    "w_separation": 0.3,
+    "w_discretization": 0.0,
+    "w_geometry": 0.7,
+    "w_classification": 5.0,
+    "cantor_margin": 0.1,
+    "cantor_targets": [
+      0.0,
+      0.5,
+      1.0
+    ],
+    "num_epochs": 10,
+    "batch_size": 1024,
+    "learning_rate": 0.001,
+    "weight_decay": 1e-05,
+    "warmup_epochs": 2,
+    "gradient_clip": 5.0,
+    "scheduler_type": "cosine",
+    "min_lr": 1e-06,
+    "log_interval": 50,
+    "val_interval": 1,
+    "save_interval": 5,
+    "base_dir": "./geofractal_training",
+    "num_workers": 6,
+    "pin_memory": true,
+    "prefetch_factor": 6,
+    "persistent_workers": true,
+    "hf_repo": "AbstractPhil/geofractal-david",
+    "upload_to_hub": true,
+    "private_repo": false,
+    "hub_upload_interval": 2
+  },
+  "diagnostics": {
+    "alpha_summary": {
+      "global": {
+        "initial": 0.5078181624412537,
+        "final": 0.5387266874313354,
+        "change": 0.030908524990081787,
+        "converged_to_0.5": true
+      }
+    },
+    "cantor_prototypes": {
+      "512": {
+        "final_mean": 0.5389490127563477,
+        "final_std": 0.1278502196073532,
+        "final_range": [
+          0.27474915981292725,
+          0.7761662602424622
+        ]
+      },
+      "576": {
+        "final_mean": 0.5391144156455994,
+        "final_std": 0.12775833904743195,
+        "final_range": [
+          0.2752755582332611,
+          0.7758248448371887
+        ]
+      },
+      "640": {
+        "final_mean": 0.5391111373901367,
+        "final_std": 0.12765344977378845,
+        "final_range": [
+          0.2756292223930359,
+          0.775783896446228
+        ]
+      },
+      "704": {
+        "final_mean": 0.5384144186973572,
+        "final_std": 0.12739966809749603,
+        "final_range": [
+          0.27569496631622314,
+          0.7740228176116943
+        ]
+      },
+      "768": {
+        "final_mean": 0.5344298481941223,
+        "final_std": 0.1295781135559082,
+        "final_range": [
+          0.26418763399124146,
+          0.7726764678955078
+        ]
+      },
+      "832": {
+        "final_mean": 0.5376392602920532,
+        "final_std": 0.12788981199264526,
+        "final_range": [
+          0.2729322016239166,
+          0.7738057971000671
+        ]
+      },
+      "896": {
+        "final_mean": 0.5361161231994629,
+        "final_std": 0.12952949106693268,
+        "final_range": [
+          0.2661586105823517,
+          0.775814950466156
+        ]
+      },
+      "960": {
+        "final_mean": 0.5367220640182495,
+        "final_std": 0.1286790519952774,
+        "final_range": [
+          0.2695462107658386,
+          0.7749242186546326
+        ]
+      },
+      "1024": {
+        "final_mean": 0.5374913811683655,
+        "final_std": 0.12825900316238403,
+        "final_range": [
+          0.2718084156513214,
+          0.774700939655304
+        ]
+      }
+    },
+    "geo_weights": {
+      "512": {
+        "feature": 0.6921859979629517,
+        "cantor": 0.07245893776416779,
+        "crystal": 0.23535509407520294
+      },
+      "576": {
+        "feature": 0.650705099105835,
+        "cantor": 0.07325796037912369,
+        "crystal": 0.2760368585586548
+      },
+      "640": {
+        "feature": 0.6189382672309875,
+        "cantor": 0.07382705062627792,
+        "crystal": 0.3072347342967987
+      },
+      "704": {
+        "feature": 0.6264040470123291,
+        "cantor": 0.06589735299348831,
+        "crystal": 0.3076985776424408
+      },
+      "768": {
+        "feature": 0.8293132185935974,
+        "cantor": 0.030711539089679718,
+        "crystal": 0.13997520506381989
+      },
+      "832": {
+        "feature": 0.6941732168197632,
+        "cantor": 0.04803140461444855,
+        "crystal": 0.2577953040599823
+      },
+      "896": {
+        "feature": 0.8018559217453003,
+        "cantor": 0.03219277039170265,
+        "crystal": 0.16595134139060974
+      },
+      "960": {
+        "feature": 0.7354097962379456,
+        "cantor": 0.037818897515535355,
+        "crystal": 0.22677132487297058
+      },
+      "1024": {
+        "feature": 0.662774920463562,
+        "cantor": 0.04228794202208519,
+        "crystal": 0.2949370741844177
+      }
+    },
+    "training_history": {
+      "epochs": [
+        1,
+        2
+      ],
+      "train_loss": [
+        1.9504992645769454,
+        1.4699646668693127
+      ],
+      "train_acc": [
+        74.06965680508473,
+        80.14966042678277
+      ],
+      "val_acc": [
+        78.016,
+        78.888
+      ],
+      "lr": [
+        0.001,
+        0.0009755527298894294
+      ]
+    }
+  }
+}
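
The `geo_weights` entries above sum to approximately 1 at every scale, which would be consistent with a normalized (for example softmax) weighting. That reading is inferred from the numbers and is not stated anywhere in the commit. A minimal check, with values copied from `diagnostics.geo_weights`:

```python
# Values copied from diagnostics.geo_weights in model_metadata.json.
geo_weights = {
    "512": {"feature": 0.6921859979629517, "cantor": 0.07245893776416779, "crystal": 0.23535509407520294},
    "576": {"feature": 0.650705099105835, "cantor": 0.07325796037912369, "crystal": 0.2760368585586548},
    "640": {"feature": 0.6189382672309875, "cantor": 0.07382705062627792, "crystal": 0.3072347342967987},
    "704": {"feature": 0.6264040470123291, "cantor": 0.06589735299348831, "crystal": 0.3076985776424408},
    "768": {"feature": 0.8293132185935974, "cantor": 0.030711539089679718, "crystal": 0.13997520506381989},
    "832": {"feature": 0.6941732168197632, "cantor": 0.04803140461444855, "crystal": 0.2577953040599823},
    "896": {"feature": 0.8018559217453003, "cantor": 0.03219277039170265, "crystal": 0.16595134139060974},
    "960": {"feature": 0.7354097962379456, "cantor": 0.037818897515535355, "crystal": 0.22677132487297058},
    "1024": {"feature": 0.662774920463562, "cantor": 0.04228794202208519, "crystal": 0.2949370741844177},
}

# Sum of the three weights at each scale, and the largest component at each scale.
sums = {scale: sum(w.values()) for scale, w in geo_weights.items()}
dominant = {scale: max(w, key=w.get) for scale, w in geo_weights.items()}

for scale in geo_weights:
    print(f"{scale}D: sum={sums[scale]:.6f}, dominant={dominant[scale]}")
```

Feature similarity is the largest component at all nine scales.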

weights/GeoFractalDavid-Basin-k9/20251016_000149/train_config.json ADDED

	@@ -0,0 +1,53 @@

+{
+  "name": "geofractal_david_basin",
+  "run_id": "20251016_000149",
+  "dataset_name": "AbstractPhil/imagenet-clip-features-orderly",
+  "model_variant": "clip_vit_l14",
+  "num_classes": 1000,
+  "feature_dim": 768,
+  "scales": [
+    512,
+    576,
+    640,
+    704,
+    768,
+    832,
+    896,
+    960,
+    1024
+  ],
+  "k": 9,
+  "alpha_init": 0.5,
+  "tau": 0.25,
+  "w_coherence": 0.5,
+  "w_separation": 0.3,
+  "w_discretization": 0.0,
+  "w_geometry": 0.7,
+  "w_classification": 5.0,
+  "cantor_margin": 0.1,
+  "cantor_targets": [
+    0.0,
+    0.5,
+    1.0
+  ],
+  "num_epochs": 10,
+  "batch_size": 1024,
+  "learning_rate": 0.001,
+  "weight_decay": 1e-05,
+  "warmup_epochs": 2,
+  "gradient_clip": 5.0,
+  "scheduler_type": "cosine",
+  "min_lr": 1e-06,
+  "log_interval": 50,
+  "val_interval": 1,
+  "save_interval": 5,
+  "base_dir": "./geofractal_training",
+  "num_workers": 6,
+  "pin_memory": true,
+  "prefetch_factor": 6,
+  "persistent_workers": true,
+  "hf_repo": "AbstractPhil/geofractal-david",
+  "upload_to_hub": true,
+  "private_repo": false,
+  "hub_upload_interval": 2
+}
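
The `learning_rate`, `min_lr`, `num_epochs`, and `scheduler_type` values above can be cross-checked against the `lr` series logged in `training_history.json`. The sketch below assumes a standard per-epoch cosine annealing formula with a zero-based epoch index. The helper `cosine_lr` is written for this check and is not code from the training script.

```python
import math

def cosine_lr(epoch_index, base_lr=0.001, min_lr=1e-06, num_epochs=10):
    """Cosine annealing from base_lr down to min_lr; epoch_index is zero-based."""
    progress = epoch_index / num_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# "lr" series from training_history.json (epochs 1 and 2).
logged_lr = [0.001, 0.0009755527298894294]

schedule = [cosine_lr(i) for i in range(len(logged_lr))]
for logged, computed in zip(logged_lr, schedule):
    print(f"logged={logged:.10f} computed={computed:.10f}")
```

The two logged values agree with this formula. The per-epoch log shows no warmup effect even though `warmup_epochs` is 2. Warmup may be applied per step or not at all. The commit does not show which.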

weights/GeoFractalDavid-Basin-k9/20251016_000149/training_history.json ADDED

	@@ -0,0 +1,100 @@

+{
+  "training_history": {
+    "epochs": [
+      1,
+      2
+    ],
+    "train_loss": [
+      1.9504992645769454,
+      1.4699646668693127
+    ],
+    "train_acc": [
+      74.06965680508473,
+      80.14966042678277
+    ],
+    "val_acc": [
+      78.016,
+      78.888
+    ],
+    "lr": [
+      0.001,
+      0.0009755527298894294
+    ]
+  },
+  "loss_components": {
+    "contrastive": [
+      1.8601206661984562,
+      1.3918586831313733
+    ],
+    "correct": [
+      0.6632939531399419,
+      0.55859763477557
+    ],
+    "incorrect": [
+      0.45201984307350823,
+      0.4855721479359145
+    ],
+    "contrast": [
+      1.9416335886850145,
+      1.1809499480853827
+    ],
+    "coherence": [
+      0.17106025584933393,
+      0.14298121159698582
+    ],
+    "separation": [
+      0.016161556452275388,
+      0.022051267393652745
+    ],
+    "discretization": [
+      0.12655279778716366,
+      0.10626367490208324
+    ],
+    "total": [
+      1.9504992645769454,
+      1.4699646668693127
+    ]
+  },
+  "scale_accuracies": {
+    "512": [
+      77.674,
+      78.108
+    ],
+    "576": [
+      77.846,
+      78.224
+    ],
+    "640": [
+      77.72,
+      78.294
+    ],
+    "704": [
+      77.636,
+      78.222
+    ],
+    "768": [
+      77.214,
+      78.096
+    ],
+    "832": [
+      77.274,
+      78.112
+    ],
+    "896": [
+      77.268,
+      77.504
+    ],
+    "960": [
+      77.114,
+      77.484
+    ],
+    "1024": [
+      77.0,
+      76.1
+    ]
+  },
+  "alpha_history": [
+    0.5078181624412537,
+    0.5387266874313354
+  ]
+}
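
The logged `total` loss is numerically consistent with the `contrastive` term plus the regularizers scaled by `w_coherence`, `w_separation`, and `w_discretization` from `train_config.json`. This composition is inferred from the numbers and is not confirmed by the training code. The roles of `w_classification` and `w_geometry` are not visible in this log. A minimal check with values copied from `loss_components`:

```python
# Regularizer weights from train_config.json.
w_coherence, w_separation, w_discretization = 0.5, 0.3, 0.0

# Per-epoch values copied from loss_components in training_history.json.
components = {
    "contrastive": [1.8601206661984562, 1.3918586831313733],
    "coherence": [0.17106025584933393, 0.14298121159698582],
    "separation": [0.016161556452275388, 0.022051267393652745],
    "discretization": [0.12655279778716366, 0.10626367490208324],
}
logged_total = [1.9504992645769454, 1.4699646668693127]

# Assumed composition: contrastive term plus weighted regularizers.
reconstructed = [
    con + w_coherence * coh + w_separation * sep + w_discretization * disc
    for con, coh, sep, disc in zip(
        components["contrastive"],
        components["coherence"],
        components["separation"],
        components["discretization"],
    )
]
max_gap = max(abs(r - t) for r, t in zip(reconstructed, logged_total))
print(reconstructed, max_gap)
```

The small remaining gap is plausibly floating-point averaging over batches. Because `w_discretization` is 0.0, the discretization term is logged but does not contribute to the total.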