---
language: en
license: mit
tags:
- image-classification
- imagenet
- geometric-basin
- cantor-coherence
- multi-scale
- geofractaldavid
datasets:
- imagenet-1k
metrics:
- accuracy
library_name: pytorch
model-index:
- name: GeoFractalDavid-Basin-k12
  results:
  - task:
      type: image-classification
    dataset:
      name: ImageNet-1K
      type: imagenet-1k
    metrics:
    - type: accuracy
      value: 71.40
      name: Validation Accuracy
---
# GeoFractalDavid-Basin-k12: Geometric Basin Classification
**GeoFractalDavid** classifies by geometric compatibility rather than by cross-entropy over arbitrary learned weights.
Features must "fit" each class's geometric signature: its k-simplex shape, its Cantor position, and its hierarchical structure.
## 🎯 Performance
- **Best Validation Accuracy**: 71.40%
- **Epoch**: 10/10
- **Training Time**: 18m 45s
### Per-Scale Performance
- **Scale 384D**: 61.25%
- **Scale 512D**: 60.67%
- **Scale 768D**: 70.50%
- **Scale 1024D**: 51.69%
- **Scale 1280D**: 44.72%
## 🏗️ Architecture

**Model Type**: Multi-scale geometric basin classifier

**Core Components**:
- **Feature Dimension**: 512
- **Number of Classes**: 1000
- **k-Simplex Structure**: k=12 (13 vertices per class)
- **Scales**: [384, 512, 768, 1024, 1280]
- **Total Simplex Vertices**: 13,000

**Geometric Components**:
1. **Feature Similarity**: Cosine similarity to k-simplex centroids
2. **Cantor Coherence**: Distance to learned Cantor prototypes (alpha-normalized)
3. **Crystal Geometry**: Distance to nearest simplex vertex

Each scale learns to weight these components differently.
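
As a rough illustration of how the three components could be combined at a single scale, the sketch below scores a batch of features against every class. It is not the repository's implementation: the function name `compatibility_scores`, the tensor layouts, the negative-distance form of the Cantor and crystal terms, and the softmax over three mixing logits are all assumptions made for this example.

```python
import torch
import torch.nn.functional as F

def compatibility_scores(features, vertices, cantor_values, cantor_protos, mix_logits):
    """Hypothetical per-scale scoring; shapes and formulas are assumptions.

    features:      [B, D]       projected features at this scale
    vertices:      [C, K+1, D]  simplex vertices per class (K+1 = 13 for k=12)
    cantor_values: [B]          scalar Cantor coordinate of each sample
    cantor_protos: [C]          learned scalar Cantor prototype per class
    mix_logits:    [3]          learnable logits for the three component weights
    """
    # 1. Feature similarity: cosine similarity to each class's simplex centroid.
    centroids = vertices.mean(dim=1)                                         # [C, D]
    feat_sim = F.normalize(features, dim=-1) @ F.normalize(centroids, dim=-1).T  # [B, C]

    # 2. Cantor coherence: being closer to the class prototype scores higher.
    cantor_coh = -(cantor_values[:, None] - cantor_protos[None, :]).abs()    # [B, C]

    # 3. Crystal geometry: negative distance to the nearest vertex of each class.
    flat = vertices.reshape(-1, vertices.shape[-1])                          # [C*(K+1), D]
    dists = torch.cdist(features, flat).view(features.shape[0], *vertices.shape[:2])
    crystal = -dists.min(dim=-1).values                                      # [B, C]

    # Convex combination: the three weights are non-negative and sum to 1.
    w = torch.softmax(mix_logits, dim=0)
    return w[0] * feat_sim + w[1] * cantor_coh + w[2] * crystal

torch.manual_seed(0)
B, C, K, D = 4, 10, 12, 32  # small demo sizes; the model card reports 1000 classes
scores = compatibility_scores(
    torch.randn(B, D), torch.randn(C, K + 1, D),
    torch.rand(B), torch.rand(C), torch.zeros(3),
)
print(scores.shape)  # one score per (sample, class) pair
```

A softmax over mixing logits is used here only because it yields weights that sum to 1, which matches the Feature/Cantor/Crystal triples reported per scale; the model may parameterize them differently.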
## 🔬 Learned Structure
### Alpha Convergence (Global Cantor Stairs)
The alpha parameter controls middle-interval weighting in the Cantor staircase.

- **Initial**: 0.3290
- **Final**: -0.0764
- **Change**: -0.4055
- **Converged to 0.5**: False

The Cantor staircase uses soft triadic decomposition with learnable alpha to map
features into [0,1] space with fractal structure.
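
To make "soft triadic decomposition" concrete, here is a minimal sketch of one way such a staircase could be written. The function name `soft_cantor_stairs`, the number of levels, the softmax-over-thirds membership, and the use of `tau` as a temperature are assumptions for illustration; the repository's `cantor_stairs` module may differ.

```python
import torch

def soft_cantor_stairs(x, alpha, levels=5, tau=0.25):
    """Hypothetical soft Cantor staircase for inputs in [0, 1].

    At each level the unit interval is split into thirds. A softmax over the
    squared distances to the three interval centers gives soft memberships,
    and the level's "bit" is p_right + alpha * p_middle.
    """
    centers = torch.tensor([0.5, 1.5, 2.5], dtype=x.dtype)
    out = torch.zeros_like(x)
    for level in range(levels):
        y = x * 3.0                              # position in [0, 3]
        d2 = (y[..., None] - centers) ** 2       # squared distance to each third's center
        p = torch.softmax(-d2 / tau, dim=-1)     # soft left / middle / right membership
        bit = p[..., 2] + alpha * p[..., 1]      # alpha weights the middle third
        out = out + bit * 0.5 ** (level + 1)     # binary-style accumulation
        x = y - torch.floor(y)                   # zoom into the current third
    return out

x = torch.linspace(0.0, 0.999, 8)
print(soft_cantor_stairs(x, alpha=torch.tensor(0.5)))
```

With `alpha` in [0, 1] the output of this sketch stays in [0, 1]. A negative `alpha`, such as the final value reported above, makes mass in the middle third subtract from the output.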
### Cantor Prototype Distribution
Each class has a learned scalar Cantor prototype. The model pulls features toward
their class's Cantor position.

**Scale 384D**:
- Mean: 0.0226
- Std: 0.0784
- Range: [-0.1377, 0.1894]

**Scale 512D**:
- Mean: 0.0226
- Std: 0.0784
- Range: [-0.1377, 0.1895]

**Scale 768D**:
- Mean: 0.0227
- Std: 0.0784
- Range: [-0.1373, 0.1897]

**Scale 1024D**:
- Mean: 0.0226
- Std: 0.0784
- Range: [-0.1375, 0.1896]

**Scale 1280D**:
- Mean: 0.0227
- Std: 0.0784
- Range: [-0.1375, 0.1898]

In this run the prototypes cluster near 0 rather than around 0.5: at every scale the mean is about 0.023, the standard deviation about 0.078, and the range roughly [-0.14, 0.19].
Within that range the values spread smoothly, forming a continuous manifold rather than discrete bins.
### Geometric Weight Evolution
Each scale learns its own weights for combining the geometric components:

- **Scale 384D**: Feature=0.929, Cantor=0.020, Crystal=0.051
- **Scale 512D**: Feature=0.885, Cantor=0.023, Crystal=0.092
- **Scale 768D**: Feature=0.996, Cantor=0.001, Crystal=0.003
- **Scale 1024D**: Feature=0.952, Cantor=0.005, Crystal=0.043
- **Scale 1280D**: Feature=0.411, Cantor=0.003, Crystal=0.587

**Pattern**: Every scale up to 1024D relies almost entirely on feature similarity (0.885 to 0.996); only the 1280D scale shifts most of its weight to crystal geometry (0.587). The Cantor weight stays at or below 0.023 at every scale.
This split emerges from training.
## 💻 Usage
```python
import torch
from safetensors.torch import load_file
from geovocab2.train.model.core.geo_fractal_david import GeoFractalDavid

# Build the model with the configuration reported for this checkpoint
model = GeoFractalDavid(
    feature_dim=512,
    num_classes=1000,
    k=12,  # 13 vertices per class
    scales=[384, 512, 768, 1024, 1280],
    alpha_init=0.5,
    tau=0.25
)

# The run directory is elided here; point this at the downloaded checkpoint
state_dict = load_file("weights/.../best_model_acc71.40.safetensors")
model.load_state_dict(state_dict)
model.eval()

# Inference on precomputed CLIP features of shape [batch_size, 512]
features = torch.randn(8, 512)  # placeholder; replace with real CLIP features
with torch.no_grad():
    logits = model(features)  # [batch_size, 1000]
    predictions = logits.argmax(dim=-1)

# Inspect learned structure
print(f"Global Alpha: {model.cantor_stairs.alpha.item():.4f}")
geo_weights = model.get_geometric_weights()
cantor_dist = model.get_cantor_interval_distribution(features)
```
## 🎓 Training Details

**Loss Function**: Contrastive Geometric Basin
- Primary: Maximize compatibility with the correct class, minimize compatibility with incorrect classes
- Regularization: Cantor coherence, separation, discretization

**Optimization**:
- Optimizer: AdamW with a separate learning rate per parameter group, relative to the base rate `config.learning_rate`:
  - Scales: 1.0 × base
  - Fusion weights: 0.5 × base
  - Cantor stairs: 0.1 × base
- Weight decay: `config.weight_decay`
- Gradient clipping: `config.gradient_clip`
- Scheduler: `config.scheduler_type`
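
The per-group learning rates above map naturally onto AdamW parameter groups. The sketch below shows that pattern with stand-in modules; the module names, the numeric values of `base_lr`, `weight_decay`, and `grad_clip`, and the toy loss are placeholders, since this card does not state the actual configuration values or the model's submodule names.

```python
import torch
from torch import nn

# Stand-in parameters; the real model's submodule names are not given in this card.
scale_heads = nn.ModuleList([nn.Linear(512, d) for d in (384, 512, 768, 1024, 1280)])
fusion_logits = nn.Parameter(torch.zeros(5))
cantor_alpha = nn.Parameter(torch.tensor(0.5))

base_lr = 1e-3       # placeholder for config.learning_rate
weight_decay = 1e-2  # placeholder for config.weight_decay
grad_clip = 1.0      # placeholder for config.gradient_clip

optimizer = torch.optim.AdamW(
    [
        {"params": scale_heads.parameters(), "lr": base_lr},        # scales
        {"params": [fusion_logits], "lr": base_lr * 0.5},           # fusion weights
        {"params": [cantor_alpha], "lr": base_lr * 0.1},            # Cantor stairs
    ],
    weight_decay=weight_decay,
)

# One illustrative step with gradient clipping on a toy loss.
x = torch.randn(8, 512)
loss = sum(head(x).pow(2).mean() for head in scale_heads)
loss = loss + fusion_logits.pow(2).sum() + cantor_alpha.pow(2)
optimizer.zero_grad()
loss.backward()
all_params = [p for group in optimizer.param_groups for p in group["params"]]
torch.nn.utils.clip_grad_norm_(all_params, grad_clip)
optimizer.step()

print([group["lr"] for group in optimizer.param_groups])
```

Giving the Cantor staircase the smallest rate keeps the shared alpha parameter from moving as fast as the per-scale heads, which is presumably the intent of the 0.1 multiplier.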

**Data**:
- Dataset: ImageNet-1K CLIP features (variant: `config.model_variant`)
- Batch size: `config.batch_size`
- Training samples: 1,281,167
- Validation samples: 50,000

**Hub Upload**: Periodic (every `config.hub_upload_interval` epochs) when the interval is greater than 0; otherwise at the end of training only.
## 🔑 Key Innovation

**No Cross-Entropy on Arbitrary Weights**

Traditional: `cross_entropy(W @ features + b, labels)`
- W and b are arbitrary learned parameters

**Geometric Basin**: `contrastive_loss(compatibility_scores, labels)`
- Compatibility from geometric structure:
  - Feature ↔ Simplex centroid similarity
  - Feature ↔ Cantor prototype coherence
  - Feature ↔ Simplex vertex distance
- Cross-entropy applied to geometrically meaningful scores
- Structure enforced through geometric regularization

Result: Classification emerges from geometric organization, not arbitrary mappings.
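
A minimal sketch of the primary loss term, assuming it is a softmax cross-entropy taken directly over per-class compatibility scores. The function name `basin_loss` and the `temperature` parameter are hypothetical, and the regularizers (Cantor coherence, separation, discretization) are omitted.

```python
import torch
import torch.nn.functional as F

def basin_loss(scores, labels, temperature=0.1):
    """Hypothetical primary term: cross-entropy over compatibility scores.

    Raising the correct class's score relative to all others is what
    "maximize correct, minimize incorrect" amounts to in this sketch.
    """
    return F.cross_entropy(scores / temperature, labels)

torch.manual_seed(0)
labels = torch.tensor([0, 1, 2, 3])
random_scores = torch.randn(4, 10)  # stand-in for geometric compatibility scores

# Scores that favor the correct class should give a lower loss.
good_scores = random_scores.clone()
good_scores[torch.arange(4), labels] += 5.0

print(basin_loss(random_scores, labels).item(), basin_loss(good_scores, labels).item())
```

The difference from a conventional head lies in where `scores` come from: geometric quantities instead of `W @ features + b`.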
## 📊 Visualizations

The repository includes visualizations of learned structure:
- Cantor prototype distributions (histograms per scale)
- Sorted prototype curves (showing smooth manifold)
- Cross-scale analysis (mean, variance, geometric weights)

See `weights/{model_name}/{config.run_id}/` for generated plots.
## 📁 Repository Structure

```
weights/{model_name}/{config.run_id}/
├── best_model_acc{best_acc:.2f}.safetensors     # Model weights
├── best_model_acc{best_acc:.2f}_metadata.json   # Training metadata
├── train_config.json                            # Training configuration
├── training_history.json                        # Epoch-by-epoch history
├── cantor_prototypes_distribution.png           # Histogram analysis
├── cantor_prototypes_sorted.png                 # Sorted manifold view
└── cantor_prototypes_cross_scale.png            # Cross-scale comparison

runs/{model_name}/{config.run_id}/
└── events.out.tfevents.*                        # TensorBoard logs
```

**Note**: Visualizations (`*.png`) are generated by running the probe script and should be
copied to the weights directory before uploading to the Hub.
## 🔬 Research

This architecture demonstrates:
1. **Rapid learning** (70%+ after 1 epoch, comparable to FractalDavid)
2. **Geometric organization** (classes spread smoothly in Cantor space)
3. **Hierarchical strategy** (scales learn different geometric weightings)
4. **Emergent structure** (alpha drifted from 0.3290 to -0.0764 rather than settling at 0.5; prototypes cluster near 0 at every scale)

The geometric constraints guide learning toward structured representations
without explicit supervision of the geometric components.
## 📝 Citation

```bibtex
@software{geofractaldavid2025,
  title = {GeoFractalDavid: Geometric Basin Classification},
  author = {AbstractPhil},
  year = {2025},
  url = {https://huggingface.co/MODEL_REPO},
  note = {Multi-scale geometric basin classifier with k-simplex structure}
}
```
## 📄 License
MIT License - See LICENSE file for details.

---

*Run ID: 20251016_020120*