---
language: en
license: mit
tags:
- image-classification
- imagenet
- geometric-basin
- cantor-coherence
- multi-scale
- geofractaldavid
datasets:
- imagenet-1k
metrics:
- accuracy
library_name: pytorch
model-index:
- name: GeoFractalDavid-Basin-k12
  results:
  - task:
      type: image-classification
    dataset:
      name: ImageNet-1K
      type: imagenet-1k
    metrics:
    - type: accuracy
      value: 71.40
      name: Validation Accuracy
---

# GeoFractalDavid-Basin-k12: Geometric Basin Classification

**GeoFractalDavid** classifies through geometric compatibility rather than cross-entropy over arbitrary linear weights. Features must "fit" geometric signatures: k-simplex shapes, Cantor positions, and hierarchical structure.

## 🎯 Performance

- **Best Validation Accuracy**: 71.40%
- **Epoch**: 10/10
- **Training Time**: 18m 45s

### Per-Scale Performance

- **Scale 384D**: 61.25%
- **Scale 512D**: 60.67%
- **Scale 768D**: 70.50%
- **Scale 1024D**: 51.69%
- **Scale 1280D**: 44.72%

## 🏗️ Architecture

**Model Type**: Multi-scale geometric basin classifier

**Core Components**:
- **Feature Dimension**: 512
- **Number of Classes**: 1000
- **k-Simplex Structure**: k=12 (13 vertices per class)
- **Scales**: [384, 512, 768, 1024, 1280]
- **Total Simplex Vertices**: 13,000

**Geometric Components**:
1. **Feature Similarity**: Cosine similarity to k-simplex centroids
2. **Cantor Coherence**: Distance to learned Cantor prototypes (alpha-normalized)
3. **Crystal Geometry**: Distance to the nearest simplex vertex

Each scale learns to weight these three components differently.

## 🔬 Learned Structure

### Alpha Convergence (Global Cantor Stairs)

The alpha parameter controls middle-interval weighting in the Cantor staircase.

- **Initial**: 0.3290
- **Final**: -0.0764
- **Change**: -0.4055
- **Converged to 0.5**: False

The Cantor staircase uses a soft triadic decomposition with a learnable alpha to map features into [0, 1] with fractal structure.

### Cantor Prototype Distribution

Each class has a learned scalar Cantor prototype; training pulls features toward their class's Cantor position. The learned distributions are nearly identical across scales:

| Scale | Mean | Std | Range |
|-------|------|-----|-------|
| 384D  | 0.0226 | 0.0784 | [-0.1377, 0.1894] |
| 512D  | 0.0226 | 0.0784 | [-0.1377, 0.1895] |
| 768D  | 0.0227 | 0.0784 | [-0.1373, 0.1897] |
| 1024D | 0.0226 | 0.0784 | [-0.1375, 0.1896] |
| 1280D | 0.0227 | 0.0784 | [-0.1375, 0.1898] |

The prototypes spread smoothly around their mean rather than collapsing into discrete bins, forming a continuous manifold.

### Geometric Weight Evolution

Each scale learns its own weighting of the three geometric components:

| Scale | Feature | Cantor | Crystal |
|-------|---------|--------|---------|
| 384D  | 0.929 | 0.020 | 0.051 |
| 512D  | 0.885 | 0.023 | 0.092 |
| 768D  | 0.996 | 0.001 | 0.003 |
| 1024D | 0.952 | 0.005 | 0.043 |
| 1280D | 0.411 | 0.003 | 0.587 |

**Pattern**: The four smaller scales rely almost entirely on feature similarity, while the largest scale (1280D) shifts most of its weight to crystal geometry. This hierarchical strategy emerges from training rather than being imposed.
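To make these pieces concrete, here is a minimal sketch, assuming a soft Cantor staircase with a learnable alpha on the middle interval and a softmax-normalized mix of the three components. It is illustrative only, not the `geovocab2` implementation: the function names (`soft_cantor_stairs`, `scale_compatibility`), the `component_logits` parameterization, the reduction of a feature vector to a scalar via its mean, and the exact distance-to-score mappings are all assumptions.

```python
import torch
import torch.nn.functional as F

def soft_cantor_stairs(x, alpha, depth=4, tau=0.25):
    """Hypothetical soft triadic decomposition. At each level a point is softly
    assigned to the left / middle / right third of its interval; right thirds
    contribute a binary digit of 1 and the middle interval is weighted by the
    learnable `alpha`, so alpha controls how the "forbidden" middle registers."""
    x = torch.sigmoid(x)                            # squash scalars to (0, 1)
    centers = torch.tensor([1 / 6, 1 / 2, 5 / 6])   # centers of the three thirds
    out, scale = torch.zeros_like(x), 0.5
    for _ in range(depth):
        p = (-(x.unsqueeze(-1) - centers).abs() / tau).softmax(dim=-1)
        out = out + scale * (alpha * p[..., 1] + p[..., 2])
        x = (3.0 * x) % 1.0                         # zoom into the current third
        scale = scale / 2
    return out                                      # fractal position in ~[0, 1]

def scale_compatibility(features, simplex_vertices, cantor_prototypes,
                        component_logits, alpha):
    """Hypothetical per-scale compatibility scores.

    features:          [B, D]       features projected to this scale
    simplex_vertices:  [C, k+1, D]  k-simplex vertices per class (k=12 -> 13)
    cantor_prototypes: [C]          learned scalar prototype per class
    component_logits:  [3]          learnable mixing weights over the components
    """
    B = features.shape[0]
    C, V, _ = simplex_vertices.shape
    centroids = simplex_vertices.mean(dim=1)        # [C, D]

    # 1. Feature similarity: cosine similarity to class centroids.
    feat = F.normalize(features, dim=-1) @ F.normalize(centroids, dim=-1).T  # [B, C]

    # 2. Cantor coherence: closeness of each feature's staircase position
    #    to each class's learned scalar prototype.
    pos = soft_cantor_stairs(features.mean(dim=-1), alpha)                   # [B]
    cantor = -(pos[:, None] - cantor_prototypes[None, :]).abs()              # [B, C]

    # 3. Crystal geometry: negative distance to the nearest simplex vertex.
    d = torch.cdist(features, simplex_vertices.reshape(C * V, -1))           # [B, C*V]
    crystal = -d.reshape(B, C, V).min(dim=-1).values                         # [B, C]

    # Convex combination; the card's reported per-scale weights sum to 1.
    w = component_logits.softmax(dim=0)
    return w[0] * feat + w[1] * cantor + w[2] * crystal

# Quick shape check with this card's dimensions (random, hypothetical tensors)
feats = torch.randn(8, 512)
verts = torch.randn(1000, 13, 512)                  # 1000 classes x 13 vertices
protos = torch.randn(1000) * 0.08
scores = scale_compatibility(feats, verts, protos,
                             component_logits=torch.zeros(3),
                             alpha=torch.tensor(0.33))
print(scores.shape)                                 # torch.Size([8, 1000])
```

The softmax over `component_logits` is chosen because the reported per-scale weights sum to 1; the repository may normalize differently.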
## 💻 Usage

```python
import torch
from safetensors.torch import load_file
from geovocab2.train.model.core.geo_fractal_david import GeoFractalDavid

# Load model (k and scales match this card's architecture)
model = GeoFractalDavid(
    feature_dim=512,
    num_classes=1000,
    k=12,
    scales=[384, 512, 768, 1024, 1280],
    alpha_init=0.5,
    tau=0.25
)
state_dict = load_file("weights/.../best_model_acc{best_acc:.2f}.safetensors")
model.load_state_dict(state_dict)
model.eval()

# Inference on precomputed CLIP features: [batch_size, 512]
with torch.no_grad():
    logits = model(features)            # [batch_size, 1000]
    predictions = logits.argmax(dim=-1)

# Inspect learned structure
print(f"Global Alpha: {model.cantor_stairs.alpha.item():.4f}")
geo_weights = model.get_geometric_weights()
cantor_dist = model.get_cantor_interval_distribution(sample_features)
```

## 🎓 Training Details

**Loss Function**: Contrastive Geometric Basin
- Primary: Maximize correct-class compatibility, minimize incorrect
- Regularization: Cantor coherence, separation, discretization

**Optimization**:
- Optimizer: AdamW with separate learning rates
  - Scales: {config.learning_rate}
  - Fusion weights: {config.learning_rate * 0.5}
  - Cantor stairs: {config.learning_rate * 0.1}
- Weight decay: {config.weight_decay}
- Gradient clipping: {config.gradient_clip}
- Scheduler: {config.scheduler_type}

**Data**:
- Dataset: ImageNet-1K CLIP features ({config.model_variant})
- Batch size: {config.batch_size}
- Training samples: 1,281,167
- Validation samples: 50,000

**Hub Upload**: Periodic (every {config.hub_upload_interval} epochs); end of training only if the interval is 0.

## 🔑 Key Innovation

**No Cross-Entropy on Arbitrary Weights**

Traditional: `cross_entropy(W @ features + b, labels)`
- W and b are arbitrary learned parameters

**Geometric Basin**: `contrastive_loss(compatibility_scores, labels)`
- Compatibility comes from geometric structure:
  - Feature ↔ simplex-centroid similarity
  - Feature ↔ Cantor-prototype coherence
  - Feature ↔ simplex-vertex distance
- Cross-entropy is applied to geometrically meaningful scores
- Structure is enforced through geometric regularization

Result: Classification emerges from geometric organization, not arbitrary mappings.
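As a rough illustration of this training signal, the sketch below assumes the objective is cross-entropy over the compatibility scores plus a Cantor-coherence pull toward class prototypes. The function name, signature, and `coherence_weight` are hypothetical, and the separation and discretization regularizers mentioned above are omitted.

```python
import torch
import torch.nn.functional as F

def contrastive_basin_loss(compat, labels, cantor_positions,
                           cantor_prototypes, coherence_weight=0.1):
    """Hypothetical sketch of a contrastive geometric basin objective.

    compat:            [B, C]  geometric compatibility scores (higher = better fit)
    labels:            [B]     ground-truth class indices
    cantor_positions:  [B]     features mapped into [0, 1] by the Cantor stairs
    cantor_prototypes: [C]     learned scalar prototype per class
    """
    # Contrastive term: cross-entropy over geometrically meaningful scores,
    # raising the correct class's compatibility above all others.
    contrastive = F.cross_entropy(compat, labels)

    # Cantor coherence: pull each feature's staircase position toward its
    # class prototype. (Separation and discretization terms omitted.)
    coherence = (cantor_positions - cantor_prototypes[labels]).pow(2).mean()

    return contrastive + coherence_weight * coherence
```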
## 📊 Visualizations

The repository includes visualizations of the learned structure:
- Cantor prototype distributions (histograms per scale)
- Sorted prototype curves (showing the smooth manifold)
- Cross-scale analysis (mean, variance, geometric weights)

See `weights/{model_name}/{config.run_id}/` for generated plots.

## 📁 Repository Structure

```
weights/{model_name}/{config.run_id}/
├── best_model_acc{best_acc:.2f}.safetensors    # Model weights
├── best_model_acc{best_acc:.2f}_metadata.json  # Training metadata
├── train_config.json                           # Training configuration
├── training_history.json                       # Epoch-by-epoch history
├── cantor_prototypes_distribution.png          # Histogram analysis
├── cantor_prototypes_sorted.png                # Sorted manifold view
└── cantor_prototypes_cross_scale.png           # Cross-scale comparison

runs/{model_name}/{config.run_id}/
└── events.out.tfevents.*                       # TensorBoard logs
```

**Note**: Visualizations (*.png) are generated by running the probe script and should be copied to the weights directory before uploading to the Hub.

## 🔬 Research

This architecture demonstrates:

1. **Rapid learning** (70%+ after one epoch, comparable to FractalDavid)
2. **Geometric organization** (classes spread smoothly in Cantor space)
3. **Hierarchical strategy** (scales learn different geometric weightings)
4. **Emergent structure** (prototypes cluster naturally, even though alpha drifts away from 0.5 rather than converging to it)

The geometric constraints guide learning toward structured representations without explicit supervision of the geometric components.

## 📝 Citation

```bibtex
@software{geofractaldavid2025,
  title  = {GeoFractalDavid: Geometric Basin Classification},
  author = {AbstractPhil},
  year   = {2025},
  url    = {https://huggingface.co/{config.hf_repo}},
  note   = {Multi-scale geometric basin classifier with k-simplex structure}
}
```

## 📄 License

MIT License - See LICENSE file for details.

---

*Model trained on {datetime.now().strftime('%Y-%m-%d')}*
*Run ID: {config.run_id}*