---
language: en
license: mit
tags:
- image-classification
- imagenet
- geometric-basin
- cantor-coherence
- multi-scale
- geofractaldavid
datasets:
- imagenet-1k
metrics:
- accuracy
library_name: pytorch
model-index:
- name: GeoFractalDavid-Basin-k12
  results:
  - task:
      type: image-classification
    dataset:
      name: ImageNet-1K
      type: imagenet-1k
    metrics:
    - type: accuracy
      value: 71.40
      name: Validation Accuracy
---

# GeoFractalDavid-Basin-k12: Geometric Basin Classification

**GeoFractalDavid** classifies through geometric compatibility rather than cross-entropy over arbitrary linear weights. Features must "fit" geometric signatures: k-simplex shapes, Cantor positions, and hierarchical structure.

## 🎯 Performance

- **Best Validation Accuracy**: 71.40%
- **Epoch**: 10/10
- **Training Time**: 18m 45s

### Per-Scale Performance

- **Scale 384D**: 61.25%
- **Scale 512D**: 60.67%
- **Scale 768D**: 70.50%
- **Scale 1024D**: 51.69%
- **Scale 1280D**: 44.72%

## 🏗️ Architecture

**Model Type**: Multi-scale geometric basin classifier

**Core Components**:
- **Feature Dimension**: 512
- **Number of Classes**: 1000
- **k-Simplex Structure**: k=12 (13 vertices per class)
- **Scales**: [384, 512, 768, 1024, 1280]
- **Total Simplex Vertices**: 13,000

**Geometric Components**:
1. **Feature Similarity**: Cosine similarity to k-simplex centroids
2. **Cantor Coherence**: Distance to learned Cantor prototypes (alpha-normalized)
3. **Crystal Geometry**: Distance to the nearest simplex vertex

Each scale learns to weight these three components differently.

## 🔬 Learned Structure

### Alpha Convergence (Global Cantor Stairs)

The alpha parameter controls middle-interval weighting in the Cantor staircase.

- **Initial**: 0.3290
- **Final**: -0.0764
- **Change**: -0.4055
- **Converged to 0.5**: False

The Cantor staircase uses a soft triadic decomposition with a learnable alpha to map features into [0, 1] with fractal structure.

### Cantor Prototype Distribution

Each class has a learned scalar Cantor prototype; training pulls features toward their class's Cantor position. The learned distributions are nearly identical across scales:

| Scale | Mean | Std | Range |
|-------|------|-----|-------|
| 384D  | 0.0226 | 0.0784 | [-0.1377, 0.1894] |
| 512D  | 0.0226 | 0.0784 | [-0.1377, 0.1895] |
| 768D  | 0.0227 | 0.0784 | [-0.1373, 0.1897] |
| 1024D | 0.0226 | 0.0784 | [-0.1375, 0.1896] |
| 1280D | 0.0227 | 0.0784 | [-0.1375, 0.1898] |

The prototypes spread smoothly around their mean rather than collapsing into discrete bins, forming a continuous manifold.

### Geometric Weight Evolution

Each scale learns its own weighting of the three geometric components:

| Scale | Feature | Cantor | Crystal |
|-------|---------|--------|---------|
| 384D  | 0.929 | 0.020 | 0.051 |
| 512D  | 0.885 | 0.023 | 0.092 |
| 768D  | 0.996 | 0.001 | 0.003 |
| 1024D | 0.952 | 0.005 | 0.043 |
| 1280D | 0.411 | 0.003 | 0.587 |

**Pattern**: The four smaller scales rely almost entirely on feature similarity, while the largest scale (1280D) shifts most of its weight to crystal geometry. This hierarchical strategy emerges from training rather than being imposed.
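To make these pieces concrete, here is a minimal sketch, assuming a soft Cantor staircase with a learnable alpha on the middle interval and a softmax-normalized mix of the three components. It is illustrative only, not the `geovocab2` implementation: the function names (`soft_cantor_stairs`, `scale_compatibility`), the `component_logits` parameterization, the reduction of a feature vector to a scalar via its mean, and the exact distance-to-score mappings are all assumptions.

```python
import torch
import torch.nn.functional as F

def soft_cantor_stairs(x, alpha, depth=4, tau=0.25):
    """Hypothetical soft triadic decomposition. At each level a point is softly
    assigned to the left / middle / right third of its interval; right thirds
    contribute a binary digit of 1 and the middle interval is weighted by the
    learnable `alpha`, so alpha controls how the "forbidden" middle registers."""
    x = torch.sigmoid(x)                            # squash scalars to (0, 1)
    centers = torch.tensor([1 / 6, 1 / 2, 5 / 6])   # centers of the three thirds
    out, scale = torch.zeros_like(x), 0.5
    for _ in range(depth):
        p = (-(x.unsqueeze(-1) - centers).abs() / tau).softmax(dim=-1)
        out = out + scale * (alpha * p[..., 1] + p[..., 2])
        x = (3.0 * x) % 1.0                         # zoom into the current third
        scale = scale / 2
    return out                                      # fractal position in ~[0, 1]

def scale_compatibility(features, simplex_vertices, cantor_prototypes,
                        component_logits, alpha):
    """Hypothetical per-scale compatibility scores.

    features:          [B, D]       features projected to this scale
    simplex_vertices:  [C, k+1, D]  k-simplex vertices per class (k=12 -> 13)
    cantor_prototypes: [C]          learned scalar prototype per class
    component_logits:  [3]          learnable mixing weights over the components
    """
    B = features.shape[0]
    C, V, _ = simplex_vertices.shape
    centroids = simplex_vertices.mean(dim=1)        # [C, D]

    # 1. Feature similarity: cosine similarity to class centroids.
    feat = F.normalize(features, dim=-1) @ F.normalize(centroids, dim=-1).T  # [B, C]

    # 2. Cantor coherence: closeness of each feature's staircase position
    #    to each class's learned scalar prototype.
    pos = soft_cantor_stairs(features.mean(dim=-1), alpha)                   # [B]
    cantor = -(pos[:, None] - cantor_prototypes[None, :]).abs()              # [B, C]

    # 3. Crystal geometry: negative distance to the nearest simplex vertex.
    d = torch.cdist(features, simplex_vertices.reshape(C * V, -1))           # [B, C*V]
    crystal = -d.reshape(B, C, V).min(dim=-1).values                         # [B, C]

    # Convex combination; the card's reported per-scale weights sum to 1.
    w = component_logits.softmax(dim=0)
    return w[0] * feat + w[1] * cantor + w[2] * crystal

# Quick shape check with this card's dimensions (random, hypothetical tensors)
feats = torch.randn(8, 512)
verts = torch.randn(1000, 13, 512)                  # 1000 classes x 13 vertices
protos = torch.randn(1000) * 0.08
scores = scale_compatibility(feats, verts, protos,
                             component_logits=torch.zeros(3),
                             alpha=torch.tensor(0.33))
print(scores.shape)                                 # torch.Size([8, 1000])
```

The softmax over `component_logits` is chosen because the reported per-scale weights sum to 1; the repository may normalize differently.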
## 💻 Usage

```python
import torch
from safetensors.torch import load_file
from geovocab2.train.model.core.geo_fractal_david import GeoFractalDavid

# Load model (k and scales match this card's architecture)
model = GeoFractalDavid(
    feature_dim=512,
    num_classes=1000,
    k=12,
    scales=[384, 512, 768, 1024, 1280],
    alpha_init=0.5,
    tau=0.25
)
state_dict = load_file("weights/.../best_model_acc{best_acc:.2f}.safetensors")
model.load_state_dict(state_dict)
model.eval()

# Inference on precomputed CLIP features: [batch_size, 512]
with torch.no_grad():
    logits = model(features)            # [batch_size, 1000]
    predictions = logits.argmax(dim=-1)

# Inspect learned structure
print(f"Global Alpha: {model.cantor_stairs.alpha.item():.4f}")
geo_weights = model.get_geometric_weights()
cantor_dist = model.get_cantor_interval_distribution(sample_features)
```

## 🎓 Training Details

**Loss Function**: Contrastive Geometric Basin
- Primary: Maximize correct-class compatibility, minimize incorrect
- Regularization: Cantor coherence, separation, discretization

**Optimization**:
- Optimizer: AdamW with separate learning rates
  - Scales: {config.learning_rate}
  - Fusion weights: {config.learning_rate * 0.5}
  - Cantor stairs: {config.learning_rate * 0.1}
- Weight decay: {config.weight_decay}
- Gradient clipping: {config.gradient_clip}
- Scheduler: {config.scheduler_type}

**Data**:
- Dataset: ImageNet-1K CLIP features ({config.model_variant})
- Batch size: {config.batch_size}
- Training samples: 1,281,167
- Validation samples: 50,000

**Hub Upload**: Periodic (every {config.hub_upload_interval} epochs); end of training only if the interval is 0.

## 🔑 Key Innovation

**No Cross-Entropy on Arbitrary Weights**

Traditional: `cross_entropy(W @ features + b, labels)`
- W and b are arbitrary learned parameters

**Geometric Basin**: `contrastive_loss(compatibility_scores, labels)`
- Compatibility comes from geometric structure:
  - Feature ↔ simplex-centroid similarity
  - Feature ↔ Cantor-prototype coherence
  - Feature ↔ simplex-vertex distance
- Cross-entropy is applied to geometrically meaningful scores
- Structure is enforced through geometric regularization

Result: Classification emerges from geometric organization, not arbitrary mappings.
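As a rough illustration of this training signal, the sketch below assumes the objective is cross-entropy over the compatibility scores plus a Cantor-coherence pull toward class prototypes. The function name, signature, and `coherence_weight` are hypothetical, and the separation and discretization regularizers mentioned above are omitted.

```python
import torch
import torch.nn.functional as F

def contrastive_basin_loss(compat, labels, cantor_positions,
                           cantor_prototypes, coherence_weight=0.1):
    """Hypothetical sketch of a contrastive geometric basin objective.

    compat:            [B, C]  geometric compatibility scores (higher = better fit)
    labels:            [B]     ground-truth class indices
    cantor_positions:  [B]     features mapped into [0, 1] by the Cantor stairs
    cantor_prototypes: [C]     learned scalar prototype per class
    """
    # Contrastive term: cross-entropy over geometrically meaningful scores,
    # raising the correct class's compatibility above all others.
    contrastive = F.cross_entropy(compat, labels)

    # Cantor coherence: pull each feature's staircase position toward its
    # class prototype. (Separation and discretization terms omitted.)
    coherence = (cantor_positions - cantor_prototypes[labels]).pow(2).mean()

    return contrastive + coherence_weight * coherence
```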
## 📊 Visualizations

The repository includes visualizations of the learned structure:
- Cantor prototype distributions (histograms per scale)
- Sorted prototype curves (showing the smooth manifold)
- Cross-scale analysis (mean, variance, geometric weights)

See `weights/{model_name}/{config.run_id}/` for generated plots.

## 📁 Repository Structure

```
weights/{model_name}/{config.run_id}/
├── best_model_acc{best_acc:.2f}.safetensors    # Model weights
├── best_model_acc{best_acc:.2f}_metadata.json  # Training metadata
├── train_config.json                           # Training configuration
├── training_history.json                       # Epoch-by-epoch history
├── cantor_prototypes_distribution.png          # Histogram analysis
├── cantor_prototypes_sorted.png                # Sorted manifold view
└── cantor_prototypes_cross_scale.png           # Cross-scale comparison

runs/{model_name}/{config.run_id}/
└── events.out.tfevents.*                       # TensorBoard logs
```

**Note**: Visualizations (*.png) are generated by running the probe script and should be copied to the weights directory before uploading to the Hub.

## 🔬 Research

This architecture demonstrates:

1. **Rapid learning** (70%+ after one epoch, comparable to FractalDavid)
2. **Geometric organization** (classes spread smoothly in Cantor space)
3. **Hierarchical strategy** (scales learn different geometric weightings)
4. **Emergent structure** (prototypes cluster naturally, even though alpha drifts away from 0.5 rather than converging to it)

The geometric constraints guide learning toward structured representations without explicit supervision of the geometric components.

## 📝 Citation

```bibtex
@software{geofractaldavid2025,
  title  = {GeoFractalDavid: Geometric Basin Classification},
  author = {AbstractPhil},
  year   = {2025},
  url    = {https://huggingface.co/{config.hf_repo}},
  note   = {Multi-scale geometric basin classifier with k-simplex structure}
}
```

## 📄 License

MIT License - See LICENSE file for details.

---

*Model trained on {datetime.now().strftime('%Y-%m-%d')}*
*Run ID: {config.run_id}*