|
|
--- |
|
|
language: en |
|
|
license: mit |
|
|
tags: |
|
|
- image-classification |
|
|
- imagenet |
|
|
- geometric-basin |
|
|
- cantor-coherence |
|
|
- multi-scale |
|
|
- geofractaldavid |
|
|
datasets: |
|
|
- imagenet-1k |
|
|
metrics: |
|
|
- accuracy |
|
|
library_name: pytorch |
|
|
model-index: |
|
|
- name: GeoFractalDavid-Basin-k12 |
|
|
results: |
|
|
- task: |
|
|
type: image-classification |
|
|
dataset: |
|
|
name: ImageNet-1K |
|
|
type: imagenet-1k |
|
|
metrics: |
|
|
- type: accuracy |
|
|
value: 71.40 |
|
|
name: Validation Accuracy |
|
|
--- |
|
|
|
|
|
# GeoFractalDavid-Basin-k12: Geometric Basin Classification |
|
|
|
|
|
**GeoFractalDavid** achieves classification through geometric compatibility rather than cross-entropy. |
|
|
Features must "fit" geometric signatures: k-simplex shapes, Cantor positions, and hierarchical structure. |
|
|
|
|
|
## Performance
|
|
|
|
|
- **Best Validation Accuracy**: 71.40% |
|
|
- **Epoch**: 10/10 |
|
|
- **Training Time**: 18m 45s |
|
|
|
|
|
### Per-Scale Performance |
|
|
- **Scale 384D**: 61.25% |
|
|
- **Scale 512D**: 60.67% |
|
|
- **Scale 768D**: 70.50% |
|
|
- **Scale 1024D**: 51.69% |
|
|
- **Scale 1280D**: 44.72% |
|
|
|
|
|
|
|
|
## Architecture
|
|
|
|
|
**Model Type**: Multi-scale geometric basin classifier |
|
|
|
|
|
**Core Components**: |
|
|
- **Feature Dimension**: 512 |
|
|
- **Number of Classes**: 1000 |
|
|
- **k-Simplex Structure**: k=12 (13 vertices per class) |
|
|
- **Scales**: [384, 512, 768, 1024, 1280] |
|
|
- **Total Simplex Vertices**: 13,000 |
|
|
|
|
|
**Geometric Components**: |
|
|
1. **Feature Similarity**: Cosine similarity to k-simplex centroids |
|
|
2. **Cantor Coherence**: Distance to learned Cantor prototypes (alpha-normalized) |
|
|
3. **Crystal Geometry**: Distance to nearest simplex vertex |
|
|
|
|
|
Each scale learns to weight these components differently. |
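The actual implementation lives in `geovocab2`; purely as an illustration, with all tensor shapes and the mixing-weight format assumed, a per-class compatibility score combining the three components might look like:

```python
import torch
import torch.nn.functional as F

def compatibility_scores(features, centroids, cantor_protos, vertices,
                         cantor_pos, weights):
    """Illustrative sketch (not the repository's exact code).

    features:      [B, D]       input features
    centroids:     [C, D]       k-simplex centroid per class
    cantor_protos: [C]          scalar Cantor prototype per class
    cantor_pos:    [B]          Cantor-staircase position of each feature
    vertices:      [C, k+1, D]  simplex vertices per class
    weights:       (w_feat, w_cantor, w_crystal) learned mixing weights
    """
    w_feat, w_cantor, w_crystal = weights
    # 1) Feature similarity: cosine similarity to class centroids
    sim = F.normalize(features, dim=-1) @ F.normalize(centroids, dim=-1).T   # [B, C]
    # 2) Cantor coherence: closeness of the feature's Cantor position
    #    to each class's learned prototype (negative distance)
    coh = -torch.abs(cantor_pos[:, None] - cantor_protos[None, :])           # [B, C]
    # 3) Crystal geometry: negative distance to the nearest simplex vertex
    d = torch.cdist(features, vertices.reshape(-1, features.shape[-1]))      # [B, C*(k+1)]
    crystal = -d.reshape(features.shape[0], vertices.shape[0], -1).min(-1).values
    return w_feat * sim + w_cantor * coh + w_crystal * crystal               # [B, C]
```

The learned weights reported below then decide how much each term contributes at each scale.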
|
|
|
|
|
## π¬ Learned Structure |
|
|
|
|
|
### Alpha Convergence (Global Cantor Stairs) |
|
|
|
|
|
The alpha parameter controls middle-interval weighting in the Cantor staircase. |
|
|
|
|
|
- **Initial**: 0.3290 |
|
|
- **Final**: -0.0764 |
|
|
- **Change**: -0.4055 |
|
|
- **Converged to 0.5**: False |
|
|
|
|
|
The Cantor staircase uses soft triadic decomposition with learnable alpha to map |
|
|
features into [0,1] space with fractal structure. |
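One soft triadic step can be sketched as follows; the sigmoid-based soft assignment to the three thirds is an assumption for illustration, and the repository's actual formulation may differ:

```python
import torch

def soft_cantor_step(x, alpha, tau=0.25):
    """Illustrative soft triadic step on x in [0, 1].

    Each x is softly assigned to the left/middle/right thirds; alpha
    controls the middle-interval contribution, which the classical
    Cantor function maps flat to 1/2.
    """
    # Soft membership in the three thirds via sigmoids at 1/3 and 2/3
    left = torch.sigmoid((1 / 3 - x) / tau)
    right = torch.sigmoid((x - 2 / 3) / tau)
    mid = 1.0 - left - right
    # Classical staircase maps left third to [0, 1/2), right third to
    # (1/2, 1]; alpha tilts the otherwise-flat middle plateau.
    y = (left * (1.5 * x)
         + mid * (0.5 + alpha * (x - 0.5))
         + right * (1.5 * x - 0.5))
    return y.clamp(0.0, 1.0)
```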
|
|
|
|
|
### Cantor Prototype Distribution |
|
|
|
|
|
Each class has a learned scalar Cantor prototype. The model pulls features toward |
|
|
their class's Cantor position. |
|
|
|
|
|
**Scale 384D**: |
|
|
- Mean: 0.0226 |
|
|
- Std: 0.0784 |
|
|
- Range: [-0.1377, 0.1894] |
|
|
|
|
|
**Scale 512D**: |
|
|
- Mean: 0.0226 |
|
|
- Std: 0.0784 |
|
|
- Range: [-0.1377, 0.1895] |
|
|
|
|
|
**Scale 768D**: |
|
|
- Mean: 0.0227 |
|
|
- Std: 0.0784 |
|
|
- Range: [-0.1373, 0.1897] |
|
|
|
|
|
**Scale 1024D**: |
|
|
- Mean: 0.0226 |
|
|
- Std: 0.0784 |
|
|
- Range: [-0.1375, 0.1896] |
|
|
|
|
|
**Scale 1280D**: |
|
|
- Mean: 0.0227 |
|
|
- Std: 0.0784 |
|
|
- Range: [-0.1375, 0.1898] |
|
|
|
|
|
|
|
|
The prototypes cluster tightly around their mean (~0.023) with a smooth spread across roughly [-0.14, 0.19].

This creates a continuous manifold rather than discrete bins.
|
|
|
|
|
### Geometric Weight Evolution |
|
|
|
|
|
Each scale learns optimal weights for combining geometric components: |
|
|
|
|
|
**Scale 384D**: Feature=0.929, Cantor=0.020, Crystal=0.051 |
|
|
**Scale 512D**: Feature=0.885, Cantor=0.023, Crystal=0.092 |
|
|
**Scale 768D**: Feature=0.996, Cantor=0.001, Crystal=0.003 |
|
|
**Scale 1024D**: Feature=0.952, Cantor=0.005, Crystal=0.043 |
|
|
**Scale 1280D**: Feature=0.411, Cantor=0.003, Crystal=0.587 |
|
|
|
|
|
|
|
|
**Pattern**: Most scales rely almost entirely on feature similarity; only the largest scale (1280D) shifts substantial weight onto crystal geometry.
|
|
This hierarchical strategy emerges from training. |
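Per-scale scores are then fused into a single prediction. Assuming softmax-normalized learnable fusion weights (the training section below mentions a separate learning rate for fusion weights; the exact mechanism is not documented here), a sketch:

```python
import torch

def fuse_scales(per_scale_logits, fusion_logits):
    """Combine per-scale compatibility scores with learned fusion weights.

    per_scale_logits: list of [B, C] tensors, one per scale
    fusion_logits:    [S] learnable parameters, softmax-normalized
    """
    w = torch.softmax(fusion_logits, dim=0)
    # Weighted sum over scales -> fused [B, C] logits
    return sum(wi * li for wi, li in zip(w, per_scale_logits))
```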
|
|
|
|
|
## Usage
|
|
|
|
|
```python |
|
|
import torch |
|
|
from safetensors.torch import load_file |
|
|
from geovocab2.train.model.core.geo_fractal_david import GeoFractalDavid |
|
|
|
|
|
# Load model |
|
|
model = GeoFractalDavid(
    feature_dim=512,
    num_classes=1000,
    k=12,
    scales=[384, 512, 768, 1024, 1280],
    alpha_init=0.5,
    tau=0.25
)
|
|
|
|
|
state_dict = load_file("weights/.../best_model_acc71.40.safetensors")
|
|
model.load_state_dict(state_dict) |
|
|
model.eval() |
|
|
|
|
|
# Inference (features: [batch_size, 512] precomputed CLIP image features)
with torch.no_grad():
    logits = model(features)  # [batch_size, 1000]
    predictions = logits.argmax(dim=-1)
|
|
|
|
|
# Inspect learned structure |
|
|
print(f"Global Alpha: {model.cantor_stairs.alpha.item():.4f}")
|
|
geo_weights = model.get_geometric_weights() |
|
|
cantor_dist = model.get_cantor_interval_distribution(sample_features) |
|
|
``` |
|
|
|
|
|
## Training Details
|
|
|
|
|
**Loss Function**: Contrastive Geometric Basin |
|
|
- Primary: Maximize correct class compatibility, minimize incorrect |
|
|
- Regularization: Cantor coherence, separation, discretization |
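As an illustration of this loss family only: the coefficient names and the exact form of the separation term below are assumptions, and the discretization term is omitted.

```python
import torch
import torch.nn.functional as F

def basin_loss(scores, labels, cantor_pos, cantor_protos,
               lambda_coh=0.1, lambda_sep=0.01):
    """Sketch of a contrastive geometric basin loss.

    scores:        [B, C] geometric compatibility scores
    labels:        [B]    class indices
    cantor_pos:    [B]    Cantor position of each feature
    cantor_protos: [C]    learned per-class Cantor prototypes
    """
    # Primary: cross-entropy over compatibility scores (maximizes the
    # correct class's compatibility relative to incorrect classes)
    primary = F.cross_entropy(scores, labels)
    # Cantor coherence: pull each feature toward its class prototype
    coh = (cantor_pos - cantor_protos[labels]).pow(2).mean()
    # Separation: push class prototypes apart from one another
    diffs = cantor_protos[:, None] - cantor_protos[None, :]
    sep = torch.exp(-diffs.abs()).mean()
    return primary + lambda_coh * coh + lambda_sep * sep
```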
|
|
|
|
|
**Optimization**: |
|
|
- Optimizer: AdamW with separate learning rates |
|
|
- Scales: {config.learning_rate} |
|
|
- Fusion weights: {config.learning_rate * 0.5} |
|
|
- Cantor stairs: {config.learning_rate * 0.1} |
|
|
- Weight decay: {config.weight_decay} |
|
|
- Gradient clipping: {config.gradient_clip} |
|
|
- Scheduler: {config.scheduler_type} |
|
|
|
|
|
**Data**: |
|
|
- Dataset: ImageNet-1K CLIP features ({config.model_variant}) |
|
|
- Batch size: {config.batch_size} |
|
|
- Training samples: 1,281,167 |
|
|
- Validation samples: 50,000 |
|
|
|
|
|
**Hub Upload**: periodic (every {config.hub_upload_interval} epochs) when the interval is set; otherwise at end of training only
|
|
|
|
|
## Key Innovation
|
|
|
|
|
**No Cross-Entropy on Arbitrary Weights** |
|
|
|
|
|
Traditional: `cross_entropy(W @ features + b, labels)` |
|
|
- W and b are arbitrary learned parameters |
|
|
|
|
|
**Geometric Basin**: `contrastive_loss(compatibility_scores, labels)` |
|
|
- Compatibility from geometric structure: |
|
|
  - Feature → Simplex centroid similarity

  - Feature → Cantor prototype coherence

  - Feature → Simplex vertex distance
|
|
- Cross-entropy applied to geometrically meaningful scores |
|
|
- Structure enforced through geometric regularization |
|
|
|
|
|
Result: Classification emerges from geometric organization, not arbitrary mappings. |
|
|
|
|
|
## Visualizations
|
|
|
|
|
The repository includes visualizations of learned structure: |
|
|
- Cantor prototype distributions (histograms per scale) |
|
|
- Sorted prototype curves (showing smooth manifold) |
|
|
- Cross-scale analysis (mean, variance, geometric weights) |
|
|
|
|
|
See `weights/{model_name}/{config.run_id}/` for generated plots. |
|
|
|
|
|
## Repository Structure
|
|
|
|
|
```
weights/{model_name}/{config.run_id}/
├── best_model_acc{best_acc:.2f}.safetensors     # Model weights
├── best_model_acc{best_acc:.2f}_metadata.json   # Training metadata
├── train_config.json                            # Training configuration
├── training_history.json                        # Epoch-by-epoch history
├── cantor_prototypes_distribution.png           # Histogram analysis
├── cantor_prototypes_sorted.png                 # Sorted manifold view
└── cantor_prototypes_cross_scale.png            # Cross-scale comparison

runs/{model_name}/{config.run_id}/
└── events.out.tfevents.*                        # TensorBoard logs
```
|
|
|
|
|
**Note**: Visualizations (*.png) are generated by running the probe script and should be |
|
|
copied to the weights directory before uploading to Hub. |
|
|
|
|
|
## Research
|
|
|
|
|
This architecture demonstrates: |
|
|
1. **Rapid learning** (70%+ after 1 epoch, comparable to FractalDavid) |
|
|
2. **Geometric organization** (classes spread smoothly in Cantor space) |
|
|
3. **Hierarchical strategy** (scales learn different geometric weightings) |
|
|
4. **Emergent structure** (alpha drifts well away from its initialization, and prototypes cluster naturally)
|
|
|
|
|
The geometric constraints guide learning toward structured representations |
|
|
without explicit supervision of the geometric components. |
|
|
|
|
|
## Citation
|
|
|
|
|
```bibtex
@software{geofractaldavid2025,
  title  = {GeoFractalDavid: Geometric Basin Classification},
  author = {AbstractPhil},
  year   = {2025},
  url    = {https://huggingface.co/{config.hf_repo if config.hf_repo else 'MODEL_REPO'}},
  note   = {Multi-scale geometric basin classifier with k-simplex structure}
}
```
|
|
|
|
|
## License
|
|
|
|
|
MIT License - See LICENSE file for details. |
|
|
|
|
|
--- |
|
|
|
|
|
*Model trained on {datetime.now().strftime('%Y-%m-%d')}* |
|
|
*Run ID: {config.run_id}* |
|
|
|