	---
	tags:
	- vision
	- image-classification
	- fractal-positional-encoding
	- geometric-deep-learning
	- devil-staircase
	- simplex-geometry
	license: mit
	---

	# ViT-Beatrix: Fractal PE + Geometric Simplex Vision Transformer

	This repository contains Vision Transformers integrating Devil's Staircase positional encoding
	with geometric simplex features for vision tasks.

	## Key Features

	- Fractal Positional Encoding: Devil's Staircase multi-scale position embeddings
	- Geometric Simplex Features: k-simplex vertex computations from Cantor measure
	- SimplexFactory Initialization: Pre-initialized simplices with geometrically meaningful shapes (regular/random/uniform)
	- Adaptive Augmentation: Progressive augmentation escalation to prevent overfitting
	- Beatrix Formula Suite: Flow alignment, hierarchical coherence, and multi-scale consistency losses

	### Simplex Initialization

	Rather than initializing simplices arbitrarily, the model uses `SimplexFactory` to create geometrically valid starting configurations:

	- Regular (default): All edges equal length, perfectly balanced symmetric structure
	- Random: QR decomposition ensuring affine independence
	- Uniform: Hypercube sampling with perturbations

	Regular simplices are the default: their symmetric, equal-edge structure is intended to give the model a stable starting point for learning geometric features.
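
To make "all edges equal length" concrete, here is a minimal sketch of one standard construction of a regular k-simplex: the k+1 standard basis vectors of R^(k+1) are all sqrt(2) apart, so rescaling them gives any desired edge length. This does not reproduce `SimplexFactory`; the helper names `regular_simplex` and `edge_lengths` are hypothetical, and treating `scale` as the edge length is an assumption.

```python
import itertools
import math


def regular_simplex(k: int, scale: float = 1.0) -> list[list[float]]:
    """Return k+1 vertices in R^(k+1) with every edge of length `scale`."""
    # Standard basis vectors are sqrt(2) apart, so rescale by scale / sqrt(2).
    coord = scale / math.sqrt(2.0)
    return [[coord if i == j else 0.0 for j in range(k + 1)] for i in range(k + 1)]


def edge_lengths(vertices: list[list[float]]) -> list[float]:
    """Euclidean length of every edge (every pair of vertices)."""
    return [math.dist(a, b) for a, b in itertools.combinations(vertices, 2)]


vertices = regular_simplex(k=4, scale=1.0)  # 4-simplex: 5 vertices
lengths = edge_lengths(vertices)            # 10 edges, all approximately 1.0
```

Checking that every entry of `lengths` matches `scale` is a quick sanity test for any regular-simplex initializer.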

	### Adaptive Augmentation System

	The trainer includes an adaptive augmentation system that monitors the gap between train and validation accuracy and progressively enables more augmentation:

	1. Baseline: RandomCrop + RandomHorizontalFlip
	2. Stage 1: + ColorJitter
	3. Stage 2: + RandomRotation
	4. Stage 3: + RandomAffine
	5. Stage 4: + RandomErasing
	6. Stage 5: + AutoAugment (CIFAR policy)
	7. Stage 6: Enable Mixup (α=0.2)
	8. Stage 7: Enable CutMix (α=1.0) - Final stage

	When train accuracy exceeds validation accuracy by 2% or more, the system automatically escalates to the next augmentation stage.
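
The escalation rule above can be sketched as a small state machine. This is an illustration only, not the trainer's code: the class name `AugmentationEscalator` is hypothetical, and it assumes accuracies are fractions in [0, 1], that "2%" means an absolute gap of 0.02, and that the check runs once per epoch.

```python
STAGES = [
    "Baseline: RandomCrop + RandomHorizontalFlip",
    "Stage 1: + ColorJitter",
    "Stage 2: + RandomRotation",
    "Stage 3: + RandomAffine",
    "Stage 4: + RandomErasing",
    "Stage 5: + AutoAugment (CIFAR policy)",
    "Stage 6: Enable Mixup (alpha=0.2)",
    "Stage 7: Enable CutMix (alpha=1.0)",
]


class AugmentationEscalator:
    """Advance one augmentation stage whenever the accuracy gap is too large."""

    def __init__(self, gap_threshold: float = 0.02) -> None:
        self.gap_threshold = gap_threshold  # assumed absolute gap (0.02 = 2%)
        self.stage = 0                      # index into STAGES

    def update(self, train_acc: float, val_acc: float) -> str:
        """Call once per epoch; returns the description of the active stage."""
        gap = train_acc - val_acc
        if gap >= self.gap_threshold and self.stage < len(STAGES) - 1:
            self.stage += 1
        return STAGES[self.stage]


escalator = AugmentationEscalator()
active = escalator.update(train_acc=0.30, val_acc=0.29)  # small gap: stays at baseline
active = escalator.update(train_acc=0.50, val_acc=0.45)  # large gap: moves to Stage 1
```

The stage index never decreases in this sketch, matching the "progressive" description; whether the real trainer can step back down is not stated here.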

	## Available Models (Best Checkpoints Only)

	| Model Name | Training Session | Accuracy | Epoch | Weights Path | Logs Path |
	|------------|------------------|----------|-------|--------------|-----------|
	| beatrix-cifar100 | 20251007_182851 | 0.5819 | 42 | `weights/beatrix-cifar100/20251007_182851` | `N/A` |
	| beatrix-simplex4-patch4-512d-flow | 20251008_115206 | 0.5674 | 87 | `weights/beatrix-simplex4-patch4-512d-flow/20251008_115206` | `logs/beatrix-simplex4-patch4-512d-flow/20251008_115206` |
	| beatrix-simplex7-patch4-256d-ce | 20251008_034231 | 0.5372 | 77 | `weights/beatrix-simplex7-patch4-256d-ce/20251008_034231` | `logs/beatrix-simplex7-patch4-256d-ce/20251008_034231` |
	| beatrix-simplex7-patch4-256d | 20251008_020048 | 0.5291 | 89 | `weights/beatrix-simplex7-patch4-256d/20251008_020048` | `logs/beatrix-simplex7-patch4-256d/20251008_020048` |
	| beatrix-cifar100 | 20251007_215344 | 0.5161 | 41 | `weights/beatrix-cifar100/20251007_215344` | `logs/beatrix-cifar100/20251007_215344` |
	| beatrix-cifar100 | 20251007_195812 | 0.4701 | 42 | `weights/beatrix-cifar100/20251007_195812` | `logs/beatrix-cifar100/20251007_195812` |
	| beatrix-cifar100 | 20251008_002950 | 0.4363 | 49 | `weights/beatrix-cifar100/20251008_002950` | `logs/beatrix-cifar100/20251008_002950` |
	| beatrix-cifar100 | 20251007_203741 | 0.4324 | 40 | `weights/beatrix-cifar100/20251007_203741` | `logs/beatrix-cifar100/20251007_203741` |
	| beatrix-simplex7-patch4-45d | 20251008_010524 | 0.2917 | 95 | `weights/beatrix-simplex7-patch4-45d/20251008_010524` | `logs/beatrix-simplex7-patch4-45d/20251008_010524` |
	| beatrix-4simplex-45d | 20251007_231008 | 0.2916 | 85 | `weights/beatrix-4simplex-45d/20251007_231008` | `logs/beatrix-4simplex-45d/20251007_231008` |
	| beatrix-cifar100 | 20251007_193112 | 0.2802 | 10 | `weights/beatrix-cifar100/20251007_193112` | `N/A` |
	| beatrix-4simplex-45d | 20251008_001147 | 0.1382 | 10 | `weights/beatrix-4simplex-45d/20251008_001147` | `logs/beatrix-4simplex-45d/20251008_001147` |


	## Latest Updated Model: beatrix-simplex4-patch4-512d-flow (Session: 20251008_115206)

	### Model Details

	- Architecture: Vision Transformer with fractal positional encoding
	- Dataset: CIFAR-100 (100 classes)
	- Embedding Dimension: 512
	- Depth: 8 layers
	- Patch Size: 4x4
	- PE Levels: 12
	- Simplex Dimension: 4-simplex
	- Simplex Initialization: regular (scale=1.0)

	### Training Details

	- Training Session: 20251008_115206
	- Best Accuracy: 0.5674
	- Epochs Trained: 87
	- Batch Size: 512
	- Learning Rate: 0.0001
	- Adaptive Augmentation: Enabled

	### Loss Configuration

	- Task Loss Weight: 0.5
	- Flow Alignment Weight: 1.0
	- Coherence Weight: 0.3
	- Multi-Scale Weight: 0.2
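
As a sketch of how such weights are commonly applied, the snippet below forms a weighted sum of the four loss terms. The README does not state how the trainer combines them, so the linear combination, the function name `combine_losses`, the dictionary keys, and the example loss values are all assumptions.

```python
# Weights copied from the list above; the key names are hypothetical.
LOSS_WEIGHTS = {
    "task": 0.5,
    "flow_alignment": 1.0,
    "coherence": 0.3,
    "multi_scale": 0.2,
}


def combine_losses(losses: dict[str, float], weights: dict[str, float] = LOSS_WEIGHTS) -> float:
    """Weighted sum of per-term losses (assumed combination rule)."""
    missing = set(weights) - set(losses)
    if missing:
        raise KeyError(f"missing loss terms: {sorted(missing)}")
    return sum(weights[name] * losses[name] for name in weights)


# Made-up per-term values, used only to exercise the function.
example_losses = {"task": 2.0, "flow_alignment": 1.0, "coherence": 1.0, "multi_scale": 1.0}
total = combine_losses(example_losses)
```

With these weights the flow alignment term counts twice as much as the classification (task) loss per unit of loss value.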

	### TensorBoard Logs

	Training logs are available in the repository at:
	```
	logs/beatrix-simplex4-patch4-512d-flow/20251008_115206
	```

	To view them locally:
	```bash
	# Clone the repo
	git clone https://huggingface.co/AbstractPhil/vit-beatrix

	# View logs in TensorBoard
	tensorboard --logdir vit-beatrix/logs/beatrix-simplex4-patch4-512d-flow/20251008_115206
	```

	## Usage

	### Installation

	For Google Colab:
	```python
	# Install for Colab
	# Remove any previously installed copy first (IPython allows `!` commands inside blocks)
	try:
	    !pip uninstall -qy geometricvocab
	except Exception:
	    pass

	!pip install -q git+https://github.com/AbstractEyes/lattice_vocabulary.git
	```

	For local environments:
	```bash
	# install the repo into your environment
	pip install -q git+https://github.com/AbstractEyes/lattice_vocabulary.git
	```

	### Loading Models

	```python
	import json

	import torch
	from huggingface_hub import hf_hub_download
	from safetensors.torch import load_file

	from geovocab2.train.model.core.vit_beatrix import SimplifiedGeometricClassifier

	# Download and view manifest to see all available models
	manifest_path = hf_hub_download(
	    repo_id="AbstractPhil/vit-beatrix",
	    filename="manifest.json",
	)

	with open(manifest_path, "r") as f:
	    manifest = json.load(f)

	# List all available models sorted by accuracy (highest first)
	for key, info in sorted(manifest.items(), key=lambda x: x[1]["accuracy"], reverse=True):
	    print(f"{info['model_name']} ({info['timestamp']}): {info['accuracy']:.4f}")

	# Download weights for the latest training session of beatrix-simplex4-patch4-512d-flow
	weights_path = hf_hub_download(
	    repo_id="AbstractPhil/vit-beatrix",
	    filename="weights/beatrix-simplex4-patch4-512d-flow/20251008_115206/model.safetensors",
	)

	# Build the model with the configuration listed under Model Details
	model = SimplifiedGeometricClassifier(
	    num_classes=100,
	    img_size=32,
	    embed_dim=512,
	    depth=8,
	)

	# Load weights and switch to evaluation mode
	state_dict = load_file(weights_path)
	model.load_state_dict(state_dict)
	model.eval()

	# Inference on a placeholder batch; replace with preprocessed CIFAR-100 images
	images = torch.randn(1, 3, 32, 32)
	with torch.no_grad():
	    output = model(images)
	```
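
Because several model names have more than one training session, it can help to pick the best session per model before downloading. The sketch below does this on a small in-memory stand-in for `manifest.json`. The entry fields (`model_name`, `timestamp`, `accuracy`) follow the loading code above, but the manifest keys, the helper name `best_session_per_model`, and the assumption that every session stores its weights as `model.safetensors` under `weights/<model>/<timestamp>/` are illustrative guesses.

```python
def best_session_per_model(manifest: dict) -> dict:
    """Keep only the highest-accuracy entry for each model name."""
    best = {}
    for info in manifest.values():
        name = info["model_name"]
        if name not in best or info["accuracy"] > best[name]["accuracy"]:
            best[name] = info
    return best


# Stand-in manifest with values copied from the table above (keys are hypothetical).
manifest = {
    "beatrix-cifar100/20251007_182851": {
        "model_name": "beatrix-cifar100", "timestamp": "20251007_182851", "accuracy": 0.5819,
    },
    "beatrix-cifar100/20251007_215344": {
        "model_name": "beatrix-cifar100", "timestamp": "20251007_215344", "accuracy": 0.5161,
    },
    "beatrix-simplex4-patch4-512d-flow/20251008_115206": {
        "model_name": "beatrix-simplex4-patch4-512d-flow", "timestamp": "20251008_115206", "accuracy": 0.5674,
    },
}

best = best_session_per_model(manifest)
for name, info in best.items():
    # Assumed layout: weights/<model_name>/<timestamp>/model.safetensors
    weights_file = f"weights/{name}/{info['timestamp']}/model.safetensors"
    print(f"{name}: {info['accuracy']:.4f} -> {weights_file}")
```

The resulting `weights_file` string can be passed as `filename` to `hf_hub_download`, provided the file actually exists at that path in the repository.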

	## Citation

	```bibtex
	@misc{vit-beatrix,
	  author = {AbstractPhil},
	  title  = {ViT-Beatrix: Fractal Positional Encoding with Geometric Simplices},
	  year   = {2025},
	  url    = {https://github.com/AbstractEyes/lattice_vocabulary}
	}
	```

	## License

	MIT License