---
tags:
- vision
- image-classification
- fractal-positional-encoding
- geometric-deep-learning
- devil-staircase
- simplex-geometry
license: mit
---

# ViT-Beatrix: Fractal PE + Geometric Simplex Vision Transformer

This repository contains Vision Transformers that integrate Devil's Staircase positional encoding
with geometric simplex features for vision tasks.

## Key Features

- **Fractal Positional Encoding**: Devil's Staircase multi-scale position embeddings
- **Geometric Simplex Features**: k-simplex vertex computations from Cantor measure
- **SimplexFactory Initialization**: Pre-initialized simplices with geometrically meaningful shapes (regular/random/uniform)
- **Adaptive Augmentation**: Progressive augmentation escalation to prevent overfitting
- **Beatrix Formula Suite**: Flow alignment, hierarchical coherence, and multi-scale consistency losses
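
For readers unfamiliar with the Devil's Staircase (the Cantor function), the sketch below shows one way to evaluate it at several truncation depths to get multi-scale position features. It is illustrative only and is not the encoding implemented in this repository. The names `cantor_function` and `staircase_features` are hypothetical, and the default of 12 levels only mirrors the "PE Levels" value listed in this card.

```python
def cantor_function(x: float, levels: int = 12) -> float:
    """Approximate the Cantor function on [0, 1] using `levels` ternary digits."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    if x == 1.0:
        return 1.0
    value, scale = 0.0, 0.5
    for _ in range(levels):
        x *= 3.0
        digit = int(x)  # next ternary digit of x
        x -= digit
        if digit == 1:
            # x lies in a removed middle third, where the function is flat
            return value + scale
        value += scale * (digit // 2)  # ternary digit 2 maps to binary digit 1
        scale *= 0.5
    return value


def staircase_features(index: int, num_positions: int, levels: int = 12) -> list[float]:
    """Hypothetical multi-scale feature: the staircase truncated at depths 1..levels."""
    x = index / max(num_positions - 1, 1)  # normalize the position to [0, 1]
    return [cantor_function(x, depth) for depth in range(1, levels + 1)]
```

For example, `staircase_features(10, 64)` would give a 12-value vector for the eleventh of 64 patch positions, with shallow depths capturing coarse position and deeper ones refining it.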

### Simplex Initialization

Instead of random initialization, the model uses **SimplexFactory** to create geometrically sound starting configurations:

- **Regular** (default): All edges equal length, perfectly balanced symmetric structure
- **Random**: QR decomposition ensuring affine independence
- **Uniform**: Hypercube sampling with perturbations

Regular simplices provide the most stable and mathematically meaningful initialization, giving the model a better starting point for learning geometric features.
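
To make the "all edges equal length" property concrete, here is a minimal sketch of one standard construction of a regular k-simplex: take the standard basis vectors in R^(k+1), center them, and rescale. This is not SimplexFactory's implementation. `regular_simplex` and `edge_lengths` are hypothetical helpers, and treating the scale as the edge length is an assumption.

```python
import itertools
import math


def regular_simplex(k: int, edge_length: float = 1.0) -> list[list[float]]:
    """Return the k+1 vertices of a regular k-simplex centered at the origin.

    The vertices are embedded in R^(k+1); assumes scale means edge length.
    """
    n = k + 1
    factor = edge_length / math.sqrt(2.0)  # basis vectors are sqrt(2) apart
    centroid = 1.0 / n                      # each coordinate of the centroid
    return [
        [factor * ((1.0 if i == j else 0.0) - centroid) for j in range(n)]
        for i in range(n)
    ]


def edge_lengths(vertices: list[list[float]]) -> list[float]:
    """Pairwise Euclidean distances between all vertices."""
    return [math.dist(a, b) for a, b in itertools.combinations(vertices, 2)]
```

Calling `edge_lengths(regular_simplex(4))` yields ten distances for the 4-simplex, all equal to the requested edge length up to floating-point error.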

### Adaptive Augmentation System

The trainer includes an adaptive augmentation system that monitors the gap between train and validation accuracy and progressively enables more augmentation:

1. **Baseline**: RandomCrop + RandomHorizontalFlip
2. **Stage 1**: + ColorJitter
3. **Stage 2**: + RandomRotation
4. **Stage 3**: + RandomAffine
5. **Stage 4**: + RandomErasing
6. **Stage 5**: + AutoAugment (CIFAR policy)
7. **Stage 6**: Enable Mixup (α=0.2)
8. **Stage 7**: Enable CutMix (α=1.0), the final stage

When train accuracy exceeds validation accuracy by 2% or more, the system automatically escalates to the next augmentation stage.
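
The escalation rule can be sketched as a small state machine. This is a hedged illustration rather than the trainer's actual code: `AugmentationScheduler` and `STAGES` are hypothetical names, and the threshold of `0.02` assumes that "2%" means two percentage points on a 0-1 accuracy scale.

```python
from dataclasses import dataclass

# Stage descriptions follow the list above; index 0 is the baseline.
STAGES = [
    "RandomCrop + RandomHorizontalFlip",
    "+ ColorJitter",
    "+ RandomRotation",
    "+ RandomAffine",
    "+ RandomErasing",
    "+ AutoAugment (CIFAR policy)",
    "Mixup (alpha=0.2)",
    "CutMix (alpha=1.0)",
]


@dataclass
class AugmentationScheduler:
    gap_threshold: float = 0.02  # assumption: accuracies are fractions in [0, 1]
    stage: int = 0

    def update(self, train_acc: float, val_acc: float) -> bool:
        """Advance one stage if the gap reaches the threshold; return True if escalated."""
        overfitting = (train_acc - val_acc) >= self.gap_threshold
        if overfitting and self.stage < len(STAGES) - 1:
            self.stage += 1
            return True
        return False
```

A training loop would call `scheduler.update(train_acc, val_acc)` once per epoch and rebuild its transform pipeline whenever the call returns `True`.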

## Available Models (Best Checkpoints Only)

| Model Name | Training Session | Accuracy | Epoch | Weights Path | Logs Path |
|------------|------------------|----------|-------|--------------|-----------|
| beatrix-cifar100 | 20251007_182851 | 0.5819 | 42 | `weights/beatrix-cifar100/20251007_182851` | `N/A` |
| beatrix-simplex4-patch4-512d-flow | 20251008_115206 | 0.5674 | 87 | `weights/beatrix-simplex4-patch4-512d-flow/20251008_115206` | `logs/beatrix-simplex4-patch4-512d-flow/20251008_115206` |
| beatrix-simplex7-patch4-256d-ce | 20251008_034231 | 0.5372 | 77 | `weights/beatrix-simplex7-patch4-256d-ce/20251008_034231` | `logs/beatrix-simplex7-patch4-256d-ce/20251008_034231` |
| beatrix-simplex7-patch4-256d | 20251008_020048 | 0.5291 | 89 | `weights/beatrix-simplex7-patch4-256d/20251008_020048` | `logs/beatrix-simplex7-patch4-256d/20251008_020048` |
| beatrix-cifar100 | 20251007_215344 | 0.5161 | 41 | `weights/beatrix-cifar100/20251007_215344` | `logs/beatrix-cifar100/20251007_215344` |
| beatrix-cifar100 | 20251007_195812 | 0.4701 | 42 | `weights/beatrix-cifar100/20251007_195812` | `logs/beatrix-cifar100/20251007_195812` |
| beatrix-cifar100 | 20251008_002950 | 0.4363 | 49 | `weights/beatrix-cifar100/20251008_002950` | `logs/beatrix-cifar100/20251008_002950` |
| beatrix-cifar100 | 20251007_203741 | 0.4324 | 40 | `weights/beatrix-cifar100/20251007_203741` | `logs/beatrix-cifar100/20251007_203741` |
| beatrix-simplex7-patch4-45d | 20251008_010524 | 0.2917 | 95 | `weights/beatrix-simplex7-patch4-45d/20251008_010524` | `logs/beatrix-simplex7-patch4-45d/20251008_010524` |
| beatrix-4simplex-45d | 20251007_231008 | 0.2916 | 85 | `weights/beatrix-4simplex-45d/20251007_231008` | `logs/beatrix-4simplex-45d/20251007_231008` |
| beatrix-cifar100 | 20251007_193112 | 0.2802 | 10 | `weights/beatrix-cifar100/20251007_193112` | `N/A` |
| beatrix-4simplex-45d | 20251008_001147 | 0.1382 | 10 | `weights/beatrix-4simplex-45d/20251008_001147` | `logs/beatrix-4simplex-45d/20251008_001147` |

## Latest Updated Model: beatrix-simplex4-patch4-512d-flow (Session: 20251008_115206)

### Model Details

- **Architecture**: Vision Transformer with fractal positional encoding
- **Dataset**: CIFAR-100 (100 classes)
- **Embedding Dimension**: 512
- **Depth**: 8 layers
- **Patch Size**: 4x4
- **PE Levels**: 12
- **Simplex Dimension**: 4-simplex
- **Simplex Initialization**: regular (scale=1.0)

### Training Details

- **Training Session**: 20251008_115206
- **Best Accuracy**: 0.5674
- **Epochs Trained**: 87
- **Batch Size**: 512
- **Learning Rate**: 0.0001
- **Adaptive Augmentation**: Enabled

### Loss Configuration

- Task Loss Weight: 0.5
- Flow Alignment Weight: 1.0
- Coherence Weight: 0.3
- Multi-Scale Weight: 0.2
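
As a rough illustration of how weights like these are commonly used, the sketch below combines per-term losses into a single scalar by a weighted sum. That the trainer combines its terms this way is an assumption, not something this card states. `LOSS_WEIGHTS` and `combine_losses` are hypothetical names, and the loss values in the usage note are made up.

```python
# Weights copied from the configuration above; the dictionary keys are hypothetical.
LOSS_WEIGHTS = {
    "task": 0.5,
    "flow_alignment": 1.0,
    "coherence": 0.3,
    "multi_scale": 0.2,
}


def combine_losses(losses: dict[str, float], weights: dict[str, float] = LOSS_WEIGHTS) -> float:
    """Weighted sum of individual loss terms (assumed combination rule)."""
    missing = set(weights) - set(losses)
    if missing:
        raise KeyError(f"missing loss terms: {sorted(missing)}")
    return sum(weights[name] * losses[name] for name in weights)
```

With placeholder values such as `{"task": 2.0, "flow_alignment": 0.1, "coherence": 0.5, "multi_scale": 0.5}`, the task term contributes 1.0 and the three geometric terms together contribute 0.35.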

### TensorBoard Logs

Training logs are available in the repository at:
```
logs/beatrix-simplex4-patch4-512d-flow/20251008_115206
```

To view them locally:
```bash
# Clone the repo
git clone https://huggingface.co/AbstractPhil/vit-beatrix

# View logs in TensorBoard
tensorboard --logdir vit-beatrix/logs/beatrix-simplex4-patch4-512d-flow/20251008_115206
```

## Usage

### Installation

For Google Colab:
```python
# Install for Colab: remove any previous install, then install from GitHub
try:
    !pip uninstall -qy geometricvocab
except Exception:
    pass

!pip install -q git+https://github.com/AbstractEyes/lattice_vocabulary.git
```

For local environments:
```bash
# Install the repo into your environment
pip install -q git+https://github.com/AbstractEyes/lattice_vocabulary.git
```

### Loading Models

```python
import json

import torch
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

from geovocab2.train.model.core.vit_beatrix import SimplifiedGeometricClassifier

# Download and view manifest to see all available models
manifest_path = hf_hub_download(
    repo_id="AbstractPhil/vit-beatrix",
    filename="manifest.json"
)

with open(manifest_path, 'r') as f:
    manifest = json.load(f)

# List all available models sorted by accuracy
for key, info in sorted(manifest.items(), key=lambda x: x[1]['accuracy'], reverse=True):
    print(f"{info['model_name']} ({info['timestamp']}): {info['accuracy']:.4f}")

# Download weights for the latest training session of beatrix-simplex4-patch4-512d-flow
weights_path = hf_hub_download(
    repo_id="AbstractPhil/vit-beatrix",
    filename="weights/beatrix-simplex4-patch4-512d-flow/20251008_115206/model.safetensors"
)

# Load model
model = SimplifiedGeometricClassifier(
    num_classes=100,
    img_size=32,
    embed_dim=512,
    depth=8
)

# Load weights
state_dict = load_file(weights_path)
model.load_state_dict(state_dict)
model.eval()

# Inference on a placeholder batch; replace with real preprocessed CIFAR-100 images
images = torch.randn(1, 3, 32, 32)
with torch.no_grad():
    output = model(images)
```

## Citation

```bibtex
@misc{vit-beatrix,
  author = {AbstractPhil},
  title = {ViT-Beatrix: Fractal Positional Encoding with Geometric Simplices},
  year = {2025},
  url = {https://github.com/AbstractEyes/lattice_vocabulary}
}
```

## License

MIT License