---
tags:
- vision
- image-classification
- fractal-positional-encoding
- geometric-deep-learning
- devil-staircase
- simplex-geometry
license: mit
---

# ViT-Beatrix: Fractal PE + Geometric Simplex Vision Transformer

This repository contains Vision Transformers integrating Devil's Staircase positional encoding with geometric simplex features for vision tasks.

## Key Features

- **Fractal Positional Encoding**: Devil's Staircase multi-scale position embeddings
- **Geometric Simplex Features**: k-simplex vertex computations from Cantor measure
- **SimplexFactory Initialization**: Pre-initialized simplices with geometrically meaningful shapes (regular/random/uniform)
- **Adaptive Augmentation**: Progressive augmentation escalation to prevent overfitting
- **Beatrix Formula Suite**: Flow alignment, hierarchical coherence, and multi-scale consistency losses

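The repository's actual encoding lives in the linked package, but the underlying idea can be illustrated independently: the Devil's Staircase is the Cantor function, which can be approximated by walking the base-3 digits of a normalized position. The sketch below is a hedged illustration, not the library's API; `cantor_function` and `fractal_pe` are hypothetical names, and the per-level dilation scheme in `fractal_pe` is one plausible way to obtain a multi-scale embedding from the staircase.

```python
def cantor_function(x: float, levels: int = 12) -> float:
    """Approximate the Devil's Staircase (Cantor function) at x in [0, 1).

    Walk the base-3 digits of x: digit 0 contributes binary digit 0,
    digit 2 contributes binary digit 1, and the walk stops at the first
    digit 1 (a flat step of the staircase), appending a final 1.
    """
    value, scale = 0.0, 0.5
    for _ in range(levels):
        x *= 3.0
        digit = int(x)
        x -= digit
        if digit == 1:                      # landed on a flat step
            value += scale
            break
        value += scale * (digit // 2)       # digit 0 -> 0, digit 2 -> 1
        scale *= 0.5
    return value


def fractal_pe(pos: int, num_pos: int, levels: int = 12) -> list:
    """Multi-scale sketch: evaluate the staircase at several dilations."""
    t = pos / max(num_pos - 1, 1)
    return [cantor_function((t * 3 ** k) % 1.0, levels) for k in range(levels)]
```

Each level re-reads the position at three times the previous resolution, so the resulting vector is self-similar across scales in the same spirit as the fractal PE described above.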
### Simplex Initialization

Instead of random initialization, the model uses **SimplexFactory** to create geometrically sound starting configurations:

- **Regular** (default): all edges of equal length; a perfectly balanced, symmetric structure
- **Random**: QR decomposition ensuring affine independence
- **Uniform**: hypercube sampling with perturbations

Regular simplices provide the most stable and mathematically meaningful initialization, giving the model a better starting point for learning geometric features.

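SimplexFactory's internals are not reproduced here, but the standard construction of a regular k-simplex is short enough to sketch: take the k+1 standard basis vectors of R^(k+1) (every pair is sqrt(2) apart), center them at the origin, and rescale so every edge has the desired length. The function name `regular_simplex` is illustrative, not the library's API.

```python
import numpy as np

def regular_simplex(k: int, scale: float = 1.0) -> np.ndarray:
    """Vertices of a regular k-simplex as a (k+1, k+1) array, all edges equal.

    Construction: the k+1 standard basis vectors of R^(k+1) are pairwise
    sqrt(2) apart; centering and rescaling sets every edge to `scale`.
    """
    verts = np.eye(k + 1)                 # basis vectors, pairwise dist sqrt(2)
    verts -= verts.mean(axis=0)           # center the simplex at the origin
    verts *= scale / np.sqrt(2.0)         # set edge length to `scale`
    return verts

# 7-simplex with unit edges, matching the "regular (scale=1.0)" config below
V = regular_simplex(7)
dists = np.linalg.norm(V[:, None] - V[None, :], axis=-1)
```

All off-diagonal entries of `dists` are equal, which is exactly the "perfectly balanced symmetric structure" property the regular option provides.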
### Adaptive Augmentation System

The trainer includes an intelligent augmentation system that monitors the train/validation accuracy gap and progressively enables stronger augmentation:

1. **Baseline**: RandomCrop + RandomHorizontalFlip
2. **Stage 1**: + ColorJitter
3. **Stage 2**: + RandomRotation
4. **Stage 3**: + RandomAffine
5. **Stage 4**: + RandomErasing
6. **Stage 5**: + AutoAugment (CIFAR policy)
7. **Stage 6**: Enable Mixup (α=0.2)
8. **Stage 7**: Enable CutMix (α=1.0) - Final stage

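For the later stages, Mixup blends pairs of images and their labels with a coefficient drawn from a Beta(α, α) distribution. The sketch below is a generic Mixup implementation under that standard formulation, not the trainer's actual code; the `mixup` function name and one-hot label convention are assumptions.

```python
import numpy as np

def mixup(images: np.ndarray, labels: np.ndarray, alpha: float = 0.2):
    """Mixup: convex-combine a batch with a shuffled copy of itself.

    `images` has shape (B, C, H, W); `labels` is one-hot, shape (B, classes).
    Returns the mixed batch and correspondingly mixed soft labels.
    """
    lam = np.random.beta(alpha, alpha)        # mixing coefficient in (0, 1)
    perm = np.random.permutation(len(images)) # pair each sample with another
    mixed_x = lam * images + (1.0 - lam) * images[perm]
    mixed_y = lam * labels + (1.0 - lam) * labels[perm]
    return mixed_x, mixed_y
```

With α=0.2 the Beta distribution concentrates near 0 and 1, so most mixed samples stay close to one of the two originals; CutMix at α=1.0 mixes far more aggressively.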
When train accuracy exceeds validation accuracy by 2% or more, the system automatically escalates to the next augmentation stage.

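The escalation rule can be sketched as a small stateful controller. This is a minimal illustration of the 2%-gap policy described above, not the trainer's implementation; `AugmentationScheduler` and `MAX_STAGE` are hypothetical names.

```python
MAX_STAGE = 7  # Stage 7 (CutMix) is the final stage in the list above

class AugmentationScheduler:
    """Escalate the augmentation stage when train acc outruns val acc by >= 2%."""

    def __init__(self, gap_threshold: float = 0.02):
        self.stage = 0                      # 0 = baseline augmentation
        self.gap_threshold = gap_threshold

    def update(self, train_acc: float, val_acc: float) -> int:
        """Call once per epoch; returns the stage to use next epoch."""
        if train_acc - val_acc >= self.gap_threshold and self.stage < MAX_STAGE:
            self.stage += 1                 # e.g. stage 1 adds ColorJitter
        return self.stage
```

Escalating one stage at a time means a persistently overfitting run walks through the whole ladder, while a well-regularized run stays at the baseline.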
## Available Models (Best Checkpoints Only)

| Model Name | Training Session | Accuracy | Epoch | Weights Path | Logs Path |
|------------|------------------|----------|-------|--------------|-----------|
| beatrix-cifar100 | 20251007_182851 | 0.5819 | 42 | `weights/beatrix-cifar100/20251007_182851` | `N/A` |
| beatrix-cifar100 | 20251007_215344 | 0.5161 | 41 | `weights/beatrix-cifar100/20251007_215344` | `logs/beatrix-cifar100/20251007_215344` |
| beatrix-cifar100 | 20251007_195812 | 0.4701 | 42 | `weights/beatrix-cifar100/20251007_195812` | `logs/beatrix-cifar100/20251007_195812` |
| beatrix-cifar100 | 20251008_002950 | 0.4363 | 49 | `weights/beatrix-cifar100/20251008_002950` | `logs/beatrix-cifar100/20251008_002950` |
| beatrix-cifar100 | 20251007_203741 | 0.4324 | 40 | `weights/beatrix-cifar100/20251007_203741` | `logs/beatrix-cifar100/20251007_203741` |
| beatrix-simplex7-patch4-45d | 20251008_010524 | 0.2917 | 95 | `weights/beatrix-simplex7-patch4-45d/20251008_010524` | `logs/beatrix-simplex7-patch4-45d/20251008_010524` |
| beatrix-4simplex-45d | 20251007_231008 | 0.2916 | 85 | `weights/beatrix-4simplex-45d/20251007_231008` | `logs/beatrix-4simplex-45d/20251007_231008` |
| beatrix-cifar100 | 20251007_193112 | 0.2802 | 10 | `weights/beatrix-cifar100/20251007_193112` | `N/A` |
| beatrix-4simplex-45d | 20251008_001147 | 0.1382 | 10 | `weights/beatrix-4simplex-45d/20251008_001147` | `logs/beatrix-4simplex-45d/20251008_001147` |
| beatrix-simplex7-patch4-256d | 20251008_020048 | 0.0552 | 0 | `weights/beatrix-simplex7-patch4-256d/20251008_020048` | `logs/beatrix-simplex7-patch4-256d/20251008_020048` |

## Most Recently Updated Model: beatrix-simplex7-patch4-256d (Session: 20251008_020048)

### Model Details

- **Architecture**: Vision Transformer with fractal positional encoding
- **Dataset**: CIFAR-100 (100 classes)
- **Embedding Dimension**: 256
- **Depth**: 22 layers
- **Patch Size**: 4x4
- **PE Levels**: 12
- **Simplex Dimension**: 7-simplex
- **Simplex Initialization**: regular (scale=1.0)

### Training Details

- **Training Session**: 20251008_020048
- **Best Accuracy**: 0.0552
- **Epochs Trained**: 0
- **Batch Size**: 512
- **Learning Rate**: 0.0001
- **Adaptive Augmentation**: Enabled

### Loss Configuration

- Task Loss Weight: 0.5
- Flow Alignment Weight: 0.5
- Coherence Weight: 0.3
- Multi-Scale Weight: 0.2

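These weights combine the Beatrix loss terms as a weighted sum. The sketch below shows that combination under the assumption of a simple linear mix; the key names and the `total_loss` helper are illustrative, and the individual loss terms themselves come from the Beatrix formula suite, which is not reproduced here.

```python
# Weights taken directly from the Loss Configuration above; key names are
# illustrative shorthand, not the trainer's actual config schema.
LOSS_WEIGHTS = {
    "task": 0.5,         # classification loss on the logits
    "flow": 0.5,         # flow alignment
    "coherence": 0.3,    # hierarchical coherence
    "multiscale": 0.2,   # multi-scale consistency
}

def total_loss(terms: dict) -> float:
    """Weighted sum of the per-term loss values listed above."""
    return sum(LOSS_WEIGHTS[name] * value for name, value in terms.items())
```

Note that the weights sum to 1.5 rather than 1.0, so the auxiliary geometric terms carry as much total weight as the task loss itself.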
### TensorBoard Logs

Training logs are available in the repository at:
```
logs/beatrix-simplex7-patch4-256d/20251008_020048
```

To view them locally:
```bash
# Clone the repo
git clone https://huggingface.co/AbstractPhil/vit-beatrix

# View logs in TensorBoard
tensorboard --logdir vit-beatrix/logs/beatrix-simplex7-patch4-256d/20251008_020048
```

## Usage

### Installation

For Google Colab:
```python
# Remove any previously installed copy, then install from GitHub.
# `!`-prefixed shell commands do not raise Python exceptions, so no
# try/except is needed around the uninstall on a fresh runtime.
!pip uninstall -qy geometricvocab
!pip install -q git+https://github.com/AbstractEyes/lattice_vocabulary.git
```

For local environments:
```bash
# Install the repo into your environment
pip install -q git+https://github.com/AbstractEyes/lattice_vocabulary.git
```

### Loading Models

```python
from geovocab2.train.model.vit_beatrix import SimplifiedGeometricClassifier
from safetensors.torch import load_file
from huggingface_hub import hf_hub_download
import json

# Download and view the manifest to see all available models
manifest_path = hf_hub_download(
    repo_id="AbstractPhil/vit-beatrix",
    filename="manifest.json"
)

with open(manifest_path, 'r') as f:
    manifest = json.load(f)

# List all available models, sorted by accuracy (best first)
for key, info in sorted(manifest.items(), key=lambda x: x[1]['accuracy'], reverse=True):
    print(f"{info['model_name']} ({info['timestamp']}): {info['accuracy']:.4f}")

# Download weights for the latest training session of beatrix-simplex7-patch4-256d
weights_path = hf_hub_download(
    repo_id="AbstractPhil/vit-beatrix",
    filename="weights/beatrix-simplex7-patch4-256d/20251008_020048/model.safetensors"
)

# Build the model with the architecture used for this checkpoint
model = SimplifiedGeometricClassifier(
    num_classes=100,
    img_size=32,
    embed_dim=256,
    depth=22
)

# Load weights
state_dict = load_file(weights_path)
model.load_state_dict(state_dict)
model.eval()

# Inference: `images` is a float tensor of shape (batch, 3, 32, 32)
output = model(images)
```

## Citation

```bibtex
@misc{vit-beatrix,
  author = {AbstractPhil},
  title = {ViT-Beatrix: Fractal Positional Encoding with Geometric Simplices},
  year = {2025},
  url = {https://github.com/AbstractEyes/lattice_vocabulary}
}
```

## License

MIT License