---
tags:
  - vision
  - image-classification
  - fractal-positional-encoding
  - geometric-deep-learning
  - devil-staircase
  - simplex-geometry
license: mit
---

# ViT-Beatrix: Fractal PE + Geometric Simplex Vision Transformer

This model combines Devil's Staircase (fractal) positional encoding with geometric simplex features for image classification. Trained on CIFAR-100.

## Model Details

  • Architecture: Vision Transformer with fractal positional encoding
  • Dataset: CIFAR-100 (100 classes)
  • Embedding Dimension: 512
  • Depth: 4 layers
  • Patch Size: 4x4
  • PE Levels: 12
  • Simplex Dimension: 5-simplex
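The "Devil's Staircase" PE listed above refers to the Cantor function, whose self-similar structure gives each of the 12 levels a progressively finer view of position. As a rough illustration only (the actual encoding in `vit_beatrix` is not documented here and may differ), one can build per-level features from partial sums of the Cantor function:

```python
import torch

def devils_staircase_pe(positions, num_levels=12):
    """Per-level partial sums of the Cantor ("devil's staircase") function.

    Hypothetical sketch of a fractal positional encoding; not the
    model's actual implementation.
    positions: float tensor in [0, 1), shape (N,)
    returns: (N, num_levels) features, one Cantor partial sum per level.
    """
    x = positions.clone()
    cantor = torch.zeros_like(x)
    halted = torch.zeros_like(x, dtype=torch.bool)
    levels = []
    for level in range(1, num_levels + 1):
        digit = torch.floor(x * 3).clamp(max=2.0)  # next ternary digit
        x = x * 3 - digit                          # remaining fraction
        # Cantor rule: ternary digit 2 -> binary 1; the first digit 1
        # -> binary 1, after which no further digits contribute.
        bit = (digit >= 1).float() * (~halted).float()
        cantor = cantor + bit / (2.0 ** level)
        halted = halted | (digit == 1)
        levels.append(cantor.clone())
    return torch.stack(levels, dim=-1)
```

Each patch position would map to one such feature vector, giving the transformer a multi-resolution positional signal.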

## Training

  • Dataset: CIFAR-100
  • Epochs: 2
  • Best Accuracy: 0.1820
  • Batch Size: 512
  • Learning Rate: 0.001

## Loss Components

  • Task Loss Weight: 1.0
  • Flow Alignment Weight: 0.5
  • Coherence Weight: 0.3
  • Multi-Scale Weight: 0.2
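The weights above suggest the training objective is a weighted sum of four terms. A minimal sketch of that combination, assuming each term arrives as a precomputed scalar (the term definitions themselves live in `geovocab2` and are not reproduced here):

```python
# Hypothetical combination of the listed loss terms; the individual
# losses are computed elsewhere and passed in as scalars.
LOSS_WEIGHTS = {
    "task": 1.0,
    "flow_alignment": 0.5,
    "coherence": 0.3,
    "multi_scale": 0.2,
}

def total_loss(terms):
    """terms: dict mapping loss name -> scalar (tensor or float)."""
    return sum(LOSS_WEIGHTS[name] * value for name, value in terms.items())
```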

## Usage

```python
from geovocab2.train.model.vit_beatrix import SimplifiedGeometricClassifier
from safetensors.torch import load_file

# Load model
model = SimplifiedGeometricClassifier(
    num_classes=100,  # CIFAR-100
    img_size=32,
    embed_dim=512,
    depth=4
)

# Load weights (renamed from model_best.safetensors to model.safetensors on the Hub)
state_dict = load_file("model.safetensors")
model.load_state_dict(state_dict)
model.eval()

# Inference (images: float tensor of shape [batch, 3, 32, 32])
output = model(images)
```
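The snippet above leaves `images` undefined. A sketch of preparing a CIFAR-100-shaped batch follows; the normalization statistics are the commonly used CIFAR-100 channel means and stds, which is an assumption since this card does not state the transforms used during training:

```python
import torch

# Dummy batch shaped like CIFAR-100 inputs: (batch, channels, height, width).
images = torch.randn(8, 3, 32, 32)

# Commonly used CIFAR-100 channel statistics (assumed, not confirmed
# to match this model's training pipeline).
mean = torch.tensor([0.5071, 0.4865, 0.4409]).view(1, 3, 1, 1)
std = torch.tensor([0.2673, 0.2564, 0.2762]).view(1, 3, 1, 1)
images = (images - mean) / std

# With the model loaded as above:
# logits = model(images)        # expected shape (8, 100)
# pred = logits.argmax(dim=-1)
```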

## Citation

```bibtex
@misc{vit-beatrix,
  author = {AbstractPhil},
  title = {ViT-Beatrix: Fractal Positional Encoding with Geometric Simplices},
  year = {2025},
  url = {https://github.com/AbstractEyes/lattice_vocabulary}
}
```

## License

MIT License