metadata
tags:
- image-classification
- cantor-fusion
- geometric-deep-learning
- safetensors
- vision-transformer
- warm-restarts
library_name: pytorch
datasets:
- cifar10
- cifar100
metrics:
- accuracy
vit-beans-v3
Geometric Deep Learning with Cantor Multihead Fusion + AdamW Warm Restarts
This repository contains multiple training runs of the Cantor fusion architecture, which combines pentachoron structures and geometric routing, trained with CosineAnnealingWarmRestarts for automatic exploration cycles.
Training Strategy: AdamW + Warm Restarts
This model uses AdamW with Cosine Annealing Warm Restarts (SGDR):
- Drop phase: LR decays from 0.0001 → 1e-07 over 40 epochs
- Restart phase: LR jumps back to 0.0001 to explore new regions
- Cycle multiplier: Each cycle is 1.5x longer than the previous one
- Intended benefits: each cycle alternates exploration (high LR) with exploitation (low LR), which can help the optimizer reach better minima and train more robustly
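The schedule described above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the repository's scheduler: the function name `warm_restart_lr` and its parameters are hypothetical, and it is written by hand because PyTorch's built-in `CosineAnnealingWarmRestarts` accepts only an integer `T_mult`, so a 1.5x cycle multiplier needs custom logic. The restart LR boost is left out here.

```python
import math


def warm_restart_lr(epoch, base_lr=1e-4, min_lr=1e-7, t0=40.0, t_mult=1.5):
    """Cosine-annealed LR with warm restarts and a fractional cycle multiplier.

    Hypothetical helper: defaults mirror the values quoted in this README.
    """
    cycle_start = 0.0
    cycle_len = t0
    # Walk forward until `epoch` falls inside the current cycle.
    while epoch >= cycle_start + cycle_len:
        cycle_start += cycle_len
        cycle_len *= t_mult
    # progress is 0.0 right after a restart and approaches 1.0 at the cycle end.
    progress = (epoch - cycle_start) / cycle_len
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for e in (0, 20, 39, 40, 70, 99, 100):
        print(f"epoch {e:>3}: lr = {warm_restart_lr(e):.3e}")
```

The LR sits at `base_lr` at epochs 0, 40, and 100 (the restarts) and is close to `min_lr` just before each of them.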
LR Boost at Restarts (NEW!)
This run uses restart_lr_mult = 1.25x:
- Normal restart: 1e-4 → 1e-7 → restart at 1e-4
- Boosted restart: 1e-4 → 1e-7 → restart at 1.25e-4 (1.25x)
- Creates wider exploration curves to escape solidified local minima
- Each restart provides a progressively stronger exploration boost
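The boost arithmetic can be illustrated with a small helper. This is a sketch only: `restart_peak_lr` is a hypothetical name, and the `compound` flag encodes an assumption. This README documents only the first restart (0.0001 × 1.25 = 0.000125); whether later restarts keep multiplying by 1.25 is inferred from the phrase "progressively stronger" and is not confirmed here.

```python
def restart_peak_lr(cycle_index, base_lr=1e-4, restart_lr_mult=1.25, compound=True):
    """Peak LR at the start of cycle `cycle_index` (0 = the initial cycle).

    compound=True  : ASSUMPTION, the boost is applied again at every restart.
    compound=False : the boost is applied once and then held constant.
    """
    if cycle_index == 0:
        return base_lr
    exponent = cycle_index if compound else 1
    return base_lr * restart_lr_mult ** exponent


if __name__ == "__main__":
    for k in range(4):
        print(f"cycle {k}: peak lr = {restart_peak_lr(k):.6e}")
```

Under either reading the first restart lands on 1.25e-4; the two readings only diverge from the second restart onward.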
Restart Schedule
Epochs 0-40: LR: 0.0001 → 1e-07 (first cycle)
Epoch 40: LR: RESTART to 0.000125
Epochs 40-100: LR: 0.000125 → 1e-07 (longer cycle)
...
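The cycle boundaries follow from the 40-epoch first cycle and the 1.5x multiplier. The helper below is a hypothetical sketch (the name `cycle_boundaries` is not from this repository) that reproduces the two cycles listed above. Any cycle beyond those is an extrapolation of the stated rule, not a value taken from the training logs.

```python
def cycle_boundaries(num_cycles, t0=40.0, t_mult=1.5):
    """Return (start_epoch, end_epoch) pairs for the first `num_cycles` cycles."""
    bounds = []
    start, length = 0.0, t0
    for _ in range(num_cycles):
        bounds.append((start, start + length))
        start += length
        length *= t_mult  # each cycle is t_mult times longer than the previous one
    return bounds


if __name__ == "__main__":
    for i, (start, end) in enumerate(cycle_boundaries(3)):
        print(f"cycle {i}: epochs {start:g}-{end:g}")
```

The first two pairs are (0, 40) and (40, 100), matching the schedule; the second cycle lasts 40 × 1.5 = 60 epochs.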
Current Run
Latest: cifar100_consciousness_ADAMW_WarmRestart_boost1.25x_20251122_025019
- Dataset: CIFAR100
- Fusion Mode: consciousness
- Optimizer: AdamW (adaptive moments)
- Scheduler: CosineAnnealingWarmRestarts
- Restart LR Mult: 1.25x
- Architecture: 4 blocks, 4 heads
- Simplex: 4-simplex (5 vertices)
Architecture
The Cantor Fusion architecture uses:
- Geometric Routing: Pentachoron (4-simplex, 5 vertices) structures for token routing
- Cantor Multihead Fusion: Multiple fusion heads with geometric attention
- Beatrix Consciousness Routing: Optional consciousness-aware token fusion
- SafeTensors Format: All model weights use SafeTensors (not pickle)
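For readers unfamiliar with the pentachoron mentioned in the list above: it is the 4-simplex, and its face counts follow from binomial coefficients. The snippet below is a generic combinatorial illustration, not code from this repository, and `simplex_face_counts` is a hypothetical name.

```python
from math import comb


def simplex_face_counts(n):
    """Number of k-dimensional faces of an n-simplex, for k = 0..n.

    A k-face is determined by choosing k + 1 of the n + 1 vertices.
    """
    return [comb(n + 1, k + 1) for k in range(n + 1)]


if __name__ == "__main__":
    vertices, edges, triangles, tetrahedra, cells = simplex_face_counts(4)
    print(vertices, edges, triangles, tetrahedra, cells)
```

For `n = 4` this gives 5 vertices, 10 edges, 10 triangular faces, 5 tetrahedral cells, and the single 4-dimensional body.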
Usage
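from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

# Download the checkpoint; replace YOUR_RUN_NAME with a run directory from this repo.
model_path = hf_hub_download(
    repo_id="AbstractPhil/vit-beans-v3",
    filename="runs/YOUR_RUN_NAME/checkpoints/best_model.safetensors",
)

# load_file returns a dict mapping parameter names to tensors.
state_dict = load_file(model_path)

# `model` must already be an instance of the matching Cantor fusion architecture
# (it is not constructed in this snippet).
model.load_state_dict(state_dict)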
Citation
@misc{vit_beans_v3,
author = {AbstractPhil},
title = {vit-beans-v3: Geometric Deep Learning with Warm Restarts},
year = {2025},
publisher = {HuggingFace},
url = {https://huggingface.co/AbstractPhil/vit-beans-v3}
}
Repository maintained by: @AbstractPhil
Latest update: 2025-11-22 02:50:22