metadata
tags:
  - image-classification
  - cantor-fusion
  - geometric-deep-learning
  - safetensors
  - vision-transformer
  - warm-restarts
library_name: pytorch
datasets:
  - cifar10
  - cifar100
metrics:
  - accuracy

vit-beans-v3

Geometric Deep Learning with Cantor Multihead Fusion + AdamW Warm Restarts

This repository contains multiple training runs of a Cantor fusion architecture that combines pentachoron structures with geometric routing. Training uses CosineAnnealingWarmRestarts, so the learning rate cycles automatically between exploration (high LR) and refinement (low LR).

Training Strategy: AdamW + Warm Restarts

This model uses AdamW with Cosine Annealing Warm Restarts (SGDR):

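  • Drop phase: LR decays from 0.0001 to 1e-07 along a cosine curve over 20 epochs
  • Restart phase: LR jumps back to 0.0001 to explore new regions
  • Cycle multiplier: 1x, so every cycle has the same length as the previous one (20 epochs)
  • Benefits: Each restart explores at a high LR and each decay refines at a low LR, which can help the optimizer escape poor minima and makes training less sensitive to a single LR choice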
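
The schedule described above can be reproduced with PyTorch's built-in scheduler. This is a minimal sketch, not the repository's training script: the `nn.Linear` model is a hypothetical stand-in, weight decay is left at the PyTorch default because the run's value is not stated here, and `T_0=20`, `T_mult=1`, `eta_min=1e-7` are read off the description above.

```python
from torch import nn
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts

# Hypothetical stand-in model; the real Cantor fusion network is not shown here.
model = nn.Linear(8, 2)

# Peak LR 1e-4, floor 1e-7, 20-epoch cycles that do not grow (T_mult=1).
optimizer = AdamW(model.parameters(), lr=1e-4)
scheduler = CosineAnnealingWarmRestarts(optimizer, T_0=20, T_mult=1, eta_min=1e-7)

lrs = []
for epoch in range(40):
    lrs.append(optimizer.param_groups[0]["lr"])  # LR used during this epoch
    # ... one epoch of training would run here ...
    optimizer.step()   # no-op here because no gradients were computed
    scheduler.step()   # advance the schedule once per epoch

print(f"epoch  0: {lrs[0]:.3e}")   # start of the first cycle (peak LR)
print(f"epoch 19: {lrs[19]:.3e}")  # last epoch of the first cycle (near the floor)
print(f"epoch 20: {lrs[20]:.3e}")  # restart, back at the peak LR
```

Stepping the scheduler once per epoch matches the epoch-based description; stepping per batch with fractional epochs is also supported by this scheduler but is not assumed here.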

Restart Schedule

Epochs 0-20:   LR: 0.0001 → 1e-07 (first cycle)
Epoch 20:      LR: RESTART to 0.0001 🔄
Epochs 20-40:  LR: 0.0001 → 1e-07 (second cycle, same length)
...
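
For reference, the learning rate inside a cycle follows the standard SGDR cosine rule. The values substituted below are taken from the schedule above and assume the scheduler is configured exactly as described, with $T_{\text{cur}}$ counting epochs since the last restart and $T_i$ the cycle length.

```latex
\eta_t = \eta_{\min} + \tfrac{1}{2}\,(\eta_{\max} - \eta_{\min})
         \left(1 + \cos\!\left(\frac{\pi\, T_{\text{cur}}}{T_i}\right)\right),
\qquad \eta_{\max} = 10^{-4},\quad \eta_{\min} = 10^{-7},\quad T_i = 20

% Worked example at the midpoint of a cycle (T_cur = 10, so the cosine term is 0):
\eta_{10} = 10^{-7} + \tfrac{1}{2}\,(10^{-4} - 10^{-7})\,(1 + 0) = 5.005 \times 10^{-5}
```

At $T_{\text{cur}} = 0$ the cosine term equals 1 and the rate is at its peak of $10^{-4}$; it approaches $10^{-7}$ as $T_{\text{cur}}$ nears $T_i$, after which $T_{\text{cur}}$ resets to 0.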

Current Run

Latest: cifar100_consciousness_ADAMW_WarmRestart_20251120_030614

  • Dataset: CIFAR100
  • Fusion Mode: consciousness
  • Optimizer: AdamW (adaptive moments)
  • Scheduler: CosineAnnealingWarmRestarts
  • Architecture: 4 blocks, 32 heads
  • Simplex: 4-simplex (5 vertices)

Architecture

The Cantor Fusion architecture uses:

  • Geometric Routing: Pentachoron (4-simplex, 5 vertices) structures for token routing
  • Cantor Multihead Fusion: Multiple fusion heads with geometric attention
  • Beatrix Consciousness Routing: Optional consciousness-aware token fusion
  • SafeTensors Format: All model weights use SafeTensors (not pickle)
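
To make the pentachoron terminology concrete, the sketch below builds a regular 4-simplex and counts its faces. It illustrates the geometry only and is not the repository's routing code; the coordinate choice (standard basis vectors of R^5) and the helper `count_faces` are assumptions made for this example.

```python
from itertools import combinations
from math import dist, isclose, sqrt

# The five standard basis vectors of R^5 form a regular 4-simplex (pentachoron):
# five vertices, every pair separated by the same distance sqrt(2).
DIM = 5
vertices = [tuple(1.0 if i == j else 0.0 for j in range(DIM)) for i in range(DIM)]


def count_faces(num_vertices: int, k: int) -> int:
    """Count k-dimensional faces: every choice of k + 1 vertices spans one."""
    return sum(1 for _ in combinations(range(num_vertices), k + 1))


edge_lengths = [dist(a, b) for a, b in combinations(vertices, 2)]
all_edges_equal = all(isclose(length, sqrt(2.0)) for length in edge_lengths)

print("vertices:  ", len(vertices))
print("edges:     ", count_faces(len(vertices), 1))
print("triangles: ", count_faces(len(vertices), 2))
print("tetrahedra:", count_faces(len(vertices), 3))
print("all edges equal:", all_edges_equal)
```

A 4-simplex is the four-dimensional analogue of a tetrahedron, which is why it has 5 vertices rather than 4.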

Usage

from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

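# Download the checkpoint; replace YOUR_RUN_NAME with a run directory from this repo.
model_path = hf_hub_download(
    repo_id="AbstractPhil/vit-beans-v3",
    filename="runs/YOUR_RUN_NAME/checkpoints/best_model.safetensors",
)

# load_file returns a plain dict mapping parameter names to tensors.
state_dict = load_file(model_path)

# `model` must already be an instance of the matching architecture;
# its definition is not included in this snippet.
model.load_state_dict(state_dict)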
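
To find a value for YOUR_RUN_NAME, one option is to list the repository's files and keep the runs that contain a best checkpoint. This is a sketch that assumes the `runs/<run>/checkpoints/best_model.safetensors` layout from the snippet above; both helper functions are hypothetical names introduced here, and only `list_repo_files` comes from `huggingface_hub`.

```python
def run_names_with_best_checkpoint(repo_files):
    """Return sorted run names that contain checkpoints/best_model.safetensors."""
    prefix = "runs/"
    suffix = "/checkpoints/best_model.safetensors"
    names = set()
    for path in repo_files:
        if path.startswith(prefix) and path.endswith(suffix):
            run = path[len(prefix):-len(suffix)]
            # Skip nested or empty names that do not match the assumed layout.
            if run and "/" not in run:
                names.add(run)
    return sorted(names)


def list_remote_runs(repo_id="AbstractPhil/vit-beans-v3"):
    """Query the Hub for the repository's file list (needs network access)."""
    from huggingface_hub import list_repo_files  # imported lazily on purpose

    return run_names_with_best_checkpoint(list_repo_files(repo_id))


# Offline demonstration with made-up paths (not real files from the repository).
sample_files = [
    "README.md",
    "runs/example_run_a/checkpoints/best_model.safetensors",
    "runs/example_run_a/config.json",
    "runs/example_run_b/checkpoints/last_model.safetensors",
]
print(run_names_with_best_checkpoint(sample_files))
```

Calling `list_remote_runs()` performs the same filtering on the live repository; any returned name can be substituted for YOUR_RUN_NAME in the download call.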

Citation

@misc{vit_beans_v3,
  author = {AbstractPhil},
  title = {vit-beans-v3: Geometric Deep Learning with Warm Restarts},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/AbstractPhil/vit-beans-v3}
}

Repository maintained by: @AbstractPhil

Latest update: 2025-11-20 03:06:18