---
license: apache-2.0
---
# MobiusNet
A vision architecture built on continuous topological principles, replacing traditional activations with wave-based interference gating.
## Overview
MobiusNet introduces a fundamentally different approach to neural network design:
- **MobiusLens**: Wave superposition as a gating mechanism, replacing standard activations (ReLU, GELU)
- **Thirds Mask**: Cantor-inspired fractal channel suppression for regularization
- **Continuous Topology**: Layers sample a continuous manifold via the `t` parameter, not discrete units
- **Twist Rotations**: Smooth rotation through representation space across network depth
- **Integrator**: An optional Conv → BN → GELU block before the classifier head, used experimentally to add a GELU nonlinearity on top of the wave-based gating
## Performance
| Model | Params | GFLOPs | Tiny ImageNet |
|-------|--------|--------|---------------|
| MobiusNet-Base | 33.7M | 2.69 | TBD |
## Installation
```bash
pip install torch torchvision safetensors huggingface_hub tensorboard tqdm
```
## Quick Start
### Training
```python
from mobius_trainer_full import train_tiny_imagenet

model, best_acc = train_tiny_imagenet(
    preset='mobius_base',
    epochs=200,
    lr=3e-4,
    batch_size=128,
    use_integrator=True,
    data_dir='./data/tiny-imagenet-200',
    output_dir='./outputs',
    hf_repo='AbstractPhil/mobiusnet',
    save_every_n_epochs=10,
    upload_every_n_epochs=10,
)
```
### Continue from Checkpoint
```python
# From local directory
model, best_acc = train_tiny_imagenet(
    preset='mobius_base',
    epochs=200,
    continue_from="./outputs/checkpoints/mobius_base_tiny_imagenet/20240101_120000",
)

# From HuggingFace (auto-downloads)
model, best_acc = train_tiny_imagenet(
    preset='mobius_base',
    epochs=200,
    continue_from="checkpoints/mobius_base_tiny_imagenet/20240101_120000",
)
```
### Inference
```python
import torch
from safetensors.torch import load_file
from mobius_trainer_full import MobiusNet, PRESETS

# Load model
config = PRESETS['mobius_base']
model = MobiusNet(num_classes=200, use_integrator=True, **config)
state_dict = load_file("best_model.safetensors")
model.load_state_dict(state_dict)
model.eval()

# Inference (image_tensor: a preprocessed float batch of shape [N, 3, H, W])
with torch.no_grad():
    logits = model(image_tensor)
    pred = logits.argmax(1)
```
## Model Presets
| Preset | Channels | Depths | ~Params |
|--------|----------|--------|---------|
| `mobius_tiny_s` | (64, 128, 256) | (2, 2, 2) | 500K |
| `mobius_tiny_m` | (64, 128, 256, 512, 768) | (2, 2, 4, 2, 2) | 11M |
| `mobius_tiny_l` | (96, 192, 384, 768) | (3, 3, 3, 3) | 8M |
| `mobius_base` | (128, 256, 512, 768, 1024) | (2, 2, 2, 2, 2) | 33.7M |
## Architecture
```
              Input
                │
                ▼
┌─────────────────────────────────┐
│  Stem (Conv → BN)               │
└─────────────────────────────────┘
                │
                ▼
┌─────────────────────────────────┐
│  Stage 1-N                      │
│  ┌───────────────────────────┐  │
│  │ MobiusConvBlock (×depth)  │  │
│  │  ├─ Depthwise-Sep Conv    │  │
│  │  ├─ BatchNorm             │  │
│  │  ├─ MobiusLens (wave gate)│  │
│  │  ├─ Thirds Mask           │  │
│  │  └─ Learned Residual      │  │
│  └───────────────────────────┘  │
│  Downsample (stride-2 conv)     │
└─────────────────────────────────┘
                │
                ▼
┌─────────────────────────────────┐
│  Integrator (Conv → BN → GELU)  │  ← Task collapse
└─────────────────────────────────┘
                │
                ▼
┌─────────────────────────────────┐
│  Pool → Linear → Classes        │
└─────────────────────────────────┘
```
## Core Components
### MobiusLens
Wave-based gating mechanism with three interference paths:
```python
L = wave(phase_l, drift_l) # Left path (+1 drift)
M = wave(phase_m, drift_m) # Middle path (0 drift, ghost)
R = wave(phase_r, drift_r) # Right path (-1 drift)
# Interference
xor_comp = abs(L + R - 2*L*R) # Differentiable XOR
and_comp = L * R # Differentiable AND
# Gating
gate = weighted_sum(L, M, R) * interference_blend
output = input * sigmoid(layernorm(gate))
```
The middle path (M) acts as a "ghost", present but diminished, maintaining gradient continuity while biasing information flow toward the L/R edges (Cantor-like structure).
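The two interference terms can be checked numerically. The sketch below is not the MobiusLens implementation: it only evaluates the formulas on scalar values in [0, 1], and `soft_xor` and `soft_and` are hypothetical helper names. At the binary corners the expressions reproduce the Boolean XOR and AND truth tables, and between them they vary smoothly.

```python
# Illustrative only: scalar versions of the interference terms, not the model code.


def soft_xor(l: float, r: float) -> float:
    """|L + R - 2*L*R|: equals Boolean XOR when l and r are exactly 0 or 1."""
    return abs(l + r - 2.0 * l * r)


def soft_and(l: float, r: float) -> float:
    """L * R: equals Boolean AND when l and r are exactly 0 or 1."""
    return l * r


if __name__ == "__main__":
    # Four binary corners plus one interior point.
    for l, r in [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5)]:
        print(f"L={l} R={r} xor={soft_xor(l, r)} and={soft_and(l, r)}")
```

Because both expressions are polynomials in L and R (up to the absolute value), gradients flow through them, which is what makes them usable inside a learned gate.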
### Thirds Mask
Rotating channel suppression inspired by Cantor set construction:
```
Layer 0: suppress channels [0:C/3]
Layer 1: suppress channels [C/3:2C/3]
Layer 2: suppress channels [2C/3:C]
Layer 3: back to [0:C/3]
```
Forces redundancy and prevents co-adaptation across channel groups.
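The rotation schedule above can be sketched as a per-channel multiplier. This is a minimal illustration under two assumptions that the README does not confirm: suppression is modeled as hard zeroing (the actual model may only attenuate), and group boundaries use integer division when `C` is not a multiple of 3. The name `thirds_mask` is hypothetical.

```python
from typing import List


def thirds_mask(layer_idx: int, channels: int) -> List[float]:
    """Return a per-channel multiplier: 0.0 for the suppressed third, 1.0 elsewhere."""
    third = layer_idx % 3  # rotate through the three channel groups
    start = (third * channels) // 3
    end = ((third + 1) * channels) // 3
    return [0.0 if start <= c < end else 1.0 for c in range(channels)]


if __name__ == "__main__":
    # With 6 channels, each layer suppresses a different pair; layer 3 wraps around.
    for layer in range(4):
        print(layer, thirds_mask(layer, 6))
```

Multiplying a feature map channel-wise by this mask would leave two thirds of the channels active in every layer, with a different third silenced each time.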
### Continuous Topology
Each layer samples a continuous manifold:
```python
import math

t = layer_idx / (total_layers - 1)  # runs from 0 (first layer) to 1 (last layer)
twist_in_angle = t * math.pi
twist_out_angle = -t * math.pi
scales = scale_range[0] + t * scale_span
```
Adding layers therefore samples the same underlying structure more finely.
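To make the "finer sampling" point concrete, the sketch below tabulates the per-layer values for networks of different depth. It is an illustration, not the model's code: `layer_schedule` is a hypothetical helper, the default `scale_range` of `(0.5, 2.0)` is an arbitrary placeholder, `scale_span` is assumed to be `scale_range[1] - scale_range[0]`, and a single-layer network is assumed to sit at `t = 0`.

```python
import math
from typing import Dict, List, Tuple


def layer_schedule(
    total_layers: int,
    scale_range: Tuple[float, float] = (0.5, 2.0),  # placeholder values
) -> List[Dict[str, float]]:
    """Compute t, the twist angles, and the scale for every layer index."""
    scale_span = scale_range[1] - scale_range[0]  # assumed definition
    schedule = []
    for layer_idx in range(total_layers):
        # Assumption: a one-layer network uses t = 0 to avoid dividing by zero.
        t = layer_idx / (total_layers - 1) if total_layers > 1 else 0.0
        schedule.append({
            "t": t,
            "twist_in_angle": t * math.pi,
            "twist_out_angle": -t * math.pi,
            "scale": scale_range[0] + t * scale_span,
        })
    return schedule


if __name__ == "__main__":
    for depth in (3, 5):
        print(depth, [round(row["t"], 2) for row in layer_schedule(depth)])
```

The `t` values of the 3-layer schedule (0, 0.5, 1) all reappear in the 5-layer schedule, which adds the intermediate points 0.25 and 0.75 on the same curve.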
## Checkpoints
Saved to: `checkpoints/{variant}_{dataset}/{timestamp}/`
```
├── config.json
├── best_accuracy.json
├── final_accuracy.json
├── checkpoints/
│   ├── checkpoint_epoch_0010.pt
│   ├── checkpoint_epoch_0010.safetensors
│   ├── best_model.pt
│   ├── best_model.safetensors
│   ├── final_model.pt
│   └── final_model.safetensors
└── tensorboard/
```
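Given this layout, a small helper can locate the newest run and its best weights. The sketch below is not part of the trainer. `latest_run_dir` and `best_weights` are hypothetical names, and it assumes that runs live under `{output_dir}/checkpoints/` (as the example paths in this README suggest) and that timestamps follow the `20240101_120000` pattern, so sorting by name is sorting by time.

```python
from pathlib import Path
from typing import Optional


def latest_run_dir(output_dir: str, variant: str, dataset: str) -> Optional[Path]:
    """Return the newest {timestamp} directory for a variant/dataset pair, or None."""
    base = Path(output_dir) / "checkpoints" / f"{variant}_{dataset}"
    if not base.is_dir():
        return None
    # Assumption: timestamp names sort chronologically (YYYYMMDD_HHMMSS).
    runs = sorted(p for p in base.iterdir() if p.is_dir())
    return runs[-1] if runs else None


def best_weights(run_dir: Path) -> Path:
    """Path of the best safetensors weights inside a run directory."""
    return run_dir / "checkpoints" / "best_model.safetensors"


if __name__ == "__main__":
    run = latest_run_dir("./outputs", "mobius_base", "tiny_imagenet")
    print(best_weights(run) if run else "no runs found")
```

The resulting path can be passed to `load_file` in the inference example, and the run directory to `continue_from`.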
## TensorBoard
Monitor training:
```bash
tensorboard --logdir ./outputs/checkpoints
```
Tracks:
- Loss, train/val accuracy
- Per-layer lens parameters (omega, alpha, twist angles, L/M/R weights)
- Residual weights
- Weight histograms
## Data Setup
### Tiny ImageNet
```bash
wget http://cs231n.stanford.edu/tiny-imagenet-200.zip
unzip tiny-imagenet-200.zip -d ./data/
```
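After unpacking, a quick sanity check can catch a partial download before a long training run starts. The sketch below assumes the layout of the standard archive (`train/<class_id>/` directories and `val/val_annotations.txt`). `summarize_tiny_imagenet` is a hypothetical helper, and the trainer's own data loader may validate the data differently.

```python
from pathlib import Path
from typing import Dict, Union


def summarize_tiny_imagenet(root: str) -> Dict[str, Union[int, bool]]:
    """Count class folders under train/ and check for the validation annotations."""
    base = Path(root)
    train = base / "train"
    class_dirs = [p for p in train.iterdir() if p.is_dir()] if train.is_dir() else []
    return {
        "train_classes": len(class_dirs),  # 200 expected for the full dataset
        "has_val_annotations": (base / "val" / "val_annotations.txt").is_file(),
    }


if __name__ == "__main__":
    print(summarize_tiny_imagenet("./data/tiny-imagenet-200"))
```

A `train_classes` value other than 200, or a missing annotations file, usually means the archive was not fully extracted.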
## License
Apache 2.0
## Citation
```bibtex
@misc{mobiusnet2026,
title={MobiusNet: Wave-Based Topological Vision Architecture},
author={AbstractPhil},
year={2026},
url={https://huggingface.co/AbstractPhil/mobiusnet}
}
```