---
license: apache-2.0
---
# MobiusNet

A vision architecture built on continuous topological principles, replacing traditional activations with wave-based interference gating.

## Overview

MobiusNet departs from conventional convolutional designs in the following ways:

- **MobiusLens**: Wave superposition as a gating mechanism, replacing standard activations (ReLU, GELU)
- **Thirds Mask**: Cantor-inspired fractal channel suppression for regularization
- **Continuous Topology**: Layers sample a continuous manifold via the `t` parameter, not discrete units
- **Twist Rotations**: Smooth rotation through representation space across network depth
- **Integrator**: A Conv → BN → GELU block placed before the classifier head (toggled with `use_integrator`). It is experimental and adds a conventional GELU nonlinearity on top of the wave-based gating.

## Performance

| Model | Params | GFLOPs | Tiny ImageNet |
|-------|--------|--------|---------------|
| MobiusNet-Base | 33.7M | 2.69 | TBD |

## Installation

```bash
pip install torch torchvision safetensors huggingface_hub tensorboard tqdm
```

## Quick Start

### Training

```python
from mobius_trainer_full import train_tiny_imagenet

model, best_acc = train_tiny_imagenet(
    preset='mobius_base',
    epochs=200,
    lr=3e-4,
    batch_size=128,
    use_integrator=True,
    data_dir='./data/tiny-imagenet-200',
    output_dir='./outputs',
    hf_repo='AbstractPhil/mobiusnet',
    save_every_n_epochs=10,
    upload_every_n_epochs=10,
)
```

### Continue from Checkpoint

```python
from mobius_trainer_full import train_tiny_imagenet

# From local directory
model, best_acc = train_tiny_imagenet(
    preset='mobius_base',
    epochs=200,
    continue_from="./outputs/checkpoints/mobius_base_tiny_imagenet/20240101_120000",
)

# From HuggingFace (auto-downloads)
model, best_acc = train_tiny_imagenet(
    preset='mobius_base',
    epochs=200,
    continue_from="checkpoints/mobius_base_tiny_imagenet/20240101_120000",
)
```

### Inference

```python
import torch
from safetensors.torch import load_file
from mobius_trainer_full import MobiusNet, PRESETS

# Load model
config = PRESETS['mobius_base']
model = MobiusNet(num_classes=200, use_integrator=True, **config)
state_dict = load_file("best_model.safetensors")
model.load_state_dict(state_dict)
model.eval()

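# Inference
# image_tensor: a preprocessed float batch of shape (N, 3, H, W)
with torch.no_grad():
    logits = model(image_tensor)
    pred = logits.argmax(dim=1)  # predicted class index per image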
```

## Model Presets

| Preset | Channels | Depths | ~Params |
|--------|----------|--------|---------|
| `mobius_tiny_s` | (64, 128, 256) | (2, 2, 2) | 500K |
| `mobius_tiny_m` | (64, 128, 256, 512, 768) | (2, 2, 4, 2, 2) | 11M |
| `mobius_tiny_l` | (96, 192, 384, 768) | (3, 3, 3, 3) | 8M |
| `mobius_base` | (128, 256, 512, 768, 1024) | (2, 2, 2, 2, 2) | 33.7M |

## Architecture

```
Input
  │
  ▼
┌─────────────────────────────────┐
│ Stem (Conv → BN)                │
└─────────────────────────────────┘
  │
  ▼
┌─────────────────────────────────┐
│ Stage 1-N                       │
│ ┌─────────────────────────────┐ │
│ │ MobiusConvBlock (×depth)    │ │
│ │  ├─ Depthwise-Sep Conv      │ │
│ │  ├─ BatchNorm               │ │
│ │  ├─ MobiusLens (wave gate)  │ │
│ │  ├─ Thirds Mask             │ │
│ │  └─ Learned Residual        │ │
│ └─────────────────────────────┘ │
│ Downsample (stride-2 conv)      │
└─────────────────────────────────┘
  │
  ▼
┌─────────────────────────────────┐
│ Integrator (Conv → BN → GELU)   │  ← Task collapse
└─────────────────────────────────┘
  │
  ▼
┌─────────────────────────────────┐
│ Pool → Linear → Classes         │
└─────────────────────────────────┘
```

## Core Components

### MobiusLens

Wave-based gating mechanism with three interference paths:

```python
L = wave(phase_l, drift_l)   # Left path  (+1 drift)
M = wave(phase_m, drift_m)   # Middle path (0 drift, ghost)
R = wave(phase_r, drift_r)   # Right path (-1 drift)

# Interference
xor_comp = abs(L + R - 2*L*R)  # Differentiable XOR
and_comp = L * R              # Differentiable AND

# Gating
gate = weighted_sum(L, M, R) * interference_blend
output = input * sigmoid(layernorm(gate))
```

The middle path (M) acts as a "ghost": it is present but diminished, which maintains gradient continuity while biasing information flow toward the L and R edges (a Cantor-like structure).
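
The XOR and AND terms are the usual soft-logic relaxations of their Boolean counterparts. The sketch below checks this in plain Python on scalars. The names `soft_xor` and `soft_and` are illustrative, not functions from the MobiusNet code, and the example assumes the wave outputs lie in [0, 1].

```python
def soft_xor(l: float, r: float) -> float:
    """|L + R - 2LR|: equals Boolean XOR when l and r are exactly 0 or 1."""
    return abs(l + r - 2.0 * l * r)


def soft_and(l: float, r: float) -> float:
    """L * R: equals Boolean AND when l and r are exactly 0 or 1."""
    return l * r


# At the corners of the unit square the relaxations reproduce the truth tables.
for l in (0, 1):
    for r in (0, 1):
        assert soft_xor(l, r) == (l ^ r)
        assert soft_and(l, r) == (l & r)

# In between, both terms vary smoothly, which is what keeps them differentiable.
print(soft_xor(0.5, 0.5))  # 0.5
print(soft_and(0.5, 0.5))  # 0.25
```
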

### Thirds Mask

Rotating channel suppression inspired by Cantor set construction:

```
Layer 0: suppress channels [0:C/3]
Layer 1: suppress channels [C/3:2C/3]
Layer 2: suppress channels [2C/3:C]
Layer 3: back to [0:C/3]
```

The rotation forces redundancy and discourages co-adaptation across channel groups.
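
The rotation schedule can be written as a small index calculation. This is a minimal sketch, not the repository's implementation: the helper names are hypothetical, suppression is modeled as a hard zero multiplier, and integer division is assumed for channel counts that are not divisible by 3.

```python
from typing import List, Tuple


def suppressed_range(layer_idx: int, channels: int) -> Tuple[int, int]:
    """Return the [start, stop) channel range suppressed at a given layer."""
    third = layer_idx % 3  # cycle through the three thirds with depth
    # Boundaries at 0, C/3, 2C/3, C (integer division is an assumption here).
    bounds = [0, channels // 3, (2 * channels) // 3, channels]
    return bounds[third], bounds[third + 1]


def thirds_mask(layer_idx: int, channels: int) -> List[float]:
    """Per-channel multiplier: 0.0 for suppressed channels, 1.0 otherwise."""
    start, stop = suppressed_range(layer_idx, channels)
    return [0.0 if start <= c < stop else 1.0 for c in range(channels)]


for layer in range(4):
    print(layer, suppressed_range(layer, 96))
# 0 (0, 32)
# 1 (32, 64)
# 2 (64, 96)
# 3 (0, 32)
```

With 96 channels, every layer keeps 64 channels active, and each channel is suppressed once every three layers.
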

### Continuous Topology

Each layer samples a continuous manifold:

```python
import math

t = layer_idx / (total_layers - 1)  # 0 → 1 across depth (requires total_layers > 1)

twist_in_angle = t * math.pi
twist_out_angle = -t * math.pi
scales = scale_range[0] + t * scale_span
```

Adding layers therefore samples the same underlying structure more finely, rather than adding new discrete units.
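
To make the sampling idea concrete, the sketch below computes `t` and the twist angles for two network depths. The function names are illustrative, and returning `t = 0` for a single-layer network is an assumption made here to avoid dividing by zero.

```python
import math
from typing import List, Tuple


def layer_positions(total_layers: int) -> List[float]:
    """Evenly spaced t values in [0, 1], one per layer."""
    if total_layers == 1:
        return [0.0]  # assumption: a lone layer sits at the start of the manifold
    return [i / (total_layers - 1) for i in range(total_layers)]


def twist_angles(t: float) -> Tuple[float, float]:
    """Input and output twist angles (radians) at position t."""
    return t * math.pi, -t * math.pi


coarse = layer_positions(3)  # [0.0, 0.5, 1.0]
fine = layer_positions(5)    # [0.0, 0.25, 0.5, 0.75, 1.0]

# The 5-layer network visits every position of the 3-layer one, plus midpoints.
assert set(coarse) <= set(fine)

for t in fine:
    twist_in, twist_out = twist_angles(t)
    print(f"t={t:.2f}  in={math.degrees(twist_in):6.1f} deg  out={math.degrees(twist_out):7.1f} deg")
```
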

## Checkpoints

Each run is saved under `output_dir` at `checkpoints/{variant}_{dataset}/{timestamp}/`:

```
├── config.json
├── best_accuracy.json
├── final_accuracy.json
├── checkpoints/
│   ├── checkpoint_epoch_0010.pt
│   ├── checkpoint_epoch_0010.safetensors
│   ├── best_model.pt
│   ├── best_model.safetensors
│   ├── final_model.pt
│   └── final_model.safetensors
└── tensorboard/
```
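
Given this layout, the most recent run can be located with a short helper, for example to build a `continue_from` path. This is a sketch under assumptions: `latest_run` is a hypothetical helper, not part of the trainer, and it relies on timestamps following the `YYYYMMDD_HHMMSS` pattern seen in the examples, so that string order matches chronological order.

```python
from pathlib import Path
from typing import Optional


def latest_run(output_dir: str, variant: str, dataset: str) -> Optional[Path]:
    """Return the newest run directory, or None if no run exists yet."""
    root = Path(output_dir) / "checkpoints" / f"{variant}_{dataset}"
    if not root.is_dir():
        return None
    # Timestamps like 20240101_120000 sort chronologically as plain strings.
    runs = sorted(p for p in root.iterdir() if p.is_dir())
    return runs[-1] if runs else None


run = latest_run("./outputs", "mobius_base", "tiny_imagenet")
if run is not None:
    print("resume from:", run)
    print("best weights:", run / "checkpoints" / "best_model.safetensors")
```
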

## TensorBoard

Monitor training:

```bash
tensorboard --logdir ./outputs/checkpoints
```

Tracks:
- Loss, train/val accuracy
- Per-layer lens parameters (omega, alpha, twist angles, L/M/R weights)
- Residual weights
- Weight histograms

## Data Setup

### Tiny ImageNet

```bash
wget http://cs231n.stanford.edu/tiny-imagenet-200.zip
unzip tiny-imagenet-200.zip -d ./data/
```

## License

Apache 2.0

## Citation

```bibtex
@misc{mobiusnet2026,
  title={MobiusNet: Wave-Based Topological Vision Architecture},
  author={AbstractPhil},
  year={2026},
  url={https://huggingface.co/AbstractPhil/mobiusnet}
}
```