+---
+license: mit
+tags:
+- image-classification
+- cifar100
+- geometric-learning
+- fractal-encoding
+- in-training
+- no-attention
+- no-cross-entropy
+datasets:
+- cifar100
+metrics:
+- accuracy
+library_name: pytorch
+pipeline_tag: image-classification
+model-index:
+- name: geo-beatrix-resnet18
+  results:
+  - task:
+      type: image-classification
+      name: Image Classification
+    dataset:
+      name: CIFAR-100
+      type: cifar100
+    metrics:
+    - type: accuracy
+      value: 44.56
+      name: Test Accuracy
+      verified: false
+---
+# geo-beatrix-resnet18
+**Geometric Basin Classification for CIFAR-100**
+🚧 **Training in Progress** 🚧
+Current Status: Epoch 50/200
+---
+## Current Performance
+| Metric | Value |
+|--------|-------|
+| **Best Test Accuracy** | **44.56%** |
+| **Best Epoch** | 50 |
+| **Current Train Accuracy** | 44.41% |
+| **Current Test Accuracy** | 44.56% |
+| **Current α (Cantor param)** | 0.4306 |
+| **Total Parameters** | 11,952,641 |
+| **Training Time** | 0:07:29 |
+### All Training Runs
+| Timestamp | Status | Best Epoch | Test Acc | Train Acc | α |
+|-----------|--------|------------|----------|-----------|---|
+| `20251010_185133` | 🔄 | 50 | **44.56%** | 44.41% | 0.4306 |
+### Comparison to Related Models
+| Model | Accuracy | Notes |
+|-------|----------|-------|
+| **geo-beatrix (this model)** | **44.56%** | 🔄 Training (epoch 50/200) |
+| vit-beatrix-dualstream | 66.0% | Vision Transformer + Cross-Entropy |
+
+🎯 **Current target**: Beat vit-beatrix-dualstream (66.0%). Currently 21.44 percentage points behind.
+---
+## Architecture
+- **Base**: ResNet18 (torchvision)
+- **Pretrained**: No (trained from scratch)
+- **Features**: 512-dim from ResNet18
+- **Positional Encoding**: Devil's Staircase (Cantor function, 1883)
+- **PE Levels**: 18
+- **PE Features/Level**: 100
+- **Classification**: Geometric Basin Compatibility (NO cross-entropy)
+- **Attention Mechanisms**: NONE
+- **Mixing**: Fractal (triadic multi-patch)
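The Devil's Staircase named in the list above is the classical Cantor function. This card does not describe how the model turns it into 100 features per level, so the sketch below only evaluates the function itself, truncated to 18 ternary digits to mirror the `pe_levels` value. The name `cantor_staircase` is hypothetical and the code is not taken from the training script.

```python
def cantor_staircase(x: float, levels: int = 18) -> float:
    """Approximate the Cantor function on [0, 1] using `levels` ternary digits."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    result = 0.0
    scale = 0.5  # weight of the current binary digit of the output
    for _ in range(levels):
        x *= 3.0
        digit = int(x)  # next ternary digit of the input: 0, 1 or 2
        x -= digit
        if digit == 1:
            # Inside a removed middle third: the function is constant here.
            return result + scale
        if digit == 2:
            result += scale  # ternary digit 2 maps to binary digit 1
        scale *= 0.5
    return result


if __name__ == "__main__":
    for value in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(value, cantor_staircase(value))
```

The function is flat on every removed middle third (for example, all inputs between 1/3 and 2/3 map to 0.5), which is what gives the staircase its plateaus.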
+---
+## Training Configuration
+```json
+{
+  "model_name": "geo-beatrix-resnet18",
+  "model_type": "geometric_basin_classifier",
+  "num_classes": 100,
+  "batch_size": 512,
+  "num_epochs": 200,
+  "base_learning_rate": 0.002,
+  "weight_decay": 0.05,
+  "warmup_epochs": 10,
+  "pe_levels": 18,
+  "pe_features_per_level": 100,
+  "dropout": 0.1,
+  "pretrained_resnet": false,
+  "a100_optimizations": {
+    "mixed_precision": true,
+    "torch_compile": false,
+    "channels_last": true,
+    "gradient_checkpointing": false
+  },
+  "alphamix": {
+    "enabled": true,
+    "fractal_mode": true,
+    "range": [
+      0.3,
+      0.7
+    ],
+    "spatial_ratio": 0.25,
+    "curriculum_start": 0.0,
+    "curriculum_end": 0.5,
+    "fractal_steps": [
+      1,
+      3
+    ],
+    "fractal_scales": [
+      0.3333333333333333,
+      0.1111111111111111,
+      0.037037037037037035
+    ]
+  },
+  "architecture": "ResNet18 + Devil's Staircase PE",
+  "loss_function": "Geometric Basin Compatibility",
+  "cross_entropy": false,
+  "attention_mechanisms": false,
+  "timestamp": "20251010_185133"
+}
+```
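The `fractal_scales` values in the configuration are the first three powers of 1/3, which is presumably what "triadic" refers to. The short check below copies the list from the config and compares it against 1/3, 1/9 and 1/27. How the training code applies these scales is not shown in this card, so the snippet makes no claim about that.

```python
import math

# Values copied from the "alphamix.fractal_scales" entry of the configuration.
fractal_scales = [0.3333333333333333, 0.1111111111111111, 0.037037037037037035]

# Candidate triadic scales: 1 / 3**k for k = 1, 2, 3.
triadic = [1 / 3**k for k in range(1, len(fractal_scales) + 1)]

# Compare with a tolerance so the check does not depend on float rounding.
scales_are_triadic = all(
    math.isclose(scale, expected, rel_tol=1e-12)
    for scale, expected in zip(fractal_scales, triadic)
)
print(scales_are_triadic)
```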
+---
+## File Structure
+```
+├── model.pt                 (BEST overall model - easy access!)
+├── model.safetensors        (BEST overall model - easy access!)
+├── best_model_info.json     (which epoch/run this came from)
+├── runs_history.json        (all training runs and their results)
+├── README.md
+├── weights/geo-beatrix-resnet18/20251010_185133/
+│   ├── model.pt                 (best from this training run)
+│   ├── model.safetensors        (best from this training run)
+│   ├── config.json
+│   ├── training_log.txt
+│   └── checkpoints/
+│       ├── checkpoint_epoch_50.safetensors
+│       ├── checkpoint_epoch_100.safetensors
+│       └── checkpoint_epoch_150.safetensors
+│       (snapshots every 10 epochs; the names above are examples)
+└── runs/geo-beatrix-resnet18/20251010_185133/
+    ├── events.out.tfevents.*    (TensorBoard logs)
+    └── metrics.csv              (training metrics)
+```
+**Note**: The root `model.pt` and `model.safetensors` always contain the best model across all training runs!
+---
+## Usage
+```python
+import json
+
+import torch  # required by safetensors.torch and for building the model
+from huggingface_hub import hf_hub_download
+from safetensors.torch import load_file
+
+REPO_ID = "AbstractPhil/geo-beatrix-resnet"
+
+# EASIEST: download the BEST overall model from the repo root (recommended!)
+model_path = hf_hub_download(repo_id=REPO_ID, filename="model.safetensors")
+state_dict = load_file(model_path)
+# model.load_state_dict(state_dict)  # `model` must be an instance of the matching architecture
+
+# Check which epoch/run the best model came from
+info_path = hf_hub_download(repo_id=REPO_ID, filename="best_model_info.json")
+with open(info_path) as f:
+    best_info = json.load(f)
+print(f"Best model: epoch {best_info['epoch']}, {best_info['test_accuracy']:.2f}%")
+
+# Or download the best model from a specific training run
+run_model_path = hf_hub_download(
+    repo_id=REPO_ID,
+    filename="weights/geo-beatrix-resnet18/20251010_185133/model.safetensors",
+)
+
+# Download a specific epoch checkpoint (exists only once training has reached that epoch)
+epoch_checkpoint = hf_hub_download(
+    repo_id=REPO_ID,
+    filename="weights/geo-beatrix-resnet18/20251010_185133/checkpoints/checkpoint_epoch_100.safetensors",
+)
+```
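Because the model class is not included in the snippet above, it can help to inspect the downloaded weights before wiring them into an architecture. The helper below is a minimal sketch: `summarize_state_dict` is a hypothetical name, and it only assumes that each value exposes a `.shape`, as the tensors returned by `load_file` do. The total counts every stored tensor, including buffers such as BatchNorm running statistics, so it need not equal the parameter count reported in the performance table.

```python
import math
from collections import defaultdict


def summarize_state_dict(state_dict):
    """Return (total_elements, elements_per_top_level_prefix) for a name -> tensor mapping."""
    per_prefix = defaultdict(int)
    total = 0
    for name, tensor in state_dict.items():
        count = math.prod(tensor.shape)  # element count; 1 for a 0-dim tensor
        per_prefix[name.split(".")[0]] += count  # group by the first name component
        total += count
    return total, dict(per_prefix)


# Example, assuming `model_path` was downloaded as in the usage snippet:
# total, groups = summarize_state_dict(load_file(model_path))
# print(total, groups)
```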
+---
+## Training History
+### Best Checkpoint
+- Epoch: 50
+- Train Acc: 44.41%
+- Test Acc: 44.56%
+- Alpha: 0.4306
+- Loss: 1.4445
+### Latest 5 Epochs
+- **Epoch 46**: Train 44.08%, Test 0.00%, α=0.4274, Loss=1.5477
+- **Epoch 47**: Train 45.06%, Test 0.00%, α=0.4317, Loss=1.6100
+- **Epoch 48**: Train 44.08%, Test 0.00%, α=0.4306, Loss=1.5218
+- **Epoch 49**: Train 45.15%, Test 0.00%, α=0.4319, Loss=1.5274
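+- **Epoch 50**: Train 44.41%, Test 44.56%, α=0.4306, Loss=1.4445
+
+Test accuracy is logged as 0.00% for epochs 46-49. This most likely means no evaluation ran on those epochs, not that the model scored zero.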
+### Training Milestones
+- 📊 **α ≥ 0.40** reached at epoch 10
+---
+## Innovation
+✅ **NO attention mechanisms**
+✅ **NO cross-entropy loss**
+✅ **Fractal positional encoding** (Cantor function from 1883)
+✅ **Geometric compatibility classification**
+✅ **ResNet18 backbone** (proven CNN architecture)
+✅ **Triadic fractal mixing** (base-3 aligned)
+---
+**Repository**: https://huggingface.co/AbstractPhil/geo-beatrix-resnet
+**Author**: AbstractPhil
+**Framework**: PyTorch