---
license: mit
tags:
- image-classification
- cifar100
- geometric-learning
- fractal-encoding
- trained
- no-attention
- no-cross-entropy
datasets:
- cifar100
metrics:
- accuracy
library_name: pytorch
pipeline_tag: image-classification
model-index:
- name: geo-beatrix-resnet34-step20-feats1000
  results:
  - task:
      type: image-classification
      name: Image Classification
    dataset:
      name: CIFAR-100
      type: cifar100
    metrics:
    - type: accuracy
      value: 56.12
      name: Test Accuracy
      verified: false
---


# geo-beatrix-resnet34-step20-feats1000

**Geometric Basin Classification for CIFAR-100**

**Training Complete**

Final Status: Epoch 200/200

---

## Current Performance

| Metric | Value |
|--------|-------|
| **Best Test Accuracy** | **56.12%** |
| **Best Epoch** | 160 |
| **Current Train Accuracy** | 59.29% |
| **Current Test Accuracy** | 51.51% |
| **Current α (Cantor param)** | 0.4031 |
| **Total Parameters** | 28,561,101 |
| **Training Time** | 0:27:18 |

### All Training Runs

Known bug in the auto-generated table: the Test Acc column repeats 56.12% for most runs, but each run actually reached a different test accuracy.

| Timestamp | Status | Best Epoch | Test Acc | Train Acc | α |
|-----------|--------|------------|----------|-----------|---|
| `20251010_203717` | ✅ | 160 | **56.12%** | 67.82% | 0.4481 |
| `20251010_211210` | unknown | 160 | **56.12%** | 16.21% | 0.3879 |
| `20251010_213807` | ✅ | 160 | **56.12%** | 64.44% | 0.4419 |
| `20251010_230300` | ✅ | 160 | **56.12%** | 52.13% | 0.4997 |
| `20251010_234239` | ✅ | 160 | **56.12%** | 73.34% | 0.4882 |
| `20251011_002858` | ✅ | 160 | **56.12%** | 46.05% | 0.4712 |
| `20251011_012453` | ✅ | 160 | **56.12%** | 40.18% | 0.4963 |
| `20251011_023128` | ✅ | 160 | **56.12%** | 54.65% | 0.5005 |
| `20251011_025919` | ✅ | 160 | **56.12%** | 57.80% | 0.4994 |
| `20251011_032343` | ✅ | 160 | **56.12%** | 53.80% | 0.4377 |
| `20251011_034748` | ✅ | 160 | **56.12%** | 65.10% | 0.4326 |
| `20251011_041716` | ✅ | 160 | **56.12%** | 59.29% | 0.4031 |
| `20251010_200842` | ✅ | 180 | **53.61%** | 67.53% | 0.4442 |
| `20251010_185133` | ✅ | 200 | **52.97%** | 69.87% | 0.4452 |

### Comparison to State-of-the-Art

| Model | Accuracy | Status |
|-------|----------|--------|
| **geo-beatrix (this model)** | **56.12%** | ✅ Complete |
| geo-beatrix (50M params) | 69.0% | Geometric Basin CONV architecture |

🎯 **Current target**: Beat geo-beatrix (50M params) at 69.0%. This model is currently 12.88 percentage points behind.

---

## Architecture

- **Base**: ResNet34 (torchvision)
- **Pretrained**: No, trained from scratch
- **Features**: 512-dim from ResNet34
- **Positional Encoding**: Devil's Staircase (Cantor function, 1883)
- **PE Levels**: 20
- **PE Features/Level**: 1000
- **Classification**: Geometric Basin Compatibility (NO cross-entropy)
- **Attention Mechanisms**: NONE
- **Mixing**: Standard (single patch)
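
For readers unfamiliar with the Devil's Staircase, the sketch below evaluates the textbook Cantor function from the ternary digits of its input. It is an illustration only and not this model's encoder: the function name `cantor_function` is hypothetical, the learned α parameter is not modelled, and mapping the 20 PE levels to 20 ternary digits is an assumption. Inputs that fall in a removed middle third return early, which is what produces the flat steps of the staircase.

```python
def cantor_function(x: float, levels: int = 20) -> float:
    """Approximate the classical Cantor function on [0, 1].

    `levels` is the number of ternary digits examined. Treating it as the
    counterpart of the model's 20 PE levels is an assumption.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    if x == 1.0:
        return 1.0

    value = 0.0   # binary expansion accumulated so far
    weight = 0.5  # value of the next binary digit
    for _ in range(levels):
        x *= 3.0
        digit = int(x)  # next ternary digit: 0, 1, or 2
        x -= digit
        if digit == 1:
            # x sits in a removed middle third, where the function is flat.
            return value + weight
        value += weight * (digit // 2)  # ternary digit 2 becomes binary digit 1
        weight *= 0.5
    return value


# Usage: sample the staircase on a coarse grid.
samples = [cantor_function(i / 10) for i in range(11)]
```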

---

## Training Configuration

```json
{
  "model_name": "geo-beatrix-resnet34-step20-feats1000",
  "model_type": "geometric_basin_classifier",
  "num_classes": 100,
  "batch_size": 512,
  "num_epochs": 200,
  "base_learning_rate": 0.001,
  "weight_decay": 0.05,
  "warmup_epochs": 10,
  "pe_levels": 20,
  "pe_features_per_level": 1000,
  "dropout": 0.1,
  "pretrained_resnet": false,
  "frozen_resnet": false,
  "a100_optimizations": {
    "mixed_precision": true,
    "torch_compile": false,
    "channels_last": true,
    "gradient_checkpointing": false
  },
  "alphamix": {
    "enabled": true,
    "fractal_mode": false,
    "range": [
      0.3,
      0.7
    ],
    "spatial_ratio": 0.1,
    "curriculum_start": 0.0,
    "curriculum_end": 0.75,
    "fractal_steps": [
      1,
      3
    ],
    "fractal_scales": [
      0.3333333333333333,
      0.1111111111111111,
      0.037037037037037035
    ]
  },
  "architecture": "ResNet34 + Devil's Staircase PE",
  "loss_function": "Geometric Basin Compatibility",
  "cross_entropy": false,
  "attention_mechanisms": false,
  "timestamp": "20251011_041716"
}
```
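
To make the numbers in this configuration more concrete, the sketch below derives the step counts they imply for the standard 50,000-image CIFAR-100 training split, and checks that `fractal_scales` are successive powers of 1/3. Two things here are assumptions rather than facts from the training code: that the last partial batch of each epoch is kept, and that warmup is linear (the hypothetical `warmup_lr` helper). The schedule used after warmup is not stated in the configuration and is not modelled.

```python
import math

# Subset of the training configuration shown above.
config = {
    "batch_size": 512,
    "num_epochs": 200,
    "base_learning_rate": 0.001,
    "warmup_epochs": 10,
    "fractal_scales": [0.3333333333333333, 0.1111111111111111, 0.037037037037037035],
}

CIFAR100_TRAIN_IMAGES = 50_000  # size of the standard CIFAR-100 training split

# Assumption: the final partial batch of each epoch is kept (drop_last=False).
steps_per_epoch = math.ceil(CIFAR100_TRAIN_IMAGES / config["batch_size"])
total_steps = steps_per_epoch * config["num_epochs"]
warmup_steps = steps_per_epoch * config["warmup_epochs"]


def warmup_lr(step: int) -> float:
    """Hypothetical linear warmup to the base rate, held constant afterwards."""
    scale = min(1.0, (step + 1) / warmup_steps)
    return config["base_learning_rate"] * scale


# The configured fractal scales match 3**-k for k = 1, 2, 3.
scales_are_powers_of_third = all(
    math.isclose(scale, 3.0 ** -k)
    for k, scale in enumerate(config["fractal_scales"], start=1)
)
```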

---

## Files Structure

```
├── model.pt                    (BEST overall model - easy access!)
├── model.safetensors           (BEST overall model - easy access!)
├── best_model_info.json        (which epoch/run this came from)
├── runs_history.json           (all training runs and their results)
├── README.md
├── weights/geo-beatrix-resnet34-step20-feats1000/20251011_041716/
│   ├── model.pt                (best from this training run)
│   ├── model.safetensors       (best from this training run)
│   ├── config.json
│   ├── training_log.txt
│   └── checkpoints/
│       ├── checkpoint_epoch_50.safetensors
│       ├── checkpoint_epoch_100.safetensors
│       ├── checkpoint_epoch_150.safetensors
│       (snapshots every 10 epochs)
└── runs/geo-beatrix-resnet34-step20-feats1000/20251011_041716/
    ├── events.out.tfevents.*   (TensorBoard logs)
    └── metrics.csv             (training metrics)
```

**Note**: The root `model.pt` and `model.safetensors` always contain the best model across all training runs!
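
Because per-run files follow a fixed `weights/<model name>/<timestamp>/` pattern, the repository-relative filenames can be built programmatically instead of typed by hand. The helpers below are a small sketch of that pattern. `run_file` and `checkpoint_file` are hypothetical names, not part of this repository, and they do not check that the requested file actually exists. The returned strings are intended for the `filename` argument of `hf_hub_download`.

```python
MODEL_NAME = "geo-beatrix-resnet34-step20-feats1000"


def run_file(timestamp: str, filename: str = "model.safetensors") -> str:
    """Repository-relative path of a file inside one training run."""
    return f"weights/{MODEL_NAME}/{timestamp}/{filename}"


def checkpoint_file(timestamp: str, epoch: int) -> str:
    """Repository-relative path of an epoch snapshot inside one training run."""
    return run_file(timestamp, f"checkpoints/checkpoint_epoch_{epoch}.safetensors")


# Usage: paths for the latest run listed in the tree above.
best_of_run = run_file("20251011_041716")
epoch_100 = checkpoint_file("20251011_041716", 100)
```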

---

## Usage

```python
import json

import torch
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

# EASIEST: Download BEST overall model from root (recommended!)
model_path = hf_hub_download(
    repo_id="AbstractPhil/geo-beatrix-resnet",
    filename="model.safetensors"
)
state_dict = load_file(model_path)
# Instantiate the model class first, then load the weights:
# model.load_state_dict(state_dict)

# Check which epoch/run the best model came from
info_path = hf_hub_download(
    repo_id="AbstractPhil/geo-beatrix-resnet",
    filename="best_model_info.json"
)
with open(info_path) as f:
    best_info = json.load(f)
print(f"Best model: epoch {best_info['epoch']}, {best_info['test_accuracy']:.2f}%")

# Or download from specific training run
model_path = hf_hub_download(
    repo_id="AbstractPhil/geo-beatrix-resnet",
    filename="weights/geo-beatrix-resnet34-step20-feats1000/20251011_041716/model.safetensors"
)

# Download specific epoch checkpoint
epoch_checkpoint = hf_hub_download(
    repo_id="AbstractPhil/geo-beatrix-resnet",
    filename="weights/geo-beatrix-resnet34-step20-feats1000/20251011_041716/checkpoints/checkpoint_epoch_100.safetensors"
)
```
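
Once `state_dict` is loaded, a quick sanity check is to count the stored tensor elements and compare the total with the 28,561,101 parameters reported above. The helper below is a minimal sketch: `summarize_state_dict` is a hypothetical name, not part of this repository, and it only assumes that each value exposes `numel()` as `torch.Tensor` does. The total may exceed the reported figure if the checkpoint also stores buffers such as BatchNorm running statistics.

```python
from collections import defaultdict


def summarize_state_dict(state_dict):
    """Return (total_elements, elements_per_top_level_prefix)."""
    per_prefix = defaultdict(int)
    for name, tensor in state_dict.items():
        prefix = name.split(".", 1)[0]  # e.g. "resnet" for "resnet.conv1.weight"
        per_prefix[prefix] += tensor.numel()
    return sum(per_prefix.values()), dict(per_prefix)


# Usage, after `state_dict = load_file(model_path)`:
# total, per_prefix = summarize_state_dict(state_dict)
# print(f"{total:,} elements across {len(per_prefix)} top-level modules")
```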

---

## Training History

### Best Checkpoint

- Epoch: 160
- Train Acc: 59.43%
- Test Acc: 51.64%
- Alpha: 0.4071
- Loss: 0.7570

### Latest 5 Epochs

- **Epoch 196**: Train 62.03%, Test 0.00%, α=0.4032, Loss=0.7300
- **Epoch 197**: Train 59.02%, Test 0.00%, α=0.4031, Loss=0.6201
- **Epoch 198**: Train 58.49%, Test 0.00%, α=0.4031, Loss=0.6571
- **Epoch 199**: Train 59.32%, Test 0.00%, α=0.4031, Loss=0.6543
- **Epoch 200**: Train 59.29%, Test 51.51%, α=0.4031, Loss=0.6505

### Training Milestones

- 🎯 **50% Accuracy** reached at epoch 120
- **α ≥ 0.40** reached at epoch 17

---

## Innovation

- ✅ **NO attention mechanisms**
- ✅ **NO cross-entropy loss**
- ✅ **Fractal positional encoding** (Cantor function from 1883)
- ✅ **Geometric compatibility classification**
- ✅ **ResNet34 backbone** (proven CNN architecture)

---

- **Repository**: https://huggingface.co/AbstractPhil/geo-beatrix-resnet
- **Author**: AbstractPhil
- **Framework**: PyTorch