---
license: mit
tags:
- image-classification
- cifar100
- geometric-learning
- fractal-encoding
- in-training
- no-attention
- no-cross-entropy
datasets:
- cifar100
metrics:
- accuracy
library_name: pytorch
pipeline_tag: image-classification
model-index:
- name: geo-beatrix-fractal
  results:
  - task:
      type: image-classification
      name: Image Classification
    dataset:
      name: CIFAR-100
      type: cifar100
    metrics:
    - type: accuracy
      value: 69.08
      name: Test Accuracy
      verified: false
---

# geo-beatrix-fractal

**Geometric Basin Classification for CIFAR-100**


## Immediate Assessment

The geo-beatrix variation is more capable at classification than the vit-beatrix transformer structure, but inferior to it in robust geometric capacity.

geo-beatrix uses a different mathematical foundation and a new basin format, one that depends entirely on teaching traditional structures new behavior.

The system is hit or miss, and will be refined over time as the model family evolves.

As the classification accuracy climbs, the reality sets in: this is a more capable classifier than a ViT, and yet the ViT I built has a far more robust set of tooling and a greater capacity for transfer learning.

I'd say this model is too large for standard classification tasks, and yet the classifier system does work to a degree, just not as well as SOTA.

### Conclusion based on experimentation

This requires more experimentation on the subsystem before it can be used correctly.
Components need optimization, and certain pieces need to be baselined against stock torch components for faster iteration.
Even with loops removed, the Cantor stairs still have some issues, but the batched stairs will be available in my repo today, along with the full model structure for the family of three.
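The batched stairs themselves are not reproduced in this card, but the underlying function is classical mathematics. Below is a minimal scalar sketch of the Cantor function (devil's staircase) using the card's 20-level default; a batched version would vectorize the same per-level recurrence over tensors. The function name and structure are illustrative, not the repo's actual code.

```python
def cantor(x: float, levels: int = 20) -> float:
    """Devil's staircase (Cantor function, 1883) on [0, 1).

    Reads the base-3 digits of x: a digit of 2 contributes a binary 1,
    a digit of 1 means x sits in a middle-third gap, so the expansion
    terminates with a final 1-bit; the result is read back in base 2.
    """
    y, scale = 0.0, 0.5
    for _ in range(levels):
        x *= 3.0
        d = int(x)
        x -= d
        if d == 1:               # middle-third gap: plateau value
            return y + scale
        y += scale * (d // 2)    # ternary digit 2 -> binary digit 1
        scale *= 0.5
    return y

print(cantor(0.0))  # 0.0
print(cantor(0.5))  # 0.5  (0.5 lies in the central gap (1/3, 2/3))
print(cantor(0.2))  # 0.25 (0.2 lies in the gap (1/9, 2/9))
```

The function is monotone non-decreasing and constant on every removed middle third, which is what produces the staircase plateaus.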

Alphamix and Fractalmix are hit-or-miss even with Cantor stairs, sometimes improving fidelity, sometimes reducing it.
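Alphamix and Fractalmix are not defined further in this card. Purely as a point of reference, the simplest reading is a convex blend of two feature streams, with the triadic variant splitting the channels into base-3-aligned thirds and blending each third with its own weight. Everything below, including the per-third weights, is an assumption for illustration, not the repo's implementation.

```python
def alphamix(a, b, alpha):
    """Convex blend of two equal-length feature vectors."""
    return [alpha * x + (1.0 - alpha) * y for x, y in zip(a, b)]

def fractalmix(a, b, alpha):
    """Triadic (base-3 aligned) variant: split the channels into three
    contiguous thirds and blend each third with a different weight.
    The weight schedule here is a guess, mirroring the 3-way split of
    the Cantor construction."""
    n = len(a) // 3
    weights = [alpha, 0.5, 1.0 - alpha]  # assumed per-third weights
    out = []
    for k, w in enumerate(weights):
        out.extend(alphamix(a[k * n:(k + 1) * n], b[k * n:(k + 1) * n], w))
    return out

print(fractalmix([1.0] * 6, [0.0] * 6, 0.25))
# [0.25, 0.25, 0.5, 0.5, 0.75, 0.75]
```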

Given that it lacks attention mechanisms, I consider this a resounding success as an experiment; still, it fell short of standalone ResNet-18 and ResNet-34 baselines, meaning the head merely converted the math into another form and missed the cross-entropy goal.

That's okay, though: I will refine the process, improve the system, and return with additional training runs to push classification beyond the 69% mark, which may be HIGHER than vit-beatrix, but is considerably shallower in geometric cohesion than the dual-stream transformer variation vit-beatrix-dualstream.


🚧 **Training Concluded** 🚧

Current Status: Idle

---

## Current Performance

| Metric | Value |
|--------|-------|
| **Best Test Accuracy** | **69.08%** |
| **Best Epoch** | 190 |
| **Current α (Cantor param)** | 0.4165 |
| **Total Parameters** | 45,161,489 |
| **Mixing Mode** | Fractal (triadic) |

---

## Architecture

- **Base**: ResNet-style with residual blocks
- **Channels**: 64 → 128 → 256 → 512 → 1024
- **Positional Encoding**: Devil's Staircase (Cantor function, 1883)
- **PE Levels**: 20
- **PE Features/Level**: 4
- **Classification**: Geometric Basin Compatibility
- **Attention**: NONE
- **Cross-Entropy**: NONE
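"Geometric Basin Compatibility" is not specified further in this card. A minimal distance-based reading, offered strictly as a sketch: each class owns a basin center in feature space, prediction picks the most compatible (nearest) basin, and training uses a margin on distances instead of cross-entropy. The function names, the squared-Euclidean metric, and the margin loss are all assumptions.

```python
def basin_scores(z, centers):
    """Compatibility of embedding z with each class basin:
    negative squared Euclidean distance to the basin center."""
    return [-sum((zi - ci) ** 2 for zi, ci in zip(z, c)) for c in centers]

def basin_predict(z, centers):
    """Classify by the most compatible (nearest) basin."""
    scores = basin_scores(z, centers)
    return max(range(len(scores)), key=scores.__getitem__)

def basin_margin_loss(z, centers, label, margin=1.0):
    """Cross-entropy-free objective: the true basin must be closer than
    the nearest rival basin by at least `margin` (hinge on distances)."""
    d = [-s for s in basin_scores(z, centers)]
    d_rival = min(d[k] for k in range(len(d)) if k != label)
    return max(0.0, d[label] - d_rival + margin)

centers = [[0.0, 0.0], [3.0, 0.0]]
print(basin_predict([0.1, 0.0], centers))          # 0
print(basin_margin_loss([1.5, 0.0], centers, 0))   # 1.0 (on the midline)
```

In a real training loop the centers would be learnable parameters updated alongside the backbone; this sketch only shows the inference rule and the shape of a non-cross-entropy objective.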

---

## Innovation

✅ **NO attention mechanisms**  
✅ **NO cross-entropy loss**  
✅ **Fractal positional encoding** (Cantor function from 1883)  
✅ **Geometric compatibility classification**  
✅ **Triadic fractal mixing** (base-3 aligned)

---

**Repository**: https://huggingface.co/AbstractPhil/geo-beatrix  
**Author**: AbstractPhil  
**Framework**: PyTorch