Update README.md

README.md CHANGED
@@ -9,8 +9,30 @@ tags:

license: mit
---

Last remembered highest accuracy: roughly 66%. That run also produced a bunch of other material that apparently never got pushed from the logger.

The README is busted; a bad README got uploaded. I'll run test sets on all the models and accumulate a proper model list with accuracies as soon as possible.

These currently beat the standard ViT-Beatrix in pure classification accuracy, while leaving both blocks nearly independent.

This enables efficient transfer learning without high-decay processes, but the system is still a bit janky.

Today I plan to shore up the repo's own tracking so this sort of fault doesn't happen again, where I run something and lose the tracking information.

Additionally, the training manifests for all models will likely be stored in an independent repo elsewhere, for automated connection and linkage with the Hugging Face systems.

# ViT-Beatrix Dual-Stream with Geometric Diversity

This system is a dual-block transformer model inspired by Flux's dual-block structure.

## Experimental Tests

One set of blocks is devoted to the geometry while the other set is devoted to the ingested images.
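
To make that layout concrete, here is a minimal sketch of what one dual-stream block might look like, assuming pre-norm residual blocks where the two streams touch only through cross-attention. All names (`DualStreamBlock`, the `geo_`/`img_` prefixes, `dim`, `num_heads`) are illustrative shorthand, not the repo's actual modules.

```python
# Illustrative dual-stream block: each stream has its own self-attention
# and MLP; cross-attention is the only bridge between the two streams.
import torch
import torch.nn as nn


class DualStreamBlock(nn.Module):
    def __init__(self, dim: int = 384, num_heads: int = 6):
        super().__init__()
        self.geo_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.img_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Cross-attention: the only place the streams exchange information.
        self.geo_cross = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.img_cross = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.geo_mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )
        self.img_mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )
        self.norms = nn.ModuleList(nn.LayerNorm(dim) for _ in range(6))

    def forward(self, geo: torch.Tensor, img: torch.Tensor):
        # Self-attention inside each stream.
        g = self.norms[0](geo)
        geo = geo + self.geo_attn(g, g, g, need_weights=False)[0]
        i = self.norms[1](img)
        img = img + self.img_attn(i, i, i, need_weights=False)[0]
        # Each stream queries the other through cross-attention.
        g, i = self.norms[2](geo), self.norms[3](img)
        geo = geo + self.geo_cross(g, i, i, need_weights=False)[0]
        img = img + self.img_cross(i, g, g, need_weights=False)[0]
        # Independent MLPs keep the streams nearly decoupled.
        geo = geo + self.geo_mlp(self.norms[4](geo))
        img = img + self.img_mlp(self.norms[5](img))
        return geo, img
```

Because the streams only touch through the cross-attention bridge, either side can in principle be frozen, zeroed, or retrained without disturbing the other, which is the property the notes below lean on.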

The geometry's accuracy can be completely decoupled, and the image portion zeroed out and retrained if the system starts to decay.
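
A minimal sketch of that reset, assuming image-stream submodules can be picked out by a name prefix (the `img_` prefix here is a hypothetical convention, not necessarily what this repo uses):

```python
# Reinitialize the image stream in place while the geometry stream
# keeps its trained weights.
import torch.nn as nn


def reset_image_stream(model: nn.Module, prefix: str = "img_") -> None:
    for name, module in model.named_modules():
        if prefix not in name:
            continue  # leave geometry (and shared) modules untouched
        if hasattr(module, "reset_parameters"):
            module.reset_parameters()    # nn.Linear, nn.LayerNorm, ...
        elif hasattr(module, "_reset_parameters"):
            module._reset_parameters()   # nn.MultiheadAttention
```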

This has shown robust capability across multiple lineage training runs. Leaving the geometry in a "frozen" state yields by far the worst outcomes, though in that test I froze everything, including the geometric cross-attention and its subsystems, while leaving the image end of the cross-attention scrambled and learning; more than likely it relearned incorrect math and got stuck at around 20%.
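
If I'm reading that failure right, the safer pattern would be to freeze the geometry stream's own blocks but keep both ends of the cross-attention trainable, so the interface between the streams can stay in sync. A sketch under the same hypothetical naming as above:

```python
# Freeze the geometry stream's internals, but leave cross-attention on
# both ends trainable. The "geo_" / "cross" name matching is hypothetical.
import torch.nn as nn


def freeze_geometry_keep_interface(model: nn.Module) -> None:
    for name, param in model.named_parameters():
        in_geometry = "geo_" in name
        in_interface = "cross" in name  # cross-attention on either stream
        param.requires_grad = (not in_geometry) or in_interface
```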

## Current Experiment: beatrix-dualstream-base

**Model Path**: `weights/beatrix-dualstream-base/20251009_030219/`
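
For reference, a hypothetical loading sketch; the file name inside the timestamped folder and the checkpoint format are assumptions, since this README doesn't spell them out:

```python
# Assumes a plain PyTorch state dict saved as "model.pt" in the run
# folder; adjust to the actual file name and format.
import torch

ckpt = "weights/beatrix-dualstream-base/20251009_030219/model.pt"
state_dict = torch.load(ckpt, map_location="cpu")
# Peek at parameter names to see which stream each tensor belongs to.
print(sorted(state_dict)[:10])
```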

@@ -33,10 +55,8 @@ This model uses a class-aware geometric diversity loss that encourages:

## Performance

- **Best Accuracy**: ~66% (from memory; exact figure pending)
- **Current Epoch**: ~100, give or take; sorry about this, I'll get real data here as soon as possible.
- **Dataset**: CIFAR-100

---