---
datasets:
- AbstractPhil/geometric-vocab
pipeline_tag: zero-shot-classification
---

# Surgery again

Alright. This initial cycle is concluded. I've determined that the frozen pentachora are in fact usable about 50% of the time, and will eventually cap around 60% if you stick to standard cross-entropy with heavy geometric regularization. So roughly three fifths of the pentas are covered and the other two fifths are completely discarded.
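
For reference, here's the kind of utilization probe I mean - a minimal sketch assuming features of shape [N, D] and the 100 frozen pentachora stored as a [100, 5, D] tensor (100 buckets x 5 vertices). Every name here is hypothetical, not this repo's actual code:

```python
import torch

def penta_utilization(features: torch.Tensor, pentas: torch.Tensor) -> torch.Tensor:
    """Fraction of samples whose nearest penta vertex falls in each bucket."""
    # Flatten the [P, 5, D] vertices to [P*5, D], measure every feature's
    # distance to every vertex, then map the winning vertex back to its bucket.
    d = torch.cdist(features, pentas.reshape(-1, pentas.shape[-1]))  # [N, P*5]
    nearest = d.argmin(dim=1) // pentas.shape[1]                     # bucket per sample
    counts = torch.bincount(nearest, minlength=pentas.shape[0])
    return counts.float() / features.shape[0]

# Toy usage: random features against random pentas.
usage = penta_utilization(torch.randn(1024, 64), torch.randn(100, 5, 64))
print((usage > 0).float().mean())  # fraction of buckets that ever win
```

A usage histogram like this makes the covered / discarded split visible directly.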

I've begun analyzing feature sets for the variants at this point. The outputs are very promising in terms of potential, even if the methods used to calculate accuracy were... well, suboptimal.

I've learned that cross-entropy + geometry is a definite no-go. The pentas live below zero, and the whole premise of their formulas is to revolve around a dynamic position - so the assumption that zero is the absolute baseline is definitely... not something cross-entropy agrees with.
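
To make that mismatch concrete - a hedged sketch, not the actual training code: logits built from raw dot products care about where the origin sits, while distance-based logits are translation-invariant, so geometry that lives below zero stops being penalized. `anchors` is a hypothetical stand-in for penta vertices:

```python
import torch
import torch.nn.functional as F

feats = torch.randn(32, 64) - 3.0    # features deliberately shifted below zero
anchors = torch.randn(10, 64) - 3.0  # one anchor per class, also below zero
labels = torch.randint(0, 10, (32,))

# Dot-product logits: sensitive to absolute position relative to the origin.
ce_dot = F.cross_entropy(feats @ anchors.t(), labels)

# Distance-based logits: only the relative geometry matters.
ce_dist = F.cross_entropy(-torch.cdist(feats, anchors), labels)
print(ce_dot.item(), ce_dist.item())
```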

In any case, settling on feature-analysis tools is the crucial task today. I need to figure out exactly what's in the features before I can build a variant that can actually be analyzed correctly, because these losses aren't cutting it.
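
As a first pass at those tools, something like this (assumed shapes, hypothetical names) reports basic per-dimension statistics plus an effective-rank estimate of how much of the embedding space the features actually occupy:

```python
import torch

def feature_report(feats: torch.Tensor) -> dict:
    """Basic statistics for an [N, D] feature matrix."""
    centered = feats - feats.mean(dim=0)
    s = torch.linalg.svdvals(centered)
    p = (s / s.sum()).clamp_min(1e-12)
    return {
        "mean_abs": feats.mean(dim=0).abs().mean().item(),
        "std": feats.std(dim=0).mean().item(),
        # Effective rank: exp of the entropy of the normalized singular values.
        "effective_rank": torch.exp(-(p * p.log()).sum()).item(),
    }

print(feature_report(torch.randn(512, 64)))
```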

I've defaulted to a single global entropy, a simplex loss, a geometry loss, and a few other potentials that can be used to train the clip-vit variant - but the bucketing itself - the 100 pentas - needs to be correctly assessed and calculated before a larger variant can be created.
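
Roughly what I mean by that mix - one plausible reading, with assumed weights and formulas, and with the simplex and geometry terms written against the pentas (which only bites if they're trainable):

```python
import torch

def combined_loss(logits, pentas, w_ent=0.1, w_simplex=1.0, w_geom=0.1):
    """logits: [N, P] bucket scores; pentas: [P, 5, D] vertex tensor."""
    # Single global entropy over the batch-mean distribution: minimizing the
    # negative entropy pushes the model to spread mass across buckets.
    p_mean = logits.softmax(dim=-1).mean(dim=0)
    ent = (p_mean * p_mean.clamp_min(1e-8).log()).sum()
    # Simplex loss: a regular 4-simplex has 10 equal edges, so penalize the
    # variance of the pairwise edge lengths within each pentachoron.
    i, j = torch.triu_indices(5, 5, offset=1)
    edges = (pentas[:, i] - pentas[:, j]).norm(dim=-1)  # [P, 10]
    simplex = edges.var(dim=-1).mean()
    # Geometry loss: keep the vertices near the unit sphere.
    geom = ((pentas.norm(dim=-1) - 1.0) ** 2).mean()
    return w_ent * ent + w_simplex * simplex + w_geom * geom
```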

There are more than enough weights; I just need to smash math today and see whether I can project a proper lattice that conforms to the CM + Graham principles without restructuring the entire set of weights.

It's possible the accuracy is much higher than expected and I'm simply not asking them the right question. That would be quite the thing, wouldn't it?
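
The cheap way to check is to score the same frozen features under several readout rules and compare: if one comes out far higher, the representation was fine all along and the question was the problem. Everything below is a stand-in, not this repo's evaluation code:

```python
import torch
import torch.nn.functional as F

def accuracies(feats, labels, anchors):
    """Accuracy of three readout rules over the same features and anchors."""
    rules = {
        "dot":     feats @ anchors.t(),
        "euclid": -torch.cdist(feats, anchors),
        "cosine":  F.normalize(feats, dim=-1) @ F.normalize(anchors, dim=-1).t(),
    }
    return {k: (v.argmax(dim=1) == labels).float().mean().item()
            for k, v in rules.items()}

print(accuracies(torch.randn(256, 64), torch.randint(0, 10, (256,)),
                 torch.randn(10, 64)))
```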

# Breakdown and assessment

Using the standard ViT position tokenization doesn't work. At patch4 it caps our unique feature-map representation at 65 tokens' worth, and because the high-dimensional geometry bloats the dimensions to such a degree, the representative patches can only be learned to a certain accuracy before they simply run out of space and the model defaults to memorizing the training data.
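
For the record, the 65 falls out of the patch arithmetic - assuming 32x32 inputs, which is my assumption here, not stated above:

```python
# Standard ViT tokenization at patch4 on a 32x32 image.
image_size, patch = 32, 4
grid = image_size // patch   # 8 patches per side
tokens = grid * grid + 1     # 64 patch tokens + 1 CLS token = 65
print(tokens)
```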