---
datasets:
- AbstractPhil/geometric-vocab
pipeline_tag: zero-shot-classification
---

# Surgery again

Alright. This initial cycle is concluded. I've determined that the frozen pentachora are in fact usable about 50% of the time, and will eventually cap around 60% if you stick to standard cross-entropy with heavy geometric regularization. So roughly three fifths of the pentas are covered and the other two fifths are completely discarded.
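
For reference, here's the kind of utilization probe I mean - a minimal sketch assuming features of shape [N, D] and the 100 frozen pentachora stored as a [100, 5, D] tensor (100 buckets x 5 vertices). Every name here is hypothetical, not this repo's actual code:

```python
import torch

def penta_utilization(features: torch.Tensor, pentas: torch.Tensor) -> torch.Tensor:
    """Fraction of samples whose nearest penta vertex falls in each bucket."""
    # Flatten the [P, 5, D] vertices to [P*5, D], measure every feature's
    # distance to every vertex, then map the winning vertex back to its bucket.
    d = torch.cdist(features, pentas.reshape(-1, pentas.shape[-1]))  # [N, P*5]
    nearest = d.argmin(dim=1) // pentas.shape[1]                     # bucket per sample
    counts = torch.bincount(nearest, minlength=pentas.shape[0])
    return counts.float() / features.shape[0]

# Toy usage: random features against random pentas.
usage = penta_utilization(torch.randn(1024, 64), torch.randn(100, 5, 64))
print((usage > 0).float().mean())  # fraction of buckets that ever win
```

A usage histogram like this makes the covered / discarded split visible directly.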

I've begun analyzing feature sets for the variants at this point. The outputs are very promising in terms of potential, even if the methods used to calculate accuracy were... well, suboptimal.

I've learned that cross-entropy + geometry is a definite no-go. The pentas live below zero, and the whole premise of their formulas is to revolve around a dynamic position - so the assumption that zero is the absolute baseline is definitely... not something cross-entropy agrees with.
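
To make that mismatch concrete - a hedged sketch, not the actual training code: logits built from raw dot products care about where the origin sits, while distance-based logits are translation-invariant, so geometry that lives below zero stops being penalized. `anchors` is a hypothetical stand-in for penta vertices:

```python
import torch
import torch.nn.functional as F

feats = torch.randn(32, 64) - 3.0    # features deliberately shifted below zero
anchors = torch.randn(10, 64) - 3.0  # one anchor per class, also below zero
labels = torch.randint(0, 10, (32,))

# Dot-product logits: sensitive to absolute position relative to the origin.
ce_dot = F.cross_entropy(feats @ anchors.t(), labels)

# Distance-based logits: only the relative geometry matters.
ce_dist = F.cross_entropy(-torch.cdist(feats, anchors), labels)
print(ce_dot.item(), ce_dist.item())
```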

In any case, settling on feature-analysis tools is the crucial task today. I need to figure out exactly what's in the features before I can build a variant that can actually be analyzed correctly, because these losses aren't cutting it.
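
As a first pass at those tools, something like this (assumed shapes, hypothetical names) reports basic per-dimension statistics plus an effective-rank estimate of how much of the embedding space the features actually occupy:

```python
import torch

def feature_report(feats: torch.Tensor) -> dict:
    """Basic statistics for an [N, D] feature matrix."""
    centered = feats - feats.mean(dim=0)
    s = torch.linalg.svdvals(centered)
    p = (s / s.sum()).clamp_min(1e-12)
    return {
        "mean_abs": feats.mean(dim=0).abs().mean().item(),
        "std": feats.std(dim=0).mean().item(),
        # Effective rank: exp of the entropy of the normalized singular values.
        "effective_rank": torch.exp(-(p * p.log()).sum()).item(),
    }

print(feature_report(torch.randn(512, 64)))
```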

I've defaulted to a single global entropy, a simplex loss, a geometry loss, and a few other potentials that can be used to train the clip-vit variant - but the bucketing itself - the 100 pentas - needs to be correctly assessed and calculated before a larger variant can be created.
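
Roughly what I mean by that mix - one plausible reading, with assumed weights and formulas, and with the simplex and geometry terms written against the pentas (which only bites if they're trainable):

```python
import torch

def combined_loss(logits, pentas, w_ent=0.1, w_simplex=1.0, w_geom=0.1):
    """logits: [N, P] bucket scores; pentas: [P, 5, D] vertex tensor."""
    # Single global entropy over the batch-mean distribution: minimizing the
    # negative entropy pushes the model to spread mass across buckets.
    p_mean = logits.softmax(dim=-1).mean(dim=0)
    ent = (p_mean * p_mean.clamp_min(1e-8).log()).sum()
    # Simplex loss: a regular 4-simplex has 10 equal edges, so penalize the
    # variance of the pairwise edge lengths within each pentachoron.
    i, j = torch.triu_indices(5, 5, offset=1)
    edges = (pentas[:, i] - pentas[:, j]).norm(dim=-1)  # [P, 10]
    simplex = edges.var(dim=-1).mean()
    # Geometry loss: keep the vertices near the unit sphere.
    geom = ((pentas.norm(dim=-1) - 1.0) ** 2).mean()
    return w_ent * ent + w_simplex * simplex + w_geom * geom
```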

There are more than enough weights; I just need to smash math today and see whether I can project a proper lattice that conforms to the CM + Graham principles without restructuring the entire set of weights.

It's possible the accuracy is much higher than expected and I'm simply not asking them the right question. That would be quite the thing, wouldn't it?
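
The cheap way to check is to score the same frozen features under several readout rules and compare: if one comes out far higher, the representation was fine all along and the question was the problem. Everything below is a stand-in, not this repo's evaluation code:

```python
import torch
import torch.nn.functional as F

def accuracies(feats, labels, anchors):
    """Accuracy of three readout rules over the same features and anchors."""
    rules = {
        "dot":     feats @ anchors.t(),
        "euclid": -torch.cdist(feats, anchors),
        "cosine":  F.normalize(feats, dim=-1) @ F.normalize(anchors, dim=-1).t(),
    }
    return {k: (v.argmax(dim=1) == labels).float().mean().item()
            for k, v in rules.items()}

print(accuracies(torch.randn(256, 64), torch.randint(0, 10, (256,)),
                 torch.randn(10, 64)))
```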

# Breakdown and assessment

Using the standard ViT position tokenization doesn't work. At patch4 it caps our unique feature-map representation at 65 tokens' worth, and because the high-dimensional geometry bloats the dimensions to such a degree, the representative patches can only be learned to a certain accuracy before they simply run out of space and the model defaults to memorizing the training data.
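
For the record, the 65 falls out of the patch arithmetic - assuming 32x32 inputs, which is my assumption here, not stated above:

```python
# Standard ViT tokenization at patch4 on a 32x32 image.
image_size, patch = 32, 4
grid = image_size // patch   # 8 patches per side
tokens = grid * grid + 1     # 64 patch tokens + 1 CLS token = 65
print(tokens)
```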