datasets:
- AbstractPhil/geometric-vocab
pipeline_tag: zero-shot-classification
---

# A few more variants first

There are a few unexplored elements of rose5 that I need to investigate now that I've taken stock of the full roster.

As it stands, the majority of the consistency comes directly from cosine similarity on the hypersphere, using the penta as a variant form of vector lattice. It works, yes, but it's also not what the models are supposed to be doing.
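
For concreteness, here is a minimal sketch of what that cosine-on-the-hypersphere scoring looks like in PyTorch. The shapes, names, and the mean-over-vertices aggregation are my illustrative assumptions, not the repo's exact code:

```python
# Minimal sketch, assuming one frozen pentachoron of 5 unit vertices per
# class. Names and aggregation are illustrative only.
import torch
import torch.nn.functional as F

def penta_cosine_scores(z: torch.Tensor, penta: torch.Tensor) -> torch.Tensor:
    """Cosine of hypersphere-projected embeddings against penta vertices.

    z:     [batch, dim] raw embeddings
    penta: [num_classes, 5, dim] frozen pentachoron vertices
    returns [batch, num_classes] class scores
    """
    z = F.normalize(z, dim=-1)                 # project onto the unit hypersphere
    v = F.normalize(penta, dim=-1)             # vertices live on the sphere too
    sims = torch.einsum("bd,cvd->bcv", z, v)   # cosine to every vertex
    return sims.mean(dim=-1)                   # aggregate the 5-vertex "lattice"
```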

If left running too long, the variants show that the cosine objective collapses to being primarily head-dependent, which means the model eventually just falls into a plain classification state.

It seems only one loss is necessary, and that one loss is a combination of rose (multi-cosine) mixed with alignment and margin losses.
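
A hedged sketch of that single combined objective. The weights, the margin value, and the centroid-based alignment term are placeholder assumptions, not the tuned rose5 formulation:

```python
# Sketch only: rose (multi-cosine) + alignment + margin as one loss.
import torch
import torch.nn.functional as F

def combined_rose_loss(z, penta, labels, margin=0.2, w_align=1.0, w_margin=1.0):
    """z: [batch, dim], penta: [num_classes, 5, dim], labels: [batch]."""
    z = F.normalize(z, dim=-1)
    v = F.normalize(penta, dim=-1)
    sims = torch.einsum("bd,cvd->bcv", z, v)        # cosine to every vertex
    scores = sims.mean(dim=-1)                      # [batch, num_classes]

    # rose (multi-cosine): pull toward all five vertices of the true class
    rose = (1.0 - sims[torch.arange(len(z)), labels]).mean()

    # alignment: pull toward the normalized centroid of the true class
    centroid = F.normalize(v.mean(dim=1), dim=-1)   # [num_classes, dim]
    align = (1.0 - (z * centroid[labels]).sum(dim=-1)).mean()

    # margin: stay ahead of the hardest negative class by at least `margin`
    pos = scores.gather(1, labels[:, None]).squeeze(1)
    mask = F.one_hot(labels, scores.size(1)).bool()
    neg = scores.masked_fill(mask, float("-inf")).max(dim=1).values
    hinge = F.relu(margin - (pos - neg)).mean()

    return rose + w_align * align + w_margin * hinge
```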

If my hunch is correct, centroid may prove much weaker than rose5 once every loss except the geometric rose loss is turned off.
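
The ablation then amounts to zeroing every weight except the rose term; the key names below are made up for illustration, not the trainer's actual options:

```python
# Hypothetical loss-weight config for the ablation described above.
ABLATION = {
    "rose": 1.0,           # geometric rose (multi-cosine) stays on
    "alignment": 0.0,      # off
    "margin": 0.0,         # off
    "cross_entropy": 0.0,  # off
}
```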

# Surgery again

Alright. This initial cycle is concluded. I've determined that the frozen pentachora are in fact utilized at only about 50% most of the time, and will eventually cap at around 60% if you stick to standard cross-entropy with heavy geometric regularization. So roughly three of the five penta vertices are covered, and the other two are completely discarded.
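
For reference, a rough sketch of how that utilization figure can be measured: assign each sample to the nearest vertex of its true class and count how often each of the five vertices wins. The function and names are illustrative, not the actual evaluation code:

```python
# Sketch: per-vertex utilization of a frozen pentachoron.
import torch
import torch.nn.functional as F

@torch.no_grad()
def vertex_utilization(z, penta, labels):
    """Fraction of samples claimed by each of the 5 vertices of their class."""
    z = F.normalize(z, dim=-1)
    v = F.normalize(penta, dim=-1)
    sims = torch.einsum("bd,cvd->bcv", z, v)      # [batch, num_classes, 5]
    own = sims[torch.arange(len(z)), labels]      # cosines to own-class vertices
    winners = own.argmax(dim=-1)                  # nearest vertex per sample
    counts = torch.bincount(winners, minlength=5).float()
    return counts / counts.sum()                  # e.g. ~3 live vertices, ~2 dead
```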