Update; non-provisional geometric form

The shape needs to be tweaked a bit. The model is trying to teach itself, so I'm likely going to try a couple tweaks that allow the triangulation to coexist better with the geometry.

I'll align this resonance yet.

Automodel functional

Requires custom code; run the code in a Colab cell.

https://huggingface.co/AbstractPhil/geolip-vit-zana/blob/main/colab_automodel_test_cell.py

The geometric structure is intact, which makes it a fair standalone prototype, but it is nowhere near complete. Ensuring the vit zana structure survived was more annoying than expected. A full refactor would have been easier, but I wanted to preserve the essence of the original while updating it to geolip. Probably not the best idea, since the original didn't actually work very well.

https://huggingface.co/AbstractPhil/penta-vit-experiments

The original penta-vit-experiments was a good, hard-fought attempt at building intrinsic geometric structure using the pentachoron shapes.

It eventually yielded some shape, but that was just random chance. The current geolip-vit is yielding geometric structure in the constellation, and the model itself is representing SOME cohesive data in the space. Through experimentation this will be refined so that it retains and coalesces much faster, and the model will be made considerably smaller.

Currently the primary anchor means are collapsing the std but that's fine, I'll get it worked out. The ksimplex volumes are a bit degenerate as well, but that's to be expected since I used 4simplex. Likely needs a bit more sampling on the sphere to preserve it more uniformly.
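A minimal sketch of how the degeneracy could be checked, assuming the volumes in question are those of 4-simplices (pentachora) built from 5 points on the unit sphere. The function names, the 64-dim embedding size, and the sampling scheme are illustrative assumptions, not the code used in this repo. The volume comes from the Gram determinant of the edge vectors, and the coefficient of variation (std / mean) gives one rough measure of how uniform the volumes are.

```python
import math
import numpy as np

def simplex_volume(vertices: np.ndarray) -> float:
    """Volume of a k-simplex from its (k+1, d) vertex array, via the Gram determinant."""
    edges = vertices[1:] - vertices[0]       # (k, d) edge vectors from vertex 0
    gram = edges @ edges.T                   # (k, k) Gram matrix
    k = edges.shape[0]
    det = max(np.linalg.det(gram), 0.0)      # clamp tiny negative round-off
    return math.sqrt(det) / math.factorial(k)

def sample_sphere(n: int, dim: int, rng: np.random.Generator) -> np.ndarray:
    """Draw n points uniformly on the unit sphere by normalizing Gaussians."""
    x = rng.standard_normal((n, dim))
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def volume_cv(points: np.ndarray, n_simplices: int,
              rng: np.random.Generator, k: int = 4) -> float:
    """Coefficient of variation of the volumes of randomly chosen k-simplices."""
    vols = np.array([
        simplex_volume(points[rng.choice(len(points), size=k + 1, replace=False)])
        for _ in range(n_simplices)
    ])
    return float(vols.std() / vols.mean())

rng = np.random.default_rng(0)
anchors = sample_sphere(30, 64, rng)         # hypothetical: 30 anchors in 64 dims
cv = volume_cv(anchors, 200, rng)
print(f"volume CV over 200 random 4-simplices: {cv:.3f}")
```

A volume near zero means the 5 points are close to lying in a lower-dimensional subspace, which is the degenerate case described above.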

Proper balance is important for all things.

Result

After 7 run attempts I've managed to find a series of losses and settings that preserve the geometry.

Now it's time to use this preserved geometry to actually teach Zana how to behave.

As of Run 7 the geometric anchors have their access predominantly preserved. It's not perfect yet, but it's a big step towards independent cohesion.

I have achieved standalone geometric structure.

The full manifold represents a perfectly geometrically valid spectrum. It worked.

Almost a perfectly aligned 0.2 CV ratio.

It wasn't easy to preserve the original shape of this model while still improving its core state to a useful point.

Run 1

Degenerate anchors (4/64), no Procrustes alignment initialization, 10% dropout for anchors not enough, InfoNCE invalid, alignment drift without control.

Death by alignment anchor drift, I have a few ideas to prevent this.

This is a configuration and initialization issue.
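For reference, a minimal sketch of the standard orthogonal Procrustes solution via SVD, which is one way an alignment initialization could be done. What gets aligned to what is an assumption here: `source` and `target` are hypothetical point sets (for example, initial anchors and some reference embeddings), and the shapes are made up for the demo.

```python
import numpy as np

def orthogonal_procrustes(source: np.ndarray, target: np.ndarray) -> np.ndarray:
    """Return the orthogonal matrix R that minimizes ||source @ R - target||_F."""
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt

rng = np.random.default_rng(0)
source = rng.standard_normal((64, 32))                     # hypothetical: 64 anchors, 32 dims
true_r, _ = np.linalg.qr(rng.standard_normal((32, 32)))    # a random orthogonal map
target = source @ true_r                                   # rotated/reflected copy of source

r = orthogonal_procrustes(source, target)
residual = np.linalg.norm(source @ r - target)
print(f"alignment residual: {residual:.2e}")
```

Because R is constrained to be orthogonal, the alignment changes orientation only and leaves the pairwise distances between anchors untouched.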

Run 2 tweaks

BEFORE:                              AFTER:
  64 anchors, random init              30 anchors, QR orthogonal init
  20% dropout                          33% dropout
  8 compartments (8 anchors each)      6 compartments (5 anchors each)
  Uniform lr for all params            0.1× lr on constellation (3e-5 vs 3e-4)
  Random normal anchors                Exactly orthogonal via QR decomposition
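A minimal sketch of the QR orthogonal init from the AFTER column, using NumPy. The 30 anchors and the 6 × 5 compartment split come from the table; the 64-dim embedding size and the function name are assumptions for illustration, and the learning-rate split is not covered here.

```python
import numpy as np

def qr_orthogonal_anchors(n_anchors: int, dim: int, seed: int = 0) -> np.ndarray:
    """Return (n_anchors, dim) anchors with exactly orthonormal rows."""
    if n_anchors > dim:
        raise ValueError("need n_anchors <= dim for exactly orthogonal rows")
    rng = np.random.default_rng(seed)
    gaussian = rng.standard_normal((dim, n_anchors))
    q, r = np.linalg.qr(gaussian)        # reduced QR: q is (dim, n_anchors)
    q = q * np.sign(np.diag(r))          # fix the per-column sign ambiguity of QR
    return q.T

anchors = qr_orthogonal_anchors(30, 64)          # 64 dims is a placeholder
compartments = anchors.reshape(6, 5, 64)         # 6 compartments of 5 anchors each
max_off_diag = np.abs(anchors @ anchors.T - np.eye(30)).max()
print(f"largest deviation from orthonormality: {max_off_diag:.2e}")
```

Random normal vectors in high dimensions are only approximately orthogonal, while the QR factor is orthogonal up to floating-point error, which is the difference the last table row points at.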

Run 2 geometric collapse

The anchors collapse by epoch 5, leaving only 1 active layer and predominantly defaulting to [CLS].

This is understandable considering it's not operating on projection principles. However, I do have a potential solution. My bag of tricks isn't exhausted just yet.

Run 3: Anchor diversity and law of averages geoscalar divergence.

Removing the CLS token

The model is having no issues converging with just the anchors, but the anchors eventually collapse to a replacement CLS token.

The model simply doesn't know what's best for the long-term goals, so I must encourage the behavior to exist within the 80/20 distribution rule.

Anchor geometric diversity and law of average targeting should handle a fair amount of attenuation problems based on the anchor collapse.
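As a rough illustration of the diversity half of this idea, here is a generic anchor diversity penalty: the mean squared cosine similarity between distinct anchors. This is a common regularizer and an assumption on my part, not necessarily the loss used in Run 3, and it does not cover the law of averages targeting.

```python
import numpy as np

def anchor_diversity_penalty(anchors: np.ndarray) -> float:
    """Mean squared cosine similarity over all pairs of distinct anchors.

    0.0 when the anchors are mutually orthogonal, 1.0 when they have all
    collapsed onto the same direction.
    """
    unit = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    cos = unit @ unit.T
    off_diag = cos[~np.eye(len(anchors), dtype=bool)]   # drop self-similarity
    return float(np.mean(off_diag ** 2))

spread = np.eye(5)                                # 5 mutually orthogonal anchors
collapsed = np.tile([1.0, 2.0, 3.0], (5, 1))      # 5 copies of one direction
print(anchor_diversity_penalty(spread), anchor_diversity_penalty(collapsed))
```

Added to the training loss with a small weight, a term like this pushes anchors apart, which works against the collapse toward a single CLS-like anchor.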
