license: mit
datasets:
  - AbstractPhil/geometric-vocab
pipeline_tag: zero-shot-classification

Enabling the Mix-N-Cut

I've built a mix-n-cut that I've been avoiding enabling. This one is formatted specifically for pentachora, so we'll see how it fares. I'm trying to build one as SMALL AS POSSIBLE, so if this mix-n-cut can pull the task out of the bag I may as well run it.
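For readers unfamiliar with this family of augmentations, here is a minimal sketch. It assumes mix-n-cut behaves roughly like the well-known MixUp and CutMix techniques (blend two images, or paste a rectangle of one into the other, and soften the labels to match). The actual mix-n-cut in this repo may differ, and every name below (`soft_target`, `mixup_pair`, `cut_box`, `cutmix_pair`) is hypothetical. Images are plain single-channel nested lists to keep the example dependency-free.

```python
import random

def soft_target(label_a, label_b, num_classes, lam):
    """Soft label: weight lam on label_a, (1 - lam) on label_b."""
    target = [0.0] * num_classes
    target[label_a] += lam
    target[label_b] += 1.0 - lam
    return target

def mixup_pair(img_a, img_b, lam):
    """MixUp-style blend: every pixel is a weighted average of both images."""
    return [[lam * a + (1.0 - lam) * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(img_a, img_b)]

def cut_box(height, width, lam, rng):
    """Pick a rectangle covering at most about (1 - lam) of the image area."""
    ratio = (1.0 - lam) ** 0.5
    box_h, box_w = int(height * ratio), int(width * ratio)
    cy, cx = rng.randrange(height), rng.randrange(width)
    top, bottom = max(cy - box_h // 2, 0), min(cy + box_h // 2, height)
    left, right = max(cx - box_w // 2, 0), min(cx + box_w // 2, width)
    return top, bottom, left, right

def cutmix_pair(img_a, img_b, label_a, label_b, num_classes, lam, rng):
    """CutMix-style paste: copy a rectangle of img_b into a copy of img_a."""
    height, width = len(img_a), len(img_a[0])
    top, bottom, left, right = cut_box(height, width, lam, rng)
    mixed = [row[:] for row in img_a]  # leave the inputs untouched
    for y in range(top, bottom):
        mixed[y][left:right] = img_b[y][left:right]
    # The box may be clipped at the border, so recompute the label weight
    # from the area that was actually pasted.
    pasted = (bottom - top) * (right - left)
    lam_adj = 1.0 - pasted / (height * width)
    return mixed, soft_target(label_a, label_b, num_classes, lam_adj)

# Usage: mix two toy 8x8 "images" belonging to classes 1 and 3 out of 5.
rng = random.Random(0)
img_a = [[0.0] * 8 for _ in range(8)]
img_b = [[1.0] * 8 for _ in range(8)]
mixed, target = cutmix_pair(img_a, img_b, 1, 3, 5, lam=0.5, rng=rng)
```

Recomputing the label weight from the pasted area keeps the soft target consistent with what the model actually sees when the box is clipped at an image edge.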

As it stands, the tiny ViTs cap at 41% on CIFAR-100 with no augmentations. I've been running all the trains without a single special effect and only minimal normalization.

Let's see how the upcoming trains fare.

pixie_base_128d_patch4_128h

Pixie base has 10 layers: 5 geometric attention layers and 5 traditional multi-head attention layers. Let's see how the mix-n-cut fares with the earlier ones first, then we'll run the base.
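As a quick sanity check on sizes, the sketch below works out how many patch tokens such a model would see. It assumes that `patch4` in the model name means a 4x4 patch size, and it relies on CIFAR-100 images being 32x32 RGB. The helper name `patch_grid` is hypothetical and is not part of this repo.

```python
def patch_grid(image_size, patch_size, channels=3):
    """Return (number of patches, raw values per patch) for a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side, patch_size * patch_size * channels

# CIFAR-100 images are 32x32 RGB; patch size 4 is assumed from the model name.
num_patches, patch_dim = patch_grid(32, 4)
print(num_patches, patch_dim)  # 64 48
```

Under these assumptions each image becomes 64 tokens of 48 raw values each, before any projection to the model width.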

The smaller ones seem to behave better using the geometric attention at 256 expert heads, which is odd to me but whatever works. They don't get much bigger with more experts, so I'll just try a tiny one with a ton of heads first.

Pentachoron Geometric Feature Extraction

Pentachora ViTs are essentially micro-sized feature extractors that provide substantial accuracy for their small size. The more experiments I run, the smaller they become. The final goal is a full CLIP ViT that can house the entirety of LAION-400M at a fraction of the size and compute of OpenAI's CLIP ViT line. After this point I'll be confident the math is lined up well enough to train the true flagship - Beatrix.

Useful classification and feature extraction have been a non-trivial problem in computer science for a long time.

This repo will house the various ViT experiments that I frankenstein together, with their weights and model code kept in the repo itself.

As I am an independent researcher, my resources are limited and I don't have the backing of any donors, so there will be time gaps unless some hardware is sliced off for me.

Many of my repos purposely omit certain elements: material for papers in writing, my thesis arguments, my statements about certain universal elements, and a multitude of other ramblings. I don't plan to release those key details in full phonebook fashion for just ANY PERSON to read.

Let me use your high-end hardware. I deliver - success or failure, but I will deliver.

I will not rattle a tin cup for you. Work out a deal with me and you get the weights - I get the classes developed for further use, meant for public release.

Let me know if you're willing to work with me. I'll gladly share the code, the process, the progress, and the accumulated warchest of potential that this system entails if you provide me a gateway to some hardware that I can use.

Until then, one experiment at a time.