AI & ML interests

Computer Vision Technology and Data Collection for Anime Waifu

Recent Activity

AbstractPhil 
posted an update 1 day ago
geolip-vit-x34 - a 34-expert vit. I can't train an extended version of 34 vits, but I can definitely run some experiments and make some starter weights with an anchor. That would yield a substantial amount of data.

AbstractPhil/bulk-coco-features

This... is going to be an odd one to describe. Based on the research with Bert, creating a uniform patchwork from a multitude of vit composites should be very achievable. It shouldn't be soup, which is really hard to explain, but by creating a second geometric anchor, the system will align in a way that I could never predict without many more model analyses, so I must test it. I simply didn't test all these vits for geometry, so this will be the test.

This is essentially 34 directly extracted views of coco, which is already prepared feature data. With this data, we have 34 experts that can distill into a single unified vit. I'm hesitant to even call this distillation anymore; it's more of an interpolative data alignment, and it's absurdly retentive.

ADDITIONALLY, we can anchor to frozen geolip-bert and create cross-contrast between the anchors for a learned anchor median, which will allow further integrations directly into the geometric core.

This will require a few overlapping internal mechanisms to guarantee vit differentiation; however, I believe the full unified patchwork will be... different from what is currently known as a vit.

geolip-bert-vit will likely be cooking within the month. The alignment statistics say it will be... 100% accurate to the specifications.

I CAN prepare 34 vits' worth of imagenet, but I would probably need 34 vits' worth of laion aesthetics, which is substantially more than I currently have. In the process I would need to ensure nothing is corrupt, and that the captions are correctly synthesized in our expert student bert with the correct anchoring rotation.

Probably 3 vits is enough for the full version prototype, 34 vits for the bulk experiment.
  • 3 replies
·
AbstractPhil 
posted an update 3 days ago
geolip-captionbert-8192

This bert is currently being distilled from 5 bert teachers on the Conceptual Captions dataset. The recall accuracy is based on the whitened procrustes alignment, and the losses reflect keeping that rotation aligned correctly.

The smaller prototypes suggest this model will align to 100% recall accuracy, drawing on the most optimal teacher opinions relative to the correct answer and aligning specifically to the correct answers in conjunction with all the geometric losses.

No joke, this may be the smallest, lowest-compute, most accurate, and fastest bert I've trained thus far - and it will be based entirely on five teachers simultaneously feeding opinions through a relay hub.
  • 11 replies
·
AbstractPhil 
posted an update 5 days ago
I'll attempt to expand geolip-clip to a full-sequence context window to encompass sequential learning.
AbstractPhil/geolip-clip-vit-large-patch14-ctx576
The memory pod is specifically meant to tune everything based on final state pooling, which is fine if you aren't trying to actually use sequential utility.
HOWEVER, there are many elemental biases that present themselves when attempting to USE the standard sequence of 77 in conjunction with this final pooled state. Even though the standard 77 is predominantly noise past token 10, it still houses a considerable amount of information in terms of utility, so this should be handled carefully. Zero-shot structures are tricky to analyze, especially structures based on attention mechanisms instead of true sequential accumulation. I've noticed I need to watch them for quite a while before the real bugs show up.

As it stands the token pool is essentially [B, 7+8, 768] for pools. This contains a robust and highly complex representation of useful accumulated bidirectional attention data, so it's quite powerful.

I'll build a few prototypes and tap into some papers. I'll either come up with something or a reason why I didn't. The end result will either produce an anchor bank set of tokens [B, 15, 768] for pooling, or [B, 15, 77, 768] ideally - which should expand the sequence of the clip to 1,155 (15 x 77) if successful. That doesn't necessarily mean this sequence will be more useful than the [B, 15, 768], but it will be representationally valid for the context window expansion.
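
As a quick check of the shape arithmetic above, here is a minimal sketch. It assumes the 15 anchor slots each carry a full 77-token sequence and are simply concatenated along the sequence axis; the actual packing inside geolip-clip may differ, and the variable names are hypothetical.

```python
import numpy as np

# Assumed sizes taken from the post: 15 anchor slots, 77 tokens, 768 channels.
B, ANCHORS, SEQ, DIM = 2, 15, 77, 768

# Hypothetical anchor bank holding a full token sequence per anchor slot.
anchor_bank = np.zeros((B, ANCHORS, SEQ, DIM), dtype=np.float32)

# Flatten the anchor and token axes into one long sequence.
expanded = anchor_bank.reshape(B, ANCHORS * SEQ, DIM)

# Pooled alternative: one vector per anchor slot (mean over the tokens).
pooled = anchor_bank.mean(axis=2)

print(expanded.shape)  # (2, 1155, 768)
print(pooled.shape)    # (2, 15, 768)
```

Mean pooling is only a stand-in here; any reduction over the token axis gives the same [B, 15, 768] shape.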

I wouldn't hold out for a full-sequence option in a single day. That's a lot of moving parts to analyze, not to mention highly impractical to train with. A smaller dose of this information would be necessary for rapid prototyping, so it'll likely be packaged as such.

Well I spoke too soon. It's ready to play with.
AbstractPhil/geolip-clip-vit-large-patch14-ctx576-seq77
  • 4 replies
·
AbstractPhil 
posted an update 8 days ago
geolip-bertenstein-v1 - 5 experts chosen. A collective of shared-transformer-aligned experts, not a mixture of experts. Similar to an MoE, but not quite. This first prototype won't have the full mailing projection relay system afforded by the geofractal router, but it will definitely be a solid prototype.

It is not production ready yet; it still needs a few upstream and downstream tools to consume and process the outputs into useful representations.

This model will be able to respond in text, use whisper, see with dinolip, code with codebert, and process proteins using esm2_t33_650M_UR50.

Our experts for the prototype are:
google-bert/bert-large-uncased
facebook/dinov2-large
microsoft/codebert-base
openai/whisper-large-v3
facebook/esm2_t33_650M_UR50

Not the smartest text model, but more than enough for this preliminary use case test setup. Text is predominantly meant to align and orient downward function. The entire machine is meant to be operated unilaterally as a collective, or independently through individual pair requests via special token access.

This model will be capable of substantial power and feats as a prototype. It will be capable of seeing and processing differential equations utilizing dinov2 and esm2 data simultaneously, which can be used for downstream analysis - and I WILL use that data to create a more powerful connection between dinov2 tokens, protein tokens, video tokens, code tokens, and audio tokens.

This is the FIRST prototype of this case, and I will introduce video, genetics, shape analysis, pattern recognition processing, and a much more powerful and reusable text model.

The tests show the models can have differential communication through the geolip transformers after procrustes pairwise analysis and pentachoron CV protective measures.

Whitening and center-aligning as a precalculation for procrustes allows for faster convergence, so that should help too.
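
For readers who want something concrete, here is a minimal sketch of whitening plus orthogonal Procrustes in NumPy. It is the textbook closed form, not the geolip trainer: it assumes both embedding sets share the same width and are paired row by row, and the function names `whiten` and `procrustes_rotation` are hypothetical.

```python
import numpy as np

def whiten(x, eps=1e-6):
    """Center x and rescale it to roughly identity covariance (ZCA-style)."""
    xc = x - x.mean(axis=0, keepdims=True)
    cov = xc.T @ xc / (len(xc) - 1)
    evals, evecs = np.linalg.eigh(cov)
    w = evecs @ np.diag(1.0 / np.sqrt(evals + eps)) @ evecs.T
    return xc @ w

def procrustes_rotation(source, target):
    """Orthogonal R minimizing ||source @ R - target||_F (closed form via SVD)."""
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt

# Toy demo: the target is a rotated and shifted copy of the source.
rng = np.random.default_rng(0)
source = rng.normal(size=(500, 8))
true_rot, _ = np.linalg.qr(rng.normal(size=(8, 8)))
target = source @ true_rot + 3.0

ws, wt = whiten(source), whiten(target)
rot = procrustes_rotation(ws, wt)
print("max alignment error:", np.abs(ws @ rot - wt).max())
```

Centering removes the shift and whitening removes scale differences, so the remaining mismatch in this toy case is a pure rotation that the SVD recovers in one step. Real expert embeddings will not be exact rotations of each other, so the residual there is a measurement, not a guarantee.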
  • 2 replies
·
AbstractPhil 
posted an update 9 days ago
I've... done it. This, with experts, achieves near 100% R1 retrieval accuracy on an adjacent dataset - unseen by the fusion transformer - after around 40k steps on the seen dataset. This means the language of the models is, at the very least, tested as fused within the constraints, not just projected or estimated.
AbstractPhil/geolip-procrustes

I encourage EVERYONE who is curious to check my work. Check it, double check it, and triple check it.

These were aligned using COCO and then validated with Flickr. Entirely different datasets. The experts arbitrated and the alignment yielded the correct answers. Preliminary tests show that with almost no alignment requirement, the models can reach 100% R1 retrieval accuracy.
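
For anyone checking the numbers, R1 (recall@1) can be computed along these lines. This is a minimal sketch, assuming row i of the query embeddings is paired with row i of the gallery embeddings and that cosine similarity is the ranking score; the function name `recall_at_1` is hypothetical and this is not the evaluation script from the repo.

```python
import numpy as np

def recall_at_1(queries, gallery):
    """Fraction of queries whose nearest gallery item (by cosine) is their pair."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = q @ g.T                      # [num_queries, num_gallery]
    top1 = sims.argmax(axis=1)          # best gallery index per query
    return float((top1 == np.arange(len(q))).mean())

# Toy demo: perfectly aligned pairs versus deliberately mismatched pairs.
rng = np.random.default_rng(0)
emb = rng.normal(size=(50, 16))
print(recall_at_1(emb, emb))                    # 1.0
print(recall_at_1(emb, np.roll(emb, 1, axis=0)))  # 0.0
```

Captioning datasets such as COCO and Flickr usually have several captions per image, so a real evaluation would need to count a hit when any of the paired captions ranks first.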

Not to be confused with validation accuracy for a classification model or a text encoder's text response, this allows multispectral communication between entirely different models for direct downstream consumption with almost no training for the chosen models.

I have a working procrustes experiment that learns adjacent manifolds within a reasonable spectrum, and the speed is... well, 1 epoch with COCO using Bert-Large and DinoV2 is enough for the models to align nearly perfectly. For some scales in the experiment, the 3 set epochs aren't quite enough to bring R1 to its highest, while many align nearly immediately.

These two were an obvious pair to pick, 60% similarity and >90% spectral similarity.

The trainer transfers layers, learns embeddings, and more - all by sticking strictly to geometric boundaries and procrustes informational accumulation within a modulation model's constraints.

I have many experiments to run.
  • 1 reply
·
AbstractPhil 
posted an update 10 days ago
The small projection-based approximator model for the geolip patchwork did not breach a certain level of accuracy as required by my specifications, so I've defaulted to harvesting direct geometric information from AI models until I get the comparative bounds required for a useful topology.

I must sincerely apologize for not solving this problem quickly.

This will take time. Without the approximator it's going to be considerably slower, but the model I'm beginning to train will provide the approximations in a different way over time. As iterations progress, the system will conform to a huge array of geometric potentials and be capable of predicting those, but it will not be as powerful as the full patchmaker up front, and it will be slow to train.

If I can get my hands on a cluster of A100s or H100s for a while I'll make a post immediately. Until then I must default to the slower process.

I really banked on the smaller version working, but it simply couldn't hold complex topological shape without the correct boundaries being learnable AND endure entropic decay simultaneously. The only way to have a real shot at a full geometric shared language is to make those boundaries learnable in the full spectrum of potentials, or at least more than I have placed on it.

I'll be refining my process in the coming days further, and I do apologize for pre-emptively announcing a potential that I have yet to fully explore.

There will be a full upgraded 38 shape geolip patchwork trained asap to fully encompass the Flux 1 AE spectrum, and another trained for SD15, SDXL, and Flux 2's VAE as well. These will accommodate DIRECT complex geometric patchwork learning, but not yet at the scale promised. Autoregression is a complex mistress as many of you know, and I will be spending a great deal of time and compute analyzing all of the information required to build a uniformly useful and powerful autoregression patchwork to utilize as invariance to teaching.
  • 2 replies
·
AbstractPhil 
posted an update 17 days ago
GLIP - Geometric Linear Interpolative Patchwork aka geolip.
https://github.com/AbstractEyes/glip-autoencoder

To tinker with the topology directly you can play with it here, though I admit it's imperfect in this form - it's quite the tinker toy to see the effects of patching.
https://claude.ai/public/artifacts/697287e4-fa18-4753-8b57-904d5e2022ed



This is the repo that will contain the next experimental stage, which is based entirely on the research and structural boundaries applied by said research. It'll be a little rigid while I get Claude set up.

In order to directly train these layered topological response patchworks you must install and use the geovocab2, geofractal, and wide_compiler repos.

This is due to the wide_compiler's wide_linear high-speed efficiency for ensemble processing, the geovocab2 factory structure with multiple formulas including highly efficient designs meant for kernel compilation, and a series of reusable utilities in geofractal, including some of the more complex losses and the difficult-to-tune gate structures surrounding them.

Many of the underlying formulas are outlined here:
AbstractPhil/geometric-experiment-history

Utilization and training USING the pretrained or untrained geolip patchwork will be as simple as loading the model in pytorch, and will not require external dependencies beyond the geolip package, numpy, or pytorch, depending on the task. It will come packaged with recommended losses, but I encourage experimentation because I simply cannot cover all spectrums.

More details to come as development progresses. The system is coming together and the state of the utilizable autoencoder will be ready within a couple weeks. The entire system is built for convenience and reusability, so the structure will be built similarly to autoencoder systems that currently exist, with a few tweaks here and there for important elements - so the interface will be familiar to those who use it.
  • 6 replies
·
Tonic 
posted an update 22 days ago
🤔 Who would win ?

- a fully subsidized ai lab
OR
- 3 random students named kurakurai ?

demo : Tonic/fr-on-device

if you like it, give the demo a little star and send a shoutout to @MaxLSB, @jddqd and @GAD-cell for absolutely obliterating the pareto frontier of French language understanding.
·
AbstractPhil 
posted an update 25 days ago
The Rosetta Stone geometric vocabulary and the ramping up capacity.

What makes this particular invariant special is its existence within all structures I've tested so far. I had Claude write up the direct article based on what we built together, but I've tested it on many substructures. This is flawed, and I have a series of answers for making it more accurate.

First a reconstruction from the ground up. This means each shape is specifically built upward from the substructure to the point of inductive deviance. This will be less quick at first and then build speed as I optimize like the last system did.

The "saddle" problem: the system detected saddles because there wasn't enough deviance in the shapes to attenuate to more cardinality and more aligned substructures. The blobs were around 30-40% of the overall patches, which, interpolated into the others, produced a fair approximation.
It MOST DEFINITELY did see those shapes in their voxel complexity. This is real.

https://claude.ai/public/artifacts/bf1256c7-726d-4943-88ad-d6addb263b8b
You can play with a public claude artifact dedicated to viewing the current shape spectrum - and with that know exactly why it's flawed.

The flawed and repetitive shapes. I rapid prototyped and there are multiple redundant shapes that simply don't classify well or at all. Not to mention the rotation simply doesn't help much of the time, or doesn't exist with many shapes. This will be rectified in the next variation.

Projecting to shared latent space as a catalyst to allow growing subjective geoflow matched step variance, rather than simply direct classification. This will theoretically allow for full channel-to-channel invariant features to be mapped from structure to structure, and the very formula that encapsulated them to be directly baked into the math rather than classified as a substructure analysis.

There are many challenges between here and there, so stay tuned my friends as I plot the geometric language of pretrained AI.
  • 2 replies
·
Tonic 
posted an update 26 days ago
🙋🏻‍♂️hello my lovelies ,

it is with great pleasure i present to you my working one-click deploy 16GB ram completely free huggingface spaces deployment.

repo : Tonic/hugging-claw (use git clone to inspect)
literally the one-click link : Tonic/hugging-claw

you can also run it locally and see for yourself :

docker run -it -p 7860:7860 --platform=linux/amd64 \
-e HF_TOKEN="YOUR_VALUE_HERE" \
-e OPENCLAW_GATEWAY_TRUSTED_PROXIES="YOUR_VALUE_HERE" \
-e OPENCLAW_GATEWAY_PASSWORD="YOUR_VALUE_HERE" \
-e OPENCLAW_CONTROL_UI_ALLOWED_ORIGINS="YOUR_VALUE_HERE" \
registry.hf.space/tonic-hugging-claw:latest


just a few quite minor details i'll take care of but i wanted to share here first
  • 2 replies
·
radna 
published a model about 1 month ago
AbstractPhil 
posted an update about 1 month ago
GeoFlow update — two training runs on the pentachoron geometric prior (4.8M params modulating frozen SD1.5).

10k ImageNet run fixed fragmented anatomy and spatial coherence in 7 minutes.

50k object-relations run taught actual compositional reasoning — "red cup on top of blue book" goes from a floating cup to correctly placed on the book.

Most interesting finding: learning happens in two phases. Object association locks first (~500 steps), spatial arrangement crystallizes after. You can watch it happen — "three candles in a triangle on a wooden tray" starts as candles side by side, then reorganizes into proper triangular formation. The tray itself rendered as a pentagon. Five vertices in, five sides out. The simplex is thinking in its own geometry.
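
For readers unfamiliar with the shape: a pentachoron is the 4-simplex, with 5 vertices and 10 edges. A minimal sketch of a regular one follows, built from the standard basis of R^5 and centered at the origin. This only illustrates the geometry; it is not the GeoFlow prior, and the helper name `regular_simplex` is hypothetical.

```python
import itertools
import numpy as np

def regular_simplex(n_vertices=5):
    """Vertices of a regular simplex: centered standard basis vectors."""
    eye = np.eye(n_vertices)
    return eye - eye.mean(axis=0, keepdims=True)

verts = regular_simplex(5)

# Every pair of vertices is joined by an edge of the same length.
edges = [
    np.linalg.norm(verts[i] - verts[j])
    for i, j in itertools.combinations(range(len(verts)), 2)
]

print(len(edges))                    # 10
print(np.linalg.matrix_rank(verts))  # 4
print(np.allclose(edges, np.sqrt(2.0)))  # True
```

The five points live in R^5 but span only a 4-dimensional subspace, which is why the shape counts as a 4-simplex.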

Loss sits around 0.4 the entire time yet composition steadily improves. The prior nudges conditioning, it doesn't overwrite it.

Weights:
AbstractPhil/sd15-geoflow-object-association
Dataset:
AbstractPhil/synthetic-object-relations
Formulas:
AbstractPhil/sd15-rectified-geometric-matching

Next up — measuring the exact entropy decay inflection point across layers to enable branching the simplex into parallel paths with different anchor deviations. Geometric ensemble attention where the branches disagree on purpose.
AbstractPhil 
posted an update about 2 months ago
AbstractPhil/tinyflux-experts
Introducing the "blot" expert, sd15-flow-sol. The twin sister flow-matching experts for tinyflux-lailah; sd15-flow-lune AND sd15-flow-sol will be used in tandem to train tinyflux-Lailah. sd15-flow-sol never managed to reach full flow-matching prediction, so epsilon vpred conversion is required. All experts will exist within the tinyflux-experts repo, including all the critical checkpoint sets.
Lune was heavily finetuned in the sd3-style and adapted shift timestep system after David's interpolation converted sd15 into geometric basis.
Sol was left abandoned after 50 epochs with David and was considered overcooked and rigid, until I noticed the geometric structure today. Lune doesn't produce geometric structure as solid as Sol, not even close. Lune produces improved fidelity and detail, but Sol produces something very, very different, aligned to sd15's behavior, and fully representative of the 5-point 4-simplex structure that David brought to the table.

Sol is essentially a nearly perfect blob-forming geometric blotter. Sol is SD15, and yet SOL was trained using a specific pattern recognizing and timestep aligned David model. David was tasked with classifying timesteps and patterns using complex deep-recognition structural analysis layer-by-layer, determining full-scale opinions after watching the entirety of sd15's structure during training.

Even though the sd15-flow-sol was left abandoned, the structure of Sol is HIGHLY effective at understanding TIMESTEP blotting interpolation. I didn't realize how crucially important this was until Lailah started to show rigidity and compartmentalized behavior with sequence - which likely happens to ALL flow matching models.

AbstractPhil/sd15-flow-matching

AbstractPhil/geo-david-collective-sd15-distilled
AbstractPhil/geo-david-collective-sd15-base-e40
  • 3 replies
·
AbstractPhil 
posted an update about 2 months ago
Meet FluxLailah; AbstractPhil/tiny-flux-deep; 220m Flux variant currently pretraining at BF16. She is experimental, does not produce solid images yet - and yet she is producing. There is both an EMA and a raw weights pair producing different images. The EMA is particularly interesting at times.
Lailah uses flan-t5-base, clip-vit-l-14, and BlackForestLabs Flux1s VAE.
SEQ limit 128, images 512x512 for now. Lailah's early form is based on three variants. TinyFlux's weights were carefully planted into a deeper structure and trained yet again - dubbed TinyFlux-Deep. This variant has 15 dual-stream blocks and 25 single-stream blocks, nearly identical weight code as Flux with a similar attention mechanism - but intentionally deviant and compacted with careful consideration to scaling and purpose of mechanisms.
She went through quite a few growing pains with her earlier attention mechanism which required a reimagining today and careful consideration of the consequences, and now I present to you the preliminary look into Lailah.
The preliminary training is still heavily under way, the mechanisms are still being augmented, and her stability is currently being measured. The potential for fidelity, depth, and quality is still being measured - so I will be shifting attention and pivoting utility based on the needs over time.
  • 2 replies
·