---
language: en
license: mit
library_name: pytorch
tags:
- vae
- predictive-coding
- neuroscience
- gabor-splatting
- computer-vision
- biology
datasets:
- nielsr/flowers-102
---

# Neuro Splat: an artificial visual cortex (Gabor wave-packet VAE)

**PerceptionLab / Antti Luode, with Claude (Opus 4.8), in dialogue with Gemini. Helsinki, June 2026.**

> Do not hype. Do not lie. Just show.

*An image is not a million pixels. It is a smooth map and a scatter of small bright wave-packets dropped where the detail lives.*

---

## What this model is

This repo holds the weights (`model.pt`) for a **VAE whose decoder is a differentiable Gabor wave-packet splatter**. It is not a pixel-generating image model; it is a biologically-motivated representation experiment: an architecture that reproduces the *shape* of how V1 is thought to code images (localized Gabor atoms), used to test predictive coding and phase locking.

- **Encoder** (CNN): image → a latent "concept".
- **Decoder** (MLP): latent → the parameters of 512 localized Gabor atoms: position, scale, orientation, frequency, and a complex `(a, b)` coefficient per colour channel.
- **Renderer**: splats those wave-packets onto the canvas; the image is their sum.

Trained on **Oxford Flowers-102** at **128×128 with 512 packets**.

## The blur is the prior; reality is the sharpness

Asked to generate a flower from a random latent, this model returns a soft, watercolour blob, and that is the correct, expected behaviour, not a failure. A generative prior with no input has to average over everything it cannot disambiguate, so it returns the low-frequency gist.

To see it work, you give it eyes. Hook it to a webcam and the blurry top-down prior collides with raw bottom-up reality; the live frame supplies the phase the prior could not guess, the packets lock to it, and the gist sharpens. That collision (top-down prediction meets bottom-up residual) is the predictive-coding loop, made live.
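The correction step described above can be illustrated on a toy problem. What follows is a minimal sketch, not the repo's code: it uses a single 1-D packet with fixed position, scale and frequency, plain Python instead of the differentiable PyTorch renderer, and hypothetical helper names (`packet_basis`, `render`, `refine`). The "prior" is a weak guess with the wrong phase, the "live frame" is a synthetic target, and gradient descent on the residual updates only the `(a, b)` coefficients.

```python
import math

def packet_basis(n, center, sigma, freq):
    """Cosine and sine Gabor rows for a 1-D canvas of n samples (toy stand-in)."""
    cos_row, sin_row = [], []
    for x in range(n):
        d = x - center
        envelope = math.exp(-d * d / (2.0 * sigma * sigma))  # Gaussian window
        phase = 2.0 * math.pi * freq * d                      # carrier phase
        cos_row.append(envelope * math.cos(phase))
        sin_row.append(envelope * math.sin(phase))
    return cos_row, sin_row

def render(a, b, cos_row, sin_row):
    """The packet is linear in (a, b), so no raw angle ever has to wrap."""
    return [a * c + b * s for c, s in zip(cos_row, sin_row)]

def mse(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

def refine(a, b, target, cos_row, sin_row, lr=1.0, steps=200):
    """Gradient descent on the MSE between the rendered packet and the target."""
    n = len(target)
    for _ in range(steps):
        pred = render(a, b, cos_row, sin_row)
        residual = [p - t for p, t in zip(pred, target)]  # prediction minus frame
        grad_a = 2.0 / n * sum(r * c for r, c in zip(residual, cos_row))
        grad_b = 2.0 / n * sum(r * s for r, s in zip(residual, sin_row))
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b

# Assumed toy geometry: 64 samples, packet centred at 32, 0.1 cycles per sample.
cos_row, sin_row = packet_basis(n=64, center=32.0, sigma=6.0, freq=0.1)

# "Observed frame": amplitude 1 with a carrier phase offset of 2 rad.
true_a, true_b = math.cos(2.0), math.sin(2.0)
target = render(true_a, true_b, cos_row, sin_row)

# "Prior": a weak guess with the wrong phase.
a0, b0 = 0.2, 0.0
a1, b1 = refine(a0, b0, target, cos_row, sin_row)

loss_before = mse(render(a0, b0, cos_row, sin_row), target)
loss_after = mse(render(a1, b1, cos_row, sin_row), target)

# Amplitude and phase are read off (a, b) only after the fit.
amplitude, phase = math.hypot(a1, b1), math.atan2(b1, a1)
```

Because `a*cos(φ) + b*sin(φ)` equals `A*cos(φ - φ0)` with `A = hypot(a, b)` and `φ0 = atan2(b, a)`, the optimizer moves through a smooth two-dimensional space and never crosses a ±π discontinuity. In this toy the loss is quadratic in `(a, b)`; once position, scale, orientation and frequency are also free, as in the full model, the fit becomes non-convex.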
## How to load it

You need the architecture from the [ArtificialCortex repo](https://github.com/anttiluode/ArtificialCortex) (`splat_generator.py` defines `SplatVAE`).

```python
import torch
from huggingface_hub import hf_hub_download
from splat_generator import SplatVAE  # from the ArtificialCortex repo (the_splat/)

ckpt = hf_hub_download("Aluode/Neuro_Splat", "model.pt")
model = SplatVAE(image_size=128, latent=128, num_packets=512, chunk=64)
model.load_state_dict(torch.load(ckpt, map_location="cpu"))
model.eval()
```

## How to run it

```bash
pip install torch torchvision numpy opencv-python huggingface_hub

# perception: watch the packets phase-lock to a live webcam (panel 3)
python live_cortex_perception.py --model_path model.pt --image_size 128 --num_packets 512

# generation: the blurry priors from random latents (expected to be soft)
python splat_generator.py --mode sample --resume model.pt --image_size 128 --num_packets 512
```

`--image_size 128 --num_packets 512` must match these weights, or the `state_dict` will not load.

## The honest ledger

**What this model shows:**

- an image can be represented and learned as a sparse sum of localized Gabor wave-packets (the V1 / Olshausen–Field model, made trainable);
- the phase-wrapping problem is dodged by outputting complex `(a, b)` coefficients instead of raw angles;
- the correction half of predictive coding works live: a blurry top-down prior is sharpened by gradient descent against a bottom-up residual (the webcam frame).

**What it is not, and what it lacks:**

- not a photorealistic generator. It is a VAE with an amortized latent: sharp reconstructions, blurry-but-structured samples.
  Its interest is biological representation, not benchmarks against diffusion models;
- the **prior is flower-only**: in-domain (a flower) the top-down gist genuinely helps; off-domain (a room, a face) it is a wrong guess that the live-frame fit overwrites, so there the loop is doing direct re-fitting, not prediction;
- the **"floaters"**: on out-of-domain input, with an aggressive learning rate, the optimizer orphans some packets into tiny ultra-bright dots instead of coordinating them into edges. They *resemble* phosphenes, but the cause is an MSE optimizer with **no lateral inhibition between packets**, not the neural disinhibition that makes real phosphenes: a rhyme, not the mechanism. It points at the missing inhibitory coordination (the `grown_gates` line), not at a recreated brain;
- it is 2D; relative units throughout; one trained model.

**The bet (untouched):** that the phase-locked frame is a *felt* sharpening rather than a computed one. The model locates the mechanism in code that can fail; it does not touch the hard problem.

## Lineage

A sub-organ of [`the_artificial_cortex`](https://github.com/anttiluode/ArtificialCortex) (`the_splat/`). The generator is the trained top-down prior; the live cortex is reality forcing that prior into focus. MIT.

*The generator dreamed a blurry flower because it had nothing to look at. Open its eyes and reality supplies the phase the dream could not; the packets lock, the gist sharpens, and what they cannot yet coordinate, they see as stars.*