geolip-aleph-void
A spherical autoencoder whose latent bottleneck is the aleph signed-projective
address: each row of the spherical matrix M is addressed to a learned
projective codebook, and the decoder reconstructs from the addressed rows. The
codebook address carries reconstruction β it is the model's load-bearing
operation, not a diagnostic read.
This repository hosts trained AlephModel checkpoints. The architecture and
loading code live in geolip-svae.
Usage
pip install "git+https://github.com/AbstractEyes/geolip-svae.git"
from geolip_svae import load_model
# load a checkpoint hosted here (best.pt under a version folder)
model, cfg = load_model(hf_version="<version>", repo_id="AbstractPhil/geolip-aleph-void")
# ...or a local file
# model, cfg = load_model(checkpoint_path="aleph_soft_bt.pt")
out = model(images) # images: (B, 3, 64, 64)
recon = out["recon"] # reconstruction
addressed_M = out["svd"]["M_hat"] # spherical rows snapped to the codebook
codebook = model.codebook # learned (K, D) projective axes
axes2k = model.oriented_codebook() # the (2K, D) oriented half-axes
load_model rebuilds the exact AlephModel from the checkpoint config (no side
files needed) and returns it on the available device in eval mode.
Architecture
- Encoder β patches β a residual MLP β the spectral matrix
M, with rows normalized ontoS^(D-1). Byte-identical to thegeolip-svaePatchSVAE encoder. - Codebook β
nn.ParameterofKprojective axes inDdimensions (each axis is a line; sign is resolved by the address). Adds onlyKΒ·Dparams. - Aleph address (the bottleneck) β signed alignments
cos = M @ Aα΅, a softmax over the2Koriented half-axes[+A; -A], yielding the addressed rowMΜ = Ξ£_k sinh(u_k)Β·A_k / Ξ£_k cosh(u_k)withu = cos/Ο. This is the exact antipodal-softmax closed form, so the2Ktensor is never materialized. - Decoder β reconstructs the patch from
MΜ(tiedlinear by default), so the reconstruction gradient trains the codebook through the address.
Address modes: soft (differentiable mixture), hard (straight-through
discrete), none (MΜ = M, the recon-real autoencoder that gated the design).
Reference config (byte-trigram): V=32, D=4, patch_size=4, hidden=64, channels=3, K=64, address_tau=0.1, decode_mode='tied'.
Training data
- byte_trigram β WikiText-103 (
wikitext,wikitext-103-raw-v1) rendered as64Γ64Γ3byte-trigram images; the substrate that produced the SVAE byte batteries, so reconstruction is directly comparable (baseline MSE β 3.8e-7). - tiny_imagenet β
zh-plus/tiny-imagenetat64Γ64(cosine objective by default;cosine_msewhen amplitude matters).
Results and status
What is established on this lineage:
- The matrix
Mis reconstruction-real. Withaddress='none', a single linear map offMdrives byte-trigram reconstruction to cosine β 0.9997 β the gate that justified building the address model. - The codebook addresses sharply and is void-rich. Extracted codebooks from this recon-real regime address real tokens more decisively than the faux-embedding SVAE batteries (mean address margin β 0.967 vs β 0.929), and carry markedly more 2-dimensional topological structure (Ξ²β per axis β 0.56, ~7Γ the SVAE batteries).
The open question this model exists to answer β whether reconstruction survives
the address bottleneck (address='soft' vs the none gate) β is the active
experiment; per-checkpoint reconstruction and codebook-health metrics
(perplexity, address margin) are reported in each version's final_report.json.
Intended use and limitations
Research artifact for geometric representation learning and projective
codebook / spectral-autoencoder study. Not a production or general-purpose
generative model. Reconstruction targets the two training substrates above; the
codebook is small (K=64) and the address temperature is a tuned hyperparameter.
Pure-cosine training is scale-blind; codebook collapse is the relevant failure
mode and is monitored via perplexity (with an optional anti-collapse term).
Repository layout
<version>/
βββ checkpoints/best.pt # load via load_model(hf_version="<version>")
βββ final_report.json # config + reconstruction / codebook metrics
Links
- Code: https://github.com/AbstractEyes/geolip-svae
- Spectral VAE batteries: https://huggingface.co/AbstractPhil/geolip-SVAE