geolip-aleph-void

A spherical autoencoder whose latent bottleneck is the aleph signed-projective address: each row of the spherical matrix M is addressed to a learned projective codebook, and the decoder reconstructs from the addressed rows. The codebook address carries reconstruction β€” it is the model's load-bearing operation, not a diagnostic read.

This repository hosts trained AlephModel checkpoints. The architecture and loading code live in geolip-svae.

Usage

pip install "git+https://github.com/AbstractEyes/geolip-svae.git"
from geolip_svae import load_model

# load a checkpoint hosted here (best.pt under a version folder)
model, cfg = load_model(hf_version="<version>", repo_id="AbstractPhil/geolip-aleph-void")
# ...or a local file
# model, cfg = load_model(checkpoint_path="aleph_soft_bt.pt")

out = model(images)                  # images: (B, 3, 64, 64)
recon       = out["recon"]           # reconstruction
addressed_M = out["svd"]["M_hat"]    # spherical rows snapped to the codebook

codebook = model.codebook            # learned (K, D) projective axes
axes2k   = model.oriented_codebook() # the (2K, D) oriented half-axes

load_model rebuilds the exact AlephModel from the checkpoint config (no side files needed) and returns it on the available device in eval mode.

Architecture

  • Encoder β€” patches β†’ a residual MLP β†’ the spectral matrix M, with rows normalized onto S^(D-1). Byte-identical to the geolip-svae PatchSVAE encoder.
  • Codebook β€” nn.Parameter of K projective axes in D dimensions (each axis is a line; sign is resolved by the address). Adds only KΒ·D params.
  • Aleph address (the bottleneck) β€” signed alignments cos = M @ Aα΅€, a softmax over the 2K oriented half-axes [+A; -A], yielding the addressed row MΜ‚ = Ξ£_k sinh(u_k)Β·A_k / Ξ£_k cosh(u_k) with u = cos/Ο„. This is the exact antipodal-softmax closed form, so the 2K tensor is never materialized.
  • Decoder β€” reconstructs the patch from MΜ‚ (tied linear by default), so the reconstruction gradient trains the codebook through the address.

Address modes: soft (differentiable mixture), hard (straight-through discrete), none (MΜ‚ = M, the recon-real autoencoder that gated the design).

Reference config (byte-trigram): V=32, D=4, patch_size=4, hidden=64, channels=3, K=64, address_tau=0.1, decode_mode='tied'.

Training data

  • byte_trigram β€” WikiText-103 (wikitext, wikitext-103-raw-v1) rendered as 64Γ—64Γ—3 byte-trigram images; the substrate that produced the SVAE byte batteries, so reconstruction is directly comparable (baseline MSE β‰ˆ 3.8e-7).
  • tiny_imagenet β€” zh-plus/tiny-imagenet at 64Γ—64 (cosine objective by default; cosine_mse when amplitude matters).

Results and status

What is established on this lineage:

  • The matrix M is reconstruction-real. With address='none', a single linear map off M drives byte-trigram reconstruction to cosine β‰ˆ 0.9997 β€” the gate that justified building the address model.
  • The codebook addresses sharply and is void-rich. Extracted codebooks from this recon-real regime address real tokens more decisively than the faux-embedding SVAE batteries (mean address margin β‰ˆ 0.967 vs β‰ˆ 0.929), and carry markedly more 2-dimensional topological structure (Ξ²β‚‚ per axis β‰ˆ 0.56, ~7Γ— the SVAE batteries).

The open question this model exists to answer β€” whether reconstruction survives the address bottleneck (address='soft' vs the none gate) β€” is the active experiment; per-checkpoint reconstruction and codebook-health metrics (perplexity, address margin) are reported in each version's final_report.json.

Intended use and limitations

Research artifact for geometric representation learning and projective codebook / spectral-autoencoder study. Not a production or general-purpose generative model. Reconstruction targets the two training substrates above; the codebook is small (K=64) and the address temperature is a tuned hyperparameter. Pure-cosine training is scale-blind; codebook collapse is the relevant failure mode and is monitored via perplexity (with an optional anti-collapse term).

Repository layout

<version>/
β”œβ”€β”€ checkpoints/best.pt     # load via load_model(hf_version="<version>")
└── final_report.json       # config + reconstruction / codebook metrics

Links

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Datasets used to train AbstractPhil/geolip-aleph-void