AbstractPhil committed on
Commit
0614529
·
verified ·
1 Parent(s): 6901dfe

Update README.md

  1. README.md +121 -1
README.md CHANGED
@@ -1,3 +1,123 @@
1
  ---
2
- license: apache-2.0
3
  ---
1
  ---
2
+ library_name: geolip-svae
3
+ license: mit
4
+ pipeline_tag: feature-extraction
5
+ tags:
6
+ - pytorch
7
+ - autoencoder
8
+ - geometric-deep-learning
9
+ - spectral-autoencoder
10
+ - vector-quantization
11
+ - codebook
12
+ - geolip
13
+ datasets:
14
+ - wikitext
15
+ - zh-plus/tiny-imagenet
16
  ---
17
+
18
+ # geolip-aleph-void
19
+
20
+ A spherical autoencoder whose latent bottleneck **is** the *aleph signed-projective
21
+ address*: each row of the spherical matrix `M` is addressed to a **learned
22
+ projective codebook**, and the decoder reconstructs from the addressed rows. The
23
+ codebook address carries reconstruction: it is the model's load-bearing
24
+ operation, not a diagnostic read.
25
+
26
+ This repository hosts trained `AlephModel` checkpoints. The architecture and
27
+ loading code live in [`geolip-svae`](https://github.com/AbstractEyes/geolip-svae).
28
+
29
+ ## Usage
30
+
31
+ ```bash
32
+ pip install "git+https://github.com/AbstractEyes/geolip-svae.git"
33
+ ```
34
+
35
+ ```python
36
+ from geolip_svae import load_model
37
+
38
+ # load a checkpoint hosted here (best.pt under a version folder)
39
+ model, cfg = load_model(hf_version="<version>", repo_id="AbstractPhil/geolip-aleph-void")
40
+ # ...or a local file
41
+ # model, cfg = load_model(checkpoint_path="aleph_soft_bt.pt")
42
+
43
+ out = model(images) # images: (B, 3, 64, 64)
44
+ recon = out["recon"] # reconstruction
45
+ addressed_M = out["svd"]["M_hat"] # spherical rows snapped to the codebook
46
+
47
+ codebook = model.codebook # learned (K, D) projective axes
48
+ axes2k = model.oriented_codebook() # the (2K, D) oriented half-axes
49
+ ```
50
+
51
+ `load_model` rebuilds the exact `AlephModel` from the checkpoint config (no side
52
+ files needed) and returns it on the available device in eval mode.
53
+
54
+ ## Architecture
55
+
56
+ - **Encoder**: patches → a residual MLP → the spectral matrix `M`, with rows
57
+ normalized onto `S^(D-1)`. Byte-identical to the `geolip-svae` PatchSVAE encoder.
58
+ - **Codebook**: `nn.Parameter` of `K` projective axes in `D` dimensions
59
+ (each axis is a line; sign is resolved by the address). Adds only `K·D` parameters.
60
+ - **Aleph address (the bottleneck)**: signed alignments `cos = M @ Aᵀ`, a softmax
61
+ over the `2K` oriented half-axes `[+A; -A]`, yielding the addressed row
62
+ `M̂ = Σ_k sinh(u_k)·A_k / Σ_k cosh(u_k)` with `u = cos/τ`. This is the exact
63
+ antipodal-softmax closed form, so the `2K` tensor is never materialized.
64
+ - **Decoder**: reconstructs the patch from `M̂` (`tied` linear by default), so the
65
+ reconstruction gradient trains the codebook through the address.
66
+
67
+ Address modes: `soft` (differentiable mixture), `hard` (straight-through
68
+ discrete), `none` (`M̂ = M`, the recon-real autoencoder that gated the design).
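
The closed form in the address bullet can be checked numerically. What follows is a minimal NumPy sketch, not the `geolip-svae` implementation: the function names are hypothetical, the codebook rows are assumed to be unit-norm, and the `hard` branch shows only the forward snap (the straight-through gradient is omitted).

```python
import numpy as np

def unit_rows(x):
    """Normalize each row onto the unit sphere S^(D-1)."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def soft_address(M, A, tau=0.1):
    """Closed form: sum_k sinh(u_k) A_k / sum_k cosh(u_k), with u = cos / tau."""
    u = (M @ A.T) / tau                           # (N, K) signed alignments / temperature
    m = np.abs(u).max(axis=1, keepdims=True)      # shared shift, cancels in the ratio
    pos, neg = np.exp(u - m), np.exp(-u - m)      # rescaled e^{+u} and e^{-u}
    num = (pos - neg) @ A                         # proportional to sum_k sinh(u_k) A_k
    den = (pos + neg).sum(axis=1, keepdims=True)  # proportional to sum_k cosh(u_k)
    return num / den

def soft_address_explicit(M, A, tau=0.1):
    """Reference: softmax over the materialized 2K oriented half-axes [+A; -A]."""
    axes = np.concatenate([A, -A], axis=0)        # (2K, D)
    logits = (M @ axes.T) / tau
    logits = logits - logits.max(axis=1, keepdims=True)
    w = np.exp(logits)
    w = w / w.sum(axis=1, keepdims=True)
    return w @ axes

def hard_address(M, A):
    """Forward pass of a discrete snap: best-aligned axis, with its sign."""
    cos = M @ A.T
    k = np.abs(cos).argmax(axis=1)
    sign = np.sign(cos[np.arange(len(M)), k])
    return sign[:, None] * A[k]

rng = np.random.default_rng(0)
M = unit_rows(rng.normal(size=(32, 4)))   # V=32 rows, D=4 (reference config)
A = unit_rows(rng.normal(size=(64, 4)))   # K=64 axes (assumed unit-norm)

closed = soft_address(M, A)
explicit = soft_address_explicit(M, A)
snapped = hard_address(M, A)
max_gap = np.abs(closed - explicit).max()  # expected to be at float rounding level
```

The two soft variants agree because the `+A_k` and `-A_k` softmax weights differ only in the sign of `u_k`, so their difference collapses to `sinh` and their sum to `cosh`.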
69
+
70
+ Reference config (byte-trigram): `V=32, D=4, patch_size=4, hidden=64, channels=3,
71
+ K=64, address_tau=0.1, decode_mode='tied'`.
72
+
73
+ ## Training data
74
+
75
+ - **byte_trigram**: WikiText-103 (`wikitext`, `wikitext-103-raw-v1`) rendered as
76
+ `64×64×3` byte-trigram images; the substrate that produced the SVAE byte
77
+ batteries, so reconstruction is directly comparable (baseline MSE ≈ 3.8e-7).
78
+ - **tiny_imagenet**: `zh-plus/tiny-imagenet` at `64×64` (cosine objective by
79
+ default; `cosine_mse` when amplitude matters).
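
To illustrate why the objective matters when amplitude does, here is a hedged NumPy sketch. The loss definitions are assumptions (a per-image `1 - cosine` term, and `cosine_mse` taken as that term plus a weighted MSE); the actual `geolip-svae` losses may be defined differently.

```python
import numpy as np

def cosine_loss(recon, target, eps=1e-8):
    """Mean (1 - cosine similarity) per image; ignores overall scale."""
    r = recon.reshape(len(recon), -1)
    t = target.reshape(len(target), -1)
    norms = np.linalg.norm(r, axis=1) * np.linalg.norm(t, axis=1) + eps
    cos = (r * t).sum(axis=1) / norms
    return float((1.0 - cos).mean())

def cosine_mse_loss(recon, target, mse_weight=1.0):
    """Assumed form of `cosine_mse`: cosine term plus weighted MSE."""
    mse = float(((recon - target) ** 2).mean())
    return cosine_loss(recon, target) + mse_weight * mse

rng = np.random.default_rng(0)
target = rng.normal(size=(8, 3, 64, 64))               # stand-in image batch
recon = target + 0.01 * rng.normal(size=target.shape)  # near-perfect reconstruction
scaled = 5.0 * recon                                   # same direction, wrong amplitude

cos_same = cosine_loss(recon, target)
cos_scaled = cosine_loss(scaled, target)      # unchanged: cosine cannot see the 5x
mix_same = cosine_mse_loss(recon, target)
mix_scaled = cosine_mse_loss(scaled, target)  # MSE term penalizes the amplitude error
```

Under these assumptions the cosine term is identical for `recon` and `scaled`, while the combined loss separates them.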
80
+
81
+ ## Results and status
82
+
83
+ What is established on this lineage:
84
+
85
+ - **The matrix `M` is reconstruction-real.** With `address='none'`, a single
86
+ linear map off `M` drives byte-trigram reconstruction to cosine ≈ 0.9997, the
87
+ gate that justified building the address model.
88
+ - **The codebook addresses sharply and is void-rich.** Extracted codebooks from
89
+ this recon-real regime address real tokens more decisively than the
90
+ faux-embedding SVAE batteries (mean address margin ≈ 0.967 vs ≈ 0.929), and
91
+ carry markedly more 2-dimensional topological structure (β₂ per axis ≈ 0.56,
92
+ ~7× the SVAE batteries).
93
+
94
+ The open question this model exists to answer, **whether reconstruction survives
95
+ the address bottleneck** (`address='soft'` vs the `none` gate), is the active
96
+ experiment; per-checkpoint reconstruction and codebook-health metrics
97
+ (perplexity, address margin) are reported in each version's `final_report.json`.
98
+
99
+ ## Intended use and limitations
100
+
101
+ Research artifact for geometric representation learning and projective
102
+ codebook / spectral-autoencoder study. Not a production or general-purpose
103
+ generative model. Reconstruction targets the two training substrates above; the
104
+ codebook is small (`K=64`) and the address temperature is a tuned hyperparameter.
105
+ Pure-cosine training is scale-blind; codebook collapse is the relevant failure
106
+ mode and is monitored via perplexity (with an optional anti-collapse term).
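
As a rough illustration of perplexity as a collapse monitor, the sketch below uses a common vector-quantization convention: the exponential of the entropy of the batch-averaged address weights over the `2K` half-axes. This is an assumption; the exact definition behind the values in `final_report.json` is not stated here, and the function name is hypothetical.

```python
import numpy as np

def unit_rows(x):
    """Normalize each row onto the unit sphere."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def address_perplexity(M, A, tau=0.1):
    """exp(entropy) of mean soft-address usage; ranges from 1 to 2K."""
    u = (M @ A.T) / tau
    logits = np.concatenate([u, -u], axis=1)      # (N, 2K) oriented half-axes
    logits = logits - logits.max(axis=1, keepdims=True)
    w = np.exp(logits)
    w = w / w.sum(axis=1, keepdims=True)          # per-row address weights
    usage = w.mean(axis=0)                        # average use of each half-axis
    entropy = -(usage * np.log(usage + 1e-12)).sum()
    return float(np.exp(entropy))

rng = np.random.default_rng(0)
A = unit_rows(rng.normal(size=(64, 4)))           # K=64, D=4 (assumed unit-norm)

spread_rows = unit_rows(rng.normal(size=(512, 4)))   # rows covering the sphere
stuck_rows = np.repeat(A[:1], 512, axis=0)           # every row sits on one axis

ppl_spread = address_perplexity(spread_rows, A)
ppl_stuck = address_perplexity(stuck_rows, A)     # lower: usage piles onto few axes
```

A perplexity that drifts toward the low end during training would suggest that only a few axes are carrying the address, which is the collapse case described above.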
107
+
108
+ ## Repository layout
109
+
110
+ ```
111
+ <version>/
112
+ ├── checkpoints/best.pt # load via load_model(hf_version="<version>")
113
+ └── final_report.json # config + reconstruction / codebook metrics
114
+ ```
115
+
116
+ ## Links
117
+
118
+ - Code: https://github.com/AbstractEyes/geolip-svae
119
+ - Spectral VAE batteries: https://huggingface.co/AbstractPhil/geolip-SVAE
120
+
121
+ ## License
122
+
123
+ Apache-2.0.