AbstractPhil committed on
Commit
873d9dc
·
verified ·
1 Parent(s): 1f973d6

Update README.md

Files changed (1)
  1. README.md +223 -3
README.md CHANGED
@@ -1,3 +1,223 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: mit
3
+ ---
4
+ # geolip-svae-implicit-solver-experiments
5
+
6
+ Empirical artifacts from the **projective-axis** discovery in trained
7
+ sphere-solver batteries (geolip-svae lineage, 2026-04-24 session).
8
+
9
+ ---
10
+
11
+ ## TL;DR
12
+
13
+ Every trained sphere-solver tested produces an M tensor whose rows,
14
+ when antipodal pairs are collapsed, form a uniformly distributed
15
+ codebook on **ℝP^(D-1)**. The "32 points on a sphere" reading is a
16
+ mislabel. The trained geometry is projective.
17
+
18
+ Verified across **19 trained models** spanning D=3, D=4, D=5.
19
+
20
+ This means the "polygonal omega" we were searching for already exists
21
+ as the projective reader applied to sphere-trained M. We don't need a
22
+ new normalizer or architecture. The trained sphere-solver IS the
23
+ polygonal codebook; we just read it through antipodal-collapse.
24
+
25
+ ---
26
+
27
+ ## The data
28
+
29
+ ### Cross-D pattern at V=32
30
+
31
+ | D | Pairs collapsed (% of rows paired) | Axes | Deviation from uniform ℝP^(D-1) | Effective rank |
32
+ |---|------------------------------------|------|----------------------------------|----------------|
33
+ | 3 | 10 (62.5%) | 22 | -0.004 | 2.96 / 3 (99%) |
34
+ | 4 | 6 (37.5%) | 26 | +0.002 | 3.96 / 4 (99%) |
35
+ | 5 | 3 (18.75%) | 29 | +0.016 | 4.94 / 5 (99%) |
36
+
37
+ The paired-row fraction roughly halves with each D step, and the axis
38
+ count climbs toward V=32. Deviation stays within ±0.05 of the uniform projective baseline at every D.
39
+
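The columns of the table are tied together by simple arithmetic: each collapsed pair merges two rows into one axis, so the axis count is V minus the pair count, and the percentage is the share of the V rows that belong to a pair. A minimal sketch that re-derives those columns from the pair counts in the table (the helper name `collapse_counts` is hypothetical, not part of the probe code):

```python
V = 32
pairs_by_d = {3: 10, 4: 6, 5: 3}  # "Pairs collapsed" column from the table

def collapse_counts(v, pairs):
    """Return (axes, fraction of rows that belong to an antipodal pair)."""
    return v - pairs, 2 * pairs / v

for d, pairs in pairs_by_d.items():
    axes, frac = collapse_counts(V, pairs)
    print(f"D={d}: {pairs} pairs -> {axes} axes, {frac:.2%} of rows paired")
```

The same relation holds for the h2-64 batteries: 5 to 8 pairs at V=32 give 27 down to 24 axes.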
40
+ ### Per-noise codebook differentiation (h2-64, V=32 D=4, 16 batteries)
41
+
42
+ All 16 single-noise batteries are PROJECTIVE-CLEAN. Antipodal pair count
43
+ varies systematically with training distribution:
44
+
45
+ - 5 pairs (5 batteries): gaussian, checker, salt_pepper, poisson, rayleigh
46
+   (central-tendency distributions)
47
+ - 6 pairs (3 batteries): uniform, cauchy, exponential
48
+   (heavy-tailed or symmetric)
49
+ - 7 pairs (5 batteries): uniform_scaled, laplace, periodic, mixed, structural
50
+   (mid-complexity)
51
+ - 8 pairs (3 batteries): block, gradient, lognormal
52
+   (structured / asymmetric)
53
+
54
+ 13 of 16 batteries show positive deviation (axes slightly more spread
55
+ than uniform; the trainer prefers discriminative spread over perfect
56
+ uniformity).
57
+
58
+ ---
59
+
60
+ ## Method (named "projective collapse")
61
+
62
+ 1. Run gaussian inputs through trained sphere-solver, collect M [B, V, D]
63
+ 2. Average across samples → canonical M_avg [V, D]
64
+ 3. Identify antipodal pairs via mutual-strongest matching:
65
+    - For each row i, find row j with most-negative cosine
66
+    - Pair (i, j) if cos(i, j) < -0.9 AND j's most-negative is i
67
+    - Greedy: strongest pairs claim first
68
+ 4. For each pair, take (row_i - row_j) / 2, renormalize → axis vector
69
+    - Canonical sign: first nonzero coordinate positive
70
+ 5. Unpaired rows kept as-is with sign canonicalization
71
+ 6. Compute pairwise angles wrapped to [0, π/2] via min(θ, π-θ);
72
+    this is the projective angle on ℝP^(D-1)
73
+ 7. Compare distribution mean against empirical uniform-ℝP^(D-1) baseline
74
+
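Steps 3 to 7 can be sketched in plain Python as follows. This is an illustrative reconstruction from the step list, not the actual probe script (which lives outside this repo). The function names, the Monte Carlo form of the baseline, and the toy input are all assumptions. It uses only the standard library so it runs anywhere; a real probe would more likely operate on torch tensors.

```python
import math
import random

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def canonical_sign(v, eps=1e-12):
    """Flip v so that its first nonzero coordinate is positive."""
    for x in v:
        if abs(x) > eps:
            return list(v) if x > 0 else [-y for y in v]
    return list(v)

def projective_collapse(rows, threshold=-0.9):
    """Steps 3-5: merge mutually most-antipodal rows into axis vectors."""
    rows = [normalize(r) for r in rows]
    n = len(rows)
    # For each row, the index of the row with the most-negative cosine.
    partner = [min((j for j in range(n) if j != i), key=lambda j: dot(rows[i], rows[j]))
               for i in range(n)]
    # Mutual matches below the threshold, strongest (most negative) first.
    candidates = sorted(
        (dot(rows[i], rows[partner[i]]), i, partner[i])
        for i in range(n)
        if i < partner[i] and partner[partner[i]] == i
        and dot(rows[i], rows[partner[i]]) < threshold
    )
    used, pairs, axes = set(), [], []
    for _, i, j in candidates:
        if i in used or j in used:
            continue
        used.update((i, j))
        pairs.append((i, j))
        half_diff = [(a - b) / 2 for a, b in zip(rows[i], rows[j])]
        axes.append(canonical_sign(normalize(half_diff)))
    unpaired = [k for k in range(n) if k not in used]
    axes.extend(canonical_sign(rows[k]) for k in unpaired)
    return axes, pairs, unpaired

def projective_angles(axes):
    """Step 6: pairwise angles wrapped to [0, pi/2] via min(theta, pi - theta)."""
    angles = []
    for i in range(len(axes)):
        for j in range(i + 1, len(axes)):
            c = max(-1.0, min(1.0, dot(axes[i], axes[j])))
            theta = math.acos(c)
            angles.append(min(theta, math.pi - theta))
    return angles

def uniform_baseline(d, n_axes, trials=50, seed=0):
    """Step 7 (assumed form): Monte Carlo mean projective angle of random directions."""
    rng = random.Random(seed)
    total, count = 0.0, 0
    for _ in range(trials):
        pts = [normalize([rng.gauss(0.0, 1.0) for _ in range(d)]) for _ in range(n_axes)]
        angles = projective_angles(pts)
        total += sum(angles)
        count += len(angles)
    return total / count

# Toy input: two near-antipodal pairs plus one unpaired row (not real model data).
toy_rows = [[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0.01], [0, 0, 1]]
axes, pairs, unpaired = projective_collapse(toy_rows)
angles = projective_angles(axes)
deviation = sum(angles) / len(angles) - uniform_baseline(d=3, n_axes=len(axes))
print(len(pairs), "pairs,", len(axes), "axes, deviation", round(deviation, 3))
```

Deviation is taken here as observed mean minus baseline, so a positive value means the axes are more spread than uniform, matching the sign convention used in the data section. Because each row has exactly one most-negative partner, mutual matches are already disjoint; the greedy loop is kept only to mirror the step list.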
75
+ **Verdict thresholds:**
76
+ - PROJECTIVE-CLEAN: |deviation| < 0.05, full rank, silhouette < 0.4,
77
+   secondary antipodal ≤ 3
78
+ - PROJECTIVE-MOSTLY: deviation and rank pass, other thresholds slip
79
+ - STRUCTURED / DEGENERATE: failures
80
+
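The thresholds translate into a small decision function. The sketch below is one reading of the list, with two stated assumptions: "full rank" is passed in as a boolean because no numeric cutoff is given, and STRUCTURED and DEGENERATE are returned as a single combined label because the list does not say how they are told apart. The function name `verdict` is hypothetical.

```python
def verdict(deviation, full_rank, silhouette, secondary_antipodal):
    """Map probe metrics to a verdict label using the thresholds listed above."""
    core_ok = abs(deviation) < 0.05 and full_rank          # deviation and rank
    extras_ok = silhouette < 0.4 and secondary_antipodal <= 3
    if core_ok and extras_ok:
        return "PROJECTIVE-CLEAN"
    if core_ok:
        return "PROJECTIVE-MOSTLY"
    # Assumption: the split between STRUCTURED and DEGENERATE is not specified.
    return "STRUCTURED / DEGENERATE"

# Deviation from the D=3 row of the table; silhouette and secondary count are made up.
print(verdict(deviation=-0.004, full_rank=True, silhouette=0.2, secondary_antipodal=1))
```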
81
+ ---
82
+
83
+ ## Repo contents
84
+
85
+ ### `implicit_solver_reports/`
86
+
87
+ Probe results from the four projective re-probes:
88
+
89
+ - **`A0_projective_reprobe.json` / `.png`**: G-Cand (D=3, V=32)
90
+   - 10 pairs, 22 axes, deviation -0.004 → PROJECTIVE-CLEAN
91
+ - **`A1_projective_reprobe_h2a.json` / `.png`**: H2a (D=4, V=32)
92
+   - 6 pairs, 26 axes, deviation +0.002 → PROJECTIVE-CLEAN
93
+ - **`A2_projective_h2_64_singles.json` / `.png`**: h2-64 batteries 0-15
94
+   - All 16 PROJECTIVE-CLEAN, axis count range 24-27
95
+ - **`A3_d5_spherical/`**: D=5 spherical training + integrated probe
96
+   - `A3_results.json` / `A3_summary.png`: three D=5 configs at V ∈ {16, 32, 64}
97
+   - `A3a_V16_D5_*/epoch_1_checkpoint.pt`: V=16 D=5 trained model
98
+   - `A3b_V32_D5_*/epoch_1_checkpoint.pt`: V=32 D=5 trained model
99
+   - `A3c_V64_D5_*/epoch_1_checkpoint.pt`: V=64 D=5 trained model
100
+
101
+ ### `phaseQ_reports/`
102
+
103
+ Q-sweep training artifacts (10 candidates at 1000 batches):
104
+
105
+ - **`Q_rank02_h64_V32_D4_*`**: H2a (the canonical D=4 sphere-solver
106
+   used in A1 probe). 40,227 params, MSE 0.00205.
107
+ - **`Q_rank09_h64_V32_D3_*`**: G-Cand (the D=3 model probed in A0).
108
+   28,899 params, MSE 0.028.
109
+ - 8 other rank-ordered configs from the H2 / G-class characterization
110
+
111
+ Each variant directory contains `epoch_1_checkpoint.pt` and the
112
+ training report JSON.
113
+
114
+ ### `phaseR_reports/`
115
+
116
+ Sphere-packing test (3 configs, hypothesis falsified; see notes below):
117
+
118
+ - V=16, D=4: predicted H2-LIKE, observed HYBRID (stab 0.74)
119
+ - V=8, D=4: predicted H2-LIKE, observed DIFFUSE (failed to converge)
120
+ - V=20, D=3: predicted H2-LIKE, observed HYBRID with 6/10 antipodal
121
+
122
+ Polytope-vertex-count packing was NOT a sufficient predictor of
123
+ H2-LIKE static-row behavior. The geometric pattern that actually holds
124
+ is the projective-axis structure, not polytope alignment.
125
+
126
+ ---
127
+
128
+ ## How to load a checkpoint
129
+
130
+ ```python
131
+ import torch
132
+ from huggingface_hub import hf_hub_download
133
+
134
+ ckpt_path = hf_hub_download(
135
+     repo_id="AbstractPhil/geolip-svae-implicit-solver-experiments",
136
+     filename="implicit_solver_reports/A3_d5_spherical/A3b_V32_D5_h64_dp0_nx0_adam/epoch_1_checkpoint.pt",
137
+ )
138
+ ckpt = torch.load(ckpt_path, map_location='cpu', weights_only=False)
139
+ state_dict = ckpt['model_state']
140
+ ```
141
+
142
+ To rebuild the model architecture, you need the same training config
143
+ used to train it (V, D, hidden, depth, n_cross, etc.). The
144
+ `ablation_configs.py` and `ablation_trainer.py` from the geolip-svae
145
+ working set are the source of truth.
146
+
147
+ ---
148
+
149
+ ## How to read a probe result
150
+
151
+ ```python
152
+ import json
153
+ from huggingface_hub import hf_hub_download
154
+
155
+ p = hf_hub_download(
156
+     repo_id="AbstractPhil/geolip-svae-implicit-solver-experiments",
156
+     filename="implicit_solver_reports/A2_projective_h2_64_singles.json",
157
+ )
158
+ with open(p) as f:
159
+     data = json.load(f)
160
+
161
+ # data['results_per_battery']: per-battery probe metrics (16 batteries)
162
+ # data['aggregate']: summary statistics across all 16
164
+ ```
165
+
166
+ Each per-battery entry contains:
167
+ - `pairs`, `n_axes`, `unpaired`: collapse counts
167
+ - `proj_angle_mean`, `uniform_baseline`, `deviation`: uniformity test
168
+ - `best_silhouette`, `best_cluster_k`: residual structure
169
+ - `effective_rank`, `utilization`: dimension utilization
170
+ - `secondary_antipodal`: further-collapse check
171
+ - `verdict`: PROJECTIVE-CLEAN / -MOSTLY / STRUCTURED / DEGENERATE
172
+ - `proj_angles_subset`: first 200 pairwise angles for plotting
174
+
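Once the JSON is loaded, the per-battery entries can be tallied with a few lines. The sketch below uses only keys named in the list above (`pairs`, `deviation`, `verdict`). Whether `results_per_battery` is a dict keyed by battery name or a plain list is an assumption, so the hypothetical helper `summarize` accepts both. The example entries are stand-ins with illustrative deviation values, not real results.

```python
from collections import Counter

def summarize(results):
    """Tally verdicts and pair counts from per-battery probe entries."""
    # Accept either {name: entry} or [entry, ...]; the real container type is assumed.
    entries = list(results.values()) if isinstance(results, dict) else list(results)
    return {
        "verdicts": Counter(e["verdict"] for e in entries),
        "pairs": Counter(e["pairs"] for e in entries),
        "mean_deviation": sum(e["deviation"] for e in entries) / len(entries),
    }

# Stand-in data shaped like the documented entries (deviation values are made up).
example = {
    "gaussian": {"pairs": 5, "n_axes": 27, "deviation": 0.01, "verdict": "PROJECTIVE-CLEAN"},
    "block": {"pairs": 8, "n_axes": 24, "deviation": 0.03, "verdict": "PROJECTIVE-CLEAN"},
}
summary = summarize(example)
print(summary["verdicts"], summary["pairs"], round(summary["mean_deviation"], 3))
```

With the real file, `summarize(data['results_per_battery'])` would be the corresponding call, assuming the entries carry the keys listed above.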
175
+ ---
176
+
177
+ ## What this enables
178
+
179
+ 1. **The polygonal omega is not a normalizer; it's an inference-time
180
+    projection.** Training stays spherical (`F.normalize(M, dim=-1)`).
181
+    At inference, apply antipodal-collapse to extract the axis codebook.
182
+
183
+ 2. **h2-64 is a library of 16 projective-axis codebooks**, one per
184
+    noise type. Each codebook has 24-27 axes on ℝP³.
185
+
186
+ 3. **A `ProjectiveReader` module** can wrap the collapse + axis
187
+    extraction as a clean inference operator. No D-dependent special
188
+    cases: it works at D ∈ {3, 4, 5} with the same code.
189
+
190
+ 4. **For downstream tasks** (image discrimination, quantization,
191
+ generation), the trained sphere-solvers can serve as pre-built
192
+ discrete codebooks. No new training required for the codebook.
193
+
194
+ ---
195
+
196
+ ## Open questions (not in this repo)
197
+
198
+ - Per-input rotation: G-Cand showed row stability 0.531, meaning
199
+ rows rotate per-input. The projective reading describes WHICH axes
200
+ exist; this asks HOW they activate per input. May be the actual
201
+ capsule-like behavior, operating on top of the codebook substrate.
202
+ - Per-noise codebook similarity matrix: how geometrically similar are
203
+ the 16 h2-64 codebooks to each other? Could reveal noise-type
204
+ clustering.
205
+ - D ≥ 6 behavior: do antipodal pairs vanish entirely at very high D?
206
+ Cross-D pattern predicts ~1-2 pairs at D=6, ~0 at D=8+.
207
+
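The D ≥ 6 estimate is consistent with a simple geometric extrapolation from the V=32 table, assuming the pair count keeps roughly halving per D step (the observed ratios are 6/10 and 3/6). The helper `extrapolate_pairs` and the fixed ratio of 0.5 are assumptions for illustration, not a fitted model.

```python
def extrapolate_pairs(last_d, last_pairs, target_d, ratio=0.5):
    """Geometric extrapolation of the pair count; `ratio` is an assumed per-step decay."""
    return last_pairs * ratio ** (target_d - last_d)

# Start from the D=5 row of the table (3 pairs at V=32).
for d in (6, 7, 8):
    print(f"D={d}: ~{extrapolate_pairs(5, 3, d):.3f} pairs")
```

Under that assumption the estimate is 1.5 pairs at D=6 and below one pair from D=7 onward; only a trained D ≥ 6 model can confirm or refute it.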
208
+ ---
209
+
210
+ ## Reproducibility
211
+
212
+ The probe scripts (A0/A1/A2/A3/A4) are not in this repo; they live
213
+ with the geolip-svae working set and depend on `ablation_configs.py`
214
+ and `ablation_trainer.py` from that codebase.
215
+
216
+ The trained checkpoints + JSON results in this repo are sufficient to
217
+ verify the empirical claims without rerunning training.
218
+
219
+ ---
220
+
221
+ ## License
222
+
223
+ Apache 2.0