Commit 1a60fa2 by AbstractPhil · verified · 1 parent: 88730bc

Upload README.md

  1. README.md +138 -0
README.md ADDED
@@ -0,0 +1,138 @@
+ ---
+ license: apache-2.0
+ tags:
+ - flow-matching
+ - diffusion
+ - geometric-deep-learning
+ - constellation
+ - geolip
+ - cifar10
+ - geometric-lookup
+ ---
+
+ # GeoLIP Spherical Diffusion Prototype
+
+ **Flow matching diffusion through a constellation bottleneck on S^15.**
+
+ Four progressive experiments showing that geometric triangulation
+ on the unit hypersphere is a viable information bottleneck for
+ diffusion models, and that the binding constant 0.29154 emerges
+ from velocity matching through geometric lookup.
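
For readers new to the term, the "velocity matching" objective mentioned above can be illustrated with a toy example. This is a minimal sketch of the common straight-path flow matching setup, not code from this repository: it assumes noise at `t = 0`, data at `t = 1`, and linear interpolation between them, none of which the README states. All function names are hypothetical.

```python
import random


def interpolate(x0, x1, t):
    """Point on the straight path from noise x0 to data x1 at time t in [0, 1]."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0, x1):
    """Velocity of the straight path; it is constant in t for linear interpolation."""
    return [b - a for a, b in zip(x0, x1)]


def velocity_loss(predicted, target):
    """Mean squared error between a predicted and a target velocity."""
    return sum((p - q) ** 2 for p, q in zip(predicted, target)) / len(target)


rng = random.Random(0)
x1 = [0.5, -0.25, 1.0, 0.0]              # toy "data" sample
x0 = [rng.gauss(0.0, 1.0) for _ in x1]   # Gaussian noise sample (assumed prior)
t = 0.3

x_t = interpolate(x0, x1, t)
v = target_velocity(x0, x1)

# A model that outputs exactly v at (x_t, t) would have zero loss.
perfect_loss = velocity_loss(v, v)

# Following the true velocity for the remaining time (1 - t) lands on the data.
x_end = [a + (1.0 - t) * b for a, b in zip(x_t, v)]
```

In the experiments below, the network that predicts the velocity is forced to route its information through the constellation bottleneck; the sketch only covers the training target, not that network.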
+
+ ## Experiments
+
+ ### v1 - Regulator (baseline)
+ Constellation as a side-channel regulator on feature maps.
+ The gate stayed at 6%, so the constellation was decorative.
+ - Loss: 0.1900 | Params: 6.1M | Near 0.29: 0%
+
+ ### v2 - Skip Bypass (the sneaky test)
+ A 268M-parameter `Linear(16384, 16384)` skip projection runs alongside
+ the constellation bottleneck. The model was given every reason
+ to bypass the constellation. **It chose the constellation**: gate
+ at 11.8%, routing 88% through 768 triangulation dimensions.
+ - Loss: 0.1757 | Params: 287M | Near 0.29: 9%
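
The 268M figure and the 88% routing share quoted for v2 can be checked with a few lines of arithmetic. This is a sketch under two assumptions not stated in the README: the skip layer carries a bias term (the PyTorch `nn.Linear` default), and the gate value is read as the weight on the skip path. The helper name is hypothetical.

```python
def linear_param_count(in_features, out_features, bias=True):
    """Parameter count of a dense layer: weight matrix plus optional bias."""
    return in_features * out_features + (out_features if bias else 0)


# 16384 equals 256 * 8 * 8, the flattened feature-map size listed for v4.
# Whether v2 uses the same encoder shape is an assumption.
flat_dim = 256 * 8 * 8

skip_params = linear_param_count(16384, 16384)  # assumes bias=True

gate = 0.118                       # reported gate value for v2
constellation_share = 1.0 - gate   # share routed through the constellation

print(flat_dim, skip_params, round(constellation_share, 3))
# prints: 16384 268451840 0.882
```

The skip layer alone therefore accounts for roughly 268M of the 287M total parameters reported for v2.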
+
+ ### v3 - Pure Constellation Bottleneck
+ Skip projection removed. Everything goes through S^15, with zero bypass.
+ It beat the 268M skip version with 8× fewer bottleneck params.
+ Reconstruction cos_sim ≈ 0: the bottleneck is a geometric
+ lookup table, not an autoencoder.
+ - Loss: 0.1749 | Params: 36.6M | Near 0.29: 30%
+
+ ### v4 - Geometric Lookup Flow Matching (GLFM)
+ Three-stage pipeline: Address → Condition → Generate.
+ Multi-scale addressing (coarse + fine). 46% of anchors
+ converged within ±0.05 of the binding constant 0.29154.
+ - Loss: 0.1754 | Params: 35.2M | Near 0.29: 46%
+
+ ## The 0.29154 Binding Constant
+
+ Anchor drift from home position converges toward 0.29154 radians
+ across all experiments. This constant has now appeared in:
+
+ | Domain | Observed as | Training objective |
+ |---|---|---|
+ | MinimalShunts | Binding/separation phase boundary | Contrastive |
+ | CLIP projections | Geometric transition | Contrastive |
+ | T5 generation | Alpha convergence | Language modeling |
+ | CaptionBERT | Phase boundary | Contrastive |
+ | **Flow matching** | **Max anchor drift** | **Velocity matching** |
+
+ The constant marks the boundary where anchors transition from
+ geometric frame holders to task-specific encoders.
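
One way to read the "Near 0.29" percentages is as the fraction of anchors whose drift angle falls within ±0.05 rad of 0.29154. The sketch below assumes drift is the angle between an anchor's home position and its trained position on the unit sphere; the README gives the unit (radians) but not the exact formula, and all names here are hypothetical.

```python
import math

BINDING_CONSTANT = 0.29154  # radians, as quoted in the README


def drift_radians(home, current):
    """Angle between two unit vectors (assumed definition of anchor drift)."""
    dot = sum(h * c for h, c in zip(home, current))
    return math.acos(max(-1.0, min(1.0, dot)))  # clamp against rounding error


def fraction_near(drifts, center=BINDING_CONSTANT, tol=0.05):
    """Fraction of drift values within +/- tol of center."""
    return sum(1 for d in drifts if abs(d - center) <= tol) / len(drifts)


# Toy anchors on S^15 (unit vectors in R^16): each home anchor is the first
# basis vector, and each trained anchor is rotated by a known angle.
dim = 16
angles = [0.10, 0.27, 0.30, 0.33, 0.50]
home = [1.0] + [0.0] * (dim - 1)
trained = [[math.cos(a), math.sin(a)] + [0.0] * (dim - 2) for a in angles]

drifts = [drift_radians(home, anchor) for anchor in trained]
near = fraction_near(drifts)  # 3 of the 5 toy anchors fall in the band
```

Under the same reading, anchors with drift below the constant would be the "structural" ones and those above it the "semantic" ones.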
+
+ ## Key Empirical Results
+
+ | Finding | Result |
+ |---|---|
+ | CV ≈ 0.20 is geometry of S^15 | Precision-invariant, 1-bit to fp64 |
+ | Constellation relay preserves 99.4% cos_to_orig at depth 16 | vs 7.4% for attention |
+ | Model prefers constellation over 268M skip bypass | 88/12 split |
+ | 768 tri dims match 16384 unconstrained dims for velocity | cos 0.949 |
+ | Bottleneck doesn't reconstruct; it's a lookup table | cos_sim ≈ 0 to input |
+ | Anchors self-organize: structural (<0.29) vs semantic (>0.29) | Confirmed across 4 versions |
+
+ ## Architecture - GLFM (v4)
+
+ ```
+ Stage 1 - ADDRESS
+   encoder(x_t) → (B, 256, 8, 8)
+   coarse: pool → proj → S^15 → triangulate (768d)
+   fine: per-pixel → proj → S^15 → triangulate → aggregate (768d)
+   address = concat(coarse, fine) = 1536d
+
+ Stage 2 - CONDITION
+   fuse(address + time_emb + class_emb + noise_emb) → 1024d
+
+ Stage 3 - GENERATE
+   4× ResBlock(1024d) → proj(16384d) → reshape(256, 8, 8) → decoder
+ ```
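
The ADDRESS stage above can be illustrated with a toy version of "project to S^15, then triangulate". This sketch assumes that triangulation means describing a point by its angular distance to each of a set of anchors, and that the fine branch aggregates by averaging; the README confirms neither. The anchor count and all names are hypothetical, and the sketch does not reproduce how the 768 dimensions per branch arise.

```python
import math
import random

SPHERE_DIM = 16  # S^15 is the unit sphere in R^16


def to_sphere(v):
    """Normalize a vector to unit length so it lies on the sphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def triangulate(point, anchors):
    """Describe a point by its angle (radians) to each anchor (assumed meaning)."""
    coords = []
    for anchor in anchors:
        dot = sum(p * q for p, q in zip(point, anchor))
        coords.append(math.acos(max(-1.0, min(1.0, dot))))
    return coords


rng = random.Random(0)


def random_unit():
    """Random point on S^15, standing in for a learned projection output."""
    return to_sphere([rng.gauss(0.0, 1.0) for _ in range(SPHERE_DIM)])


anchors = [random_unit() for _ in range(6)]       # hypothetical anchor count
coarse_point = random_unit()                      # stands in for pool -> proj -> S^15
fine_points = [random_unit() for _ in range(4)]   # stands in for per-pixel -> proj -> S^15

coarse = triangulate(coarse_point, anchors)
fine_each = [triangulate(p, anchors) for p in fine_points]
fine = [sum(col) / len(col) for col in zip(*fine_each)]  # mean aggregate (assumption)

address = coarse + fine  # concat(coarse, fine)
```

With this reading, the address holds only angles relative to anchors rather than the feature values themselves, which fits the README's description of the bottleneck as a lookup table rather than an autoencoder.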
+
+ ## Files
+
+ ### Training Scripts
+ - `flow_match_relay.py` - v1: constellation as regulator
+ - `flow_match_constellation_bn.py` - v2: skip bypass test + v3: pure bottleneck
+ - `constellation_diffusion.py` - v3: pure bottleneck (no skip)
+ - `glfm.py` - v4: geometric lookup flow matching
+
+ ### Analysis Scripts
+ - `analyze_diffusion.py` - v1 analysis (10 tests)
+ - `analyze_bn.py` - v2 analysis (skip vs constellation ablation)
+ - `analyze_cd.py` - v3 analysis (pure bottleneck, drift histogram)
+ - `analyze_glfm.py` - v4 analysis (multi-scale, address separability)
+
+ ### Relay Prototype Scripts
+ - `constellation_relay.py` - v2 vectorized relay (19K params, 99.4% preservation)
+ - `cv_attention_test.py` - attention geometric analysis (10 tests)
+ - `rope_vs_relay.py` - RoPE comparison (7 tests)
+ - `hybrid_relay.py` - hybrid cross-token relay (GPT challenge)
+ - `pairwise_relay.py` - pairwise constellation relay
+ - `scale_sweep.py` - production-scale depth × sequence length
+
+ ### HuggingFace Integration
+ - `configuration_flow_match.py` - PretrainedConfig
+ - `modeling_flow_match.py` - PreTrainedModel (AutoModel compatible)
+
+ ### Checkpoints (if present)
+ - `checkpoints/` - best checkpoints from each training run
+
+ ### Samples (if present)
+ - `samples/` - v1 regulator samples
+ - `samples_bn/` - v2/v3 bottleneck samples
+ - `samples_cd/` - v3 pure constellation samples
+ - `samples_glfm/` - v4 GLFM samples
+
+ ### Analysis Outputs (if present)
+ - `analysis/` - v1 analysis images
+ - `analysis_bn/` - v2 analysis images
+ - `analysis_cd/` - v3 analysis images
+ - `analysis_glfm/` - v4 analysis images
+
+ ## Part of the GeoLIP Ecosystem
+
+ - [geolip-constellation-core](https://huggingface.co/AbstractPhil/geolip-constellation-core)
+ - [geolip-diffusion-proto](https://huggingface.co/AbstractPhil/geolip-diffusion-proto) (v1/v2 regulator)
+ - [geolip package](https://pypi.org/project/geolip/)
+ - [glip-autoencoder](https://github.com/AbstractEyes/glip-autoencoder)