Add README — flow matching + constellation relay prototype
Browse files
README.md
CHANGED
|
@@ -1,3 +1,93 @@
|
|
| 1 |
-
---
|
| 2 |
-
license: apache-2.0
|
| 3 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
---
|
| 2 |
+
license: apache-2.0
|
| 3 |
+
tags:
|
| 4 |
+
- flow-matching
|
| 5 |
+
- diffusion
|
| 6 |
+
- geometric-deep-learning
|
| 7 |
+
- constellation
|
| 8 |
+
- geolip
|
| 9 |
+
---
|
| 10 |
+
|
| 11 |
+
# GeoLIP Diffusion Prototype
|
| 12 |
+
|
| 13 |
+
**Flow matching diffusion with constellation relay as geometric regulator.**
|
| 14 |
+
|
| 15 |
+
This is an experimental prototype exploring whether fixed geometric reference frames
|
| 16 |
+
(constellation anchors on the unit hypersphere) can regulate the internal geometry
|
| 17 |
+
of a diffusion model's denoising network during generation.
|
| 18 |
+
|
| 19 |
+
## Architecture
|
| 20 |
+
|
| 21 |
+
```
|
| 22 |
+
Flow Matching ODE: x_t = (1-t)·x_0 + t·ε → predict v = ε - x_0
|
| 23 |
+
Sampler: Euler integration, t=1→0, 50 steps
|
| 24 |
+
|
| 25 |
+
UNet:
|
| 26 |
+
Encoder: [64@32×32] → [128@16×16] → [256@8×8]
|
| 27 |
+
Middle: ConvBlock + ★ Constellation Relay ★
|
| 28 |
+
Self-Attention (8×8 spatial)
|
| 29 |
+
ConvBlock + ★ Constellation Relay ★
|
| 30 |
+
Decoder: [256@8×8] → [128@16×16] → [64@32×32]
|
| 31 |
+
Output: Conv → 3×32×32 velocity prediction
|
| 32 |
+
```
|
| 33 |
+
|
| 34 |
+
## Constellation Relay
|
| 35 |
+
|
| 36 |
+
The relay operates at the bottleneck (256 channels at 8×8 spatial resolution).
|
| 37 |
+
It works in **channel mode**:
|
| 38 |
+
|
| 39 |
+
1. Global average pool the spatial dims → (B, 256) channel vector
|
| 40 |
+
2. Chunk into 16 patches of d=16
|
| 41 |
+
3. L2-normalize each patch to S^15 (the natural CV=0.20 dimension)
|
| 42 |
+
4. Multi-phase triangulation: 3 phases × 16 anchors = 48 distances per patch
|
| 43 |
+
5. Patchwork MLP processes triangulation → correction vector
|
| 44 |
+
6. Gated residual (gate init ≈ 0.047) scales the feature map
|
| 45 |
+
|
| 46 |
+
**Key property:** the relay preserves 99.4% geometric fidelity through 16
|
| 47 |
+
stacked layers where vanilla attention preserves only 7.4%. It acts as a
|
| 48 |
+
geometric checkpoint that prevents representation drift at the normalized
|
| 49 |
+
manifold boundaries between network blocks.
|
| 50 |
+
|
| 51 |
+
## What This Tests
|
| 52 |
+
|
| 53 |
+
The hypothesis: diffusion models discover that noise is a deterministic
|
| 54 |
+
routing system (DDIM proved this — same seed always produces same image).
|
| 55 |
+
The constellation operates on the same principle — fixed geometric anchors
|
| 56 |
+
as a reference frame that noise/data routes through. By inserting the relay
|
| 57 |
+
at the bottleneck, we test whether explicit geometric regulation improves
|
| 58 |
+
or changes the flow matching dynamics.
|
| 59 |
+
|
| 60 |
+
## Empirical Findings (from this research session)
|
| 61 |
+
|
| 62 |
+
| Finding | Result |
|
| 63 |
+
|---|---|
|
| 64 |
+
| CV ≈ 0.20 is the natural pentachoron volume regularity of S^15 | Confirmed across all precisions, 1-bit to fp64 |
|
| 65 |
+
| Effective geometric dimension of trained models ≈ 16 | Confirmed across 17+ architectures |
|
| 66 |
+
| Relay preserves 99.4% cos_to_orig through 16 layers | vs 7.4% for attention alone |
|
| 67 |
+
| fp8 triangulation preserves geometry perfectly | CV identical to fp32 at d=16 |
|
| 68 |
+
| Noise transforms are classifiable as deterministic routing | 100% accuracy on 8/10 transform families |
|
| 69 |
+
|
| 70 |
+
## Parameters
|
| 71 |
+
|
| 72 |
+
- Total: ~6.1M
|
| 73 |
+
- Relay: ~76K (1.2% of total)
|
| 74 |
+
- 2 relay modules at the bottleneck
|
| 75 |
+
|
| 76 |
+
## Training
|
| 77 |
+
|
| 78 |
+
- Dataset: CIFAR-10 (50K images)
|
| 79 |
+
- Flow matching: conditional ODE with class labels
|
| 80 |
+
- Optimizer: AdamW, lr=3e-4, cosine schedule
|
| 81 |
+
- 50 epochs, batch size 128
|
| 82 |
+
|
| 83 |
+
## Files
|
| 84 |
+
|
| 85 |
+
- `flow_match_relay.py` — complete training script
|
| 86 |
+
- `checkpoints/flow_match_best.pt` — best checkpoint
|
| 87 |
+
- `samples/` — generated samples at various epochs
|
| 88 |
+
|
| 89 |
+
## Part of the GeoLIP Ecosystem
|
| 90 |
+
|
| 91 |
+
- [geolip-constellation-core](https://huggingface.co/AbstractPhil/geolip-constellation-core) — classification with constellation
|
| 92 |
+
- [geolip package](https://pypi.org/project/geolip/) — geometric constraints for deep learning
|
| 93 |
+
- [glip-autoencoder](https://github.com/AbstractEyes/glip-autoencoder) — source repository
|