AbstractPhil committed
Commit cf66ded · verified · 1 Parent(s): 9c2a7e1

Update README.md

Files changed (1)
  1. README.md +169 -3
README.md CHANGED
@@ -1,3 +1,169 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ ---
+
+ First barrage tests and compilation tests are finished. The orbital is yielding a true representation of omega.
+
+
+ ```
+ geolip_core.linalg: available
+ geolip.linalg backend:
+   CUDA:       yes
+   Triton:     3.6.0
+   FL eigh:    enabled
+   Triton SVD: enabled
+   GPU:        NVIDIA RTX PRO 6000 Blackwell Server Edition
+ ========================================================================
+ Flow Ensemble - Expanded Test Suite
+ ========================================================================
+ device=cuda geolip_core.linalg=True
+ GPU: NVIDIA RTX PRO 6000 Blackwell Server Edition
+
+ ========================================================================
+ 1. SMOKE TEST
+ ========================================================================
+
+ Flow                     Params Shape                Time     Conf   Res norm
+ ────────────────────── ──────── ────────────── ────────── ──────── ──────────
+ QuaternionFlow           70,725 (16, 64, 128)       264us    0.476      1.318
+ QuaternionLiteFlow       87,109 (16, 64, 128)       190us    0.465      1.279
+ VelocityFlow             70,210 (16, 64, 128)       170us    0.533      0.056
+ MagnitudeFlow            73,293 (16, 64, 128)     20.58ms    0.481      0.753
+ OrbitalFlow              25,445 (16, 64, 128)     20.70ms    0.551      3.464
+ AlignmentFlow            37,186 (16, 64, 128)     43.67ms    0.480      0.131
+
+ ========================================================================
+ 2. LINALG INTEGRATION
+ ========================================================================
+
+ Testing eigh dispatch in MagnitudeFlow and OrbitalFlow...
+   magnitude  finite=True  conf=0.477
+   orbital    finite=True  conf=0.501
+
+ Gram eigenspectrum: shape=(16, 12) range=[0.0001, 6.6260]
+ Eigenvector orth err: 8.39e+01
+
+ ========================================================================
+ 3. MULTI-SCALE
+ ========================================================================
+
+ OrbitalFlow across scales:
+ Config        B     n     k     d       Time   OK
+ ────────── ──── ───── ───── ───── ────────── ────
+ tiny          4    16     8    64    12.05ms   OK
+ small        16    64    32   128    20.64ms   OK
+ medium       32   128    64   256    20.71ms   OK
+ large        64   256   128   256    20.75ms   OK
+ wide          8   512   256   512    20.74ms   OK
+
+ ========================================================================
+ 4. ENSEMBLE FUSION
+ ========================================================================
+
+ weighted: time=87.28ms norm=0.586 diversity=0.829
+   quaternion  conf=0.476±0.002  res=1.322
+   quat_lite   conf=0.465±0.002  res=1.285
+   velocity    conf=0.533±0.003  res=0.056
+   magnitude   conf=0.481±0.004  res=0.757
+   orbital     conf=0.554±0.002  res=1.644
+   alignment   conf=0.480±0.003  res=0.131
+
+ gated: time=87.36ms norm=0.598 diversity=0.829
+   quaternion  conf=0.476±0.002  res=1.322
+   quat_lite   conf=0.465±0.002  res=1.285
+   velocity    conf=0.533±0.003  res=0.056
+   magnitude   conf=0.481±0.004  res=0.757
+   orbital     conf=0.554±0.002  res=1.644
+   alignment   conf=0.480±0.003  res=0.131
+
+ residual: time=87.35ms norm=0.586 diversity=0.829
+   quaternion  conf=0.476±0.002  res=1.322
+   quat_lite   conf=0.465±0.002  res=1.285
+   velocity    conf=0.533±0.003  res=0.056
+   magnitude   conf=0.481±0.004  res=0.757
+   orbital     conf=0.554±0.002  res=1.644
+   alignment   conf=0.480±0.003  res=0.131
+
+ ========================================================================
+ 5. GRADIENT HEALTH
+ ========================================================================
+
+ Flow               Loss          Grad norm Status
+ ────────────────── ────────── ──────────── ────────
+ quaternion         mse            8.90e-04 OK
+ quat_lite          mse            7.05e-04 OK
+ velocity           mse            2.78e-04 OK
+ magnitude          mse            7.92e-04 OK
+ orbital            mse            3.45e-03 OK
+ alignment          mse            1.27e-03 OK
+ quaternion         cosine         1.43e-01 OK
+ quat_lite          cosine         1.17e-01 OK
+ velocity           cosine         3.37e-02 OK
+ magnitude          cosine         1.98e-01 OK
+ orbital            cosine         2.82e-01 OK
+ alignment          cosine         9.83e-02 OK
+ quaternion         norm           1.09e-01 OK
+ quat_lite          norm           8.79e-02 OK
+ velocity           norm           2.06e-02 OK
+ magnitude          norm           3.99e-01 OK
+ orbital            norm           3.97e+00 OK
+ alignment          norm           1.01e-01 OK
+
+ ========================================================================
+ 6. ABLATION (100 training steps, rotation target)
+ ========================================================================
+
+ Configuration                              MSE     Params
+ ─────────────────────────────────── ────────── ──────────
+ QuaternionFlow                          0.0010    280,709
+ QuaternionLiteFlow                      0.0005    346,245
+ VelocityFlow                            0.0061    279,682
+ MagnitudeFlow                           0.0081    285,837
+ OrbitalFlow                             0.0456     91,813
+ AlignmentFlow                           0.0066    148,098
+ Quat + Orbital                          0.0024    372,524
+ Velocity + Magnitude                    0.0062    565,521
+ Orbital + Alignment                     0.0188    239,913
+ Velocity + Orbital                      0.1576    371,497
+ Full (weighted)                         0.0046  1,432,390
+ /tmp/ipykernel_16934/1749400227.py:389: UserWarning: torch.linalg.svd: During SVD computation with the selected cusolver driver, batches 0, 1, 2, 3, 4, and other 27 batches failed to converge. A more accurate method will be used to compute the SVD as a fallback. Check doc at https://pytorch.org/docs/stable/generated/torch.linalg.svd.html (Triggered internally at /pytorch/aten/src/ATen/native/cuda/linalg/BatchLinearAlgebraLib.cpp:701.)
+   U, _, Vh = torch.linalg.svd(C) # full d×d SVD - not through geolip.linalg.svd
+ Full (residual)                     FAILED: linalg.svd: (Batch element 0):
+
+ ========================================================================
+ 7. COMPILE COMPATIBILITY
+ ========================================================================
+
+ Flow                   fullgraph           Raw     Compiled
+ ────────────────────── ──────────── ────────── ────────────
+ QuaternionFlow         OK                269us        191us
+ QuaternionLiteFlow     OK                191us        172us
+ VelocityFlow           OK                176us        161us
+ /usr/local/lib/python3.12/dist-packages/torch/_inductor/compile_fx.py:321: UserWarning: TensorFloat32 tensor cores for float32 matrix multiplication available but not enabled. Consider setting `torch.set_float32_matmul_precision('high')` for better performance.
+   warnings.warn(
+ /usr/local/lib/python3.12/dist-packages/torch/_inductor/lowering.py:1904: FutureWarning: `torch._prims_common.check` is deprecated and will be removed in the future. Please use `torch._check*` functions instead.
+   check(
+ MagnitudeFlow          OK              20.68ms        733us
+ /usr/local/lib/python3.12/dist-packages/torch/_inductor/lowering.py:1904: FutureWarning: `torch._prims_common.check` is deprecated and will be removed in the future. Please use `torch._check*` functions instead.
+   check(
+ OrbitalFlow            OK              20.71ms        732us
+ AlignmentFlow          OK               9.11ms       9.02ms
+
+ ========================================================================
+ 8. MEMORY (B=32, n=128, k=64, d=256)
+ ========================================================================
+
+ Flow                      Peak MB
+ ────────────────────── ──────────
+ QuaternionFlow               36.2
+ QuaternionLiteFlow           28.3
+ VelocityFlow                 48.0
+ MagnitudeFlow                24.2
+ OrbitalFlow                  10.5
+ AlignmentFlow                60.0
+ Full ensemble               203.6
+
+ ========================================================================
+ Done.
+ ========================================================================
+ ```
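
The `Eigenvector orth err` line in section 2 of the log reports `8.39e+01`, but the log does not say how that number is computed or reduced across the batch. As a point of comparison only, here is a minimal sketch of one common definition, assumed here and not taken from the test suite: the largest absolute entry of `V^T V - I`, where the columns of `V` are the eigenvectors. The helper name `orthogonality_error` is hypothetical, and the sketch uses plain Python lists so it runs without PyTorch. In a PyTorch setting the same check would typically be applied to the eigenvectors returned by `torch.linalg.eigh`.

```python
import math


def orthogonality_error(V):
    """Return max |(V^T V - I)[i][j]| for a matrix V given as a list of rows.

    The columns of V are treated as the (candidate) eigenvectors.
    This definition is an assumption; the test suite may use another one.
    """
    rows, cols = len(V), len(V[0])
    worst = 0.0
    for i in range(cols):
        for j in range(cols):
            # Inner product of column i with column j.
            dot = sum(V[r][i] * V[r][j] for r in range(rows))
            target = 1.0 if i == j else 0.0
            worst = max(worst, abs(dot - target))
    return worst


# Orthonormal columns: a 2-D rotation by 30 degrees.
theta = math.radians(30.0)
rotation = [
    [math.cos(theta), -math.sin(theta)],
    [math.sin(theta), math.cos(theta)],
]

# Same directions, but every column scaled by 3: orthogonal, not unit-norm.
scaled = [[3.0 * x for x in row] for row in rotation]

err_rotation = orthogonality_error(rotation)
err_scaled = orthogonality_error(scaled)

print(f"rotation: {err_rotation:.2e}")
print(f"scaled:   {err_scaled:.2e}")
```

Under this definition an orthonormal basis gives an error near machine precision, while columns that are orthogonal but not unit-length give an error on the order of their squared norm. Whether `8.39e+01` signals a real loss of orthogonality therefore depends on how the suite normalizes the eigenvectors and reduces the metric, which the log leaves unstated.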