---
license: mit
---
# V2 good news
The decoder is differentiating. The features will be useful downstream.
```python
CFG = dict(
    # Architecture (inherited from Fresnel v50)
    V=16, D=4, ps=4, hidden=384, depth=4, n_cross=2,
    stage_hidden=128, stage_V=64,
    # Training
    img_size=64,
    batch_size=256,
    lr=3e-4,
    epochs=50,
    ds_size=1280000,
    val_size=10000,
    # CV soft hand
    target_cv=0.2915,
    cv_weight=0.3,
    boost=0.5,
    sigma=0.15,
    # Checkpointing
    save_every=5,
    val_per_type_every=5,
)
```
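A few derived quantities follow directly from this config. The sketch below recomputes them; `channels = 3` is an assumption (RGB input) that the config does not state, and the variable names are illustrative only.

```python
# Values copied from the CFG above; channels=3 is an assumption (RGB input).
CFG = dict(img_size=64, ps=4, D=4, batch_size=256, ds_size=1280000, epochs=50)
channels = 3

# Optimizer steps per epoch: one step per batch.
steps_per_epoch = CFG["ds_size"] // CFG["batch_size"]

# Non-overlapping ps x ps patches tiled over an img_size x img_size image.
patches_per_side = CFG["img_size"] // CFG["ps"]
patches_per_image = patches_per_side ** 2

# Flattened size of one patch under the 3-channel assumption.
patch_dim = channels * CFG["ps"] ** 2

total_steps = steps_per_epoch * CFG["epochs"]

print(steps_per_epoch, patches_per_image, patch_dim, total_steps)
```

The 5000 steps per epoch agree with the `5000/5000` progress counter in the training log.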
```
Note: Environment variable`HF_TOKEN` is set and is the current active token independently from the token you've just configured.
WARNING:huggingface_hub._login:Note: Environment variable`HF_TOKEN` is set and is the current active token independently from the token you've just configured.
======================================================================
SVAE v2 CONDUIT TRAINER - version2_v2_conduit_proto_2
======================================================================
Fresh PatchSVAEv2 from random init
Total params: 2,729,731
Dataset: 16 noise types, 1,280,000 samples/epoch
Image size: 64×64
Batch size: 256
Initial conduit profile:
S: [2.512, 2.120, 1.776, 1.402]
S_std: [0.1320, 0.1130, 0.1230, 0.1728]
log_fric: [2.651, 4.560, 3.362, 2.203] ± [1.131, 1.041, 0.715, 0.704]
fric_raw: mean=75.9 max=103510
settle: [1.24, 2.26, 2.42, 1.00] (>2: 22.9%)
char_c: [0.5088, -2.6330, 4.8154, -3.6899]
refine: mean=6.46e-04 max=1.12e-03
fric_cv: [4.4544, 1.8980, 2.1396, 3.3616]
Initial MSE (random decoder): 2.0875
======================================================================
Ep 1/50: 100%|████████████████████| 5000/5000 [06:11<00:00, 13.44it/s, mse=0.0257 cv=1.027]
ep 1 | recon=0.0848 val=0.0280 ★ BEST | er=3.84 Sd=0.0954 cv=1.027 | 372s
S: [2.609, 2.196, 1.677, 1.220]
S_std: [0.1011, 0.1239, 0.1332, 0.1699]
log_fric: [2.355, 3.985, 3.071, 2.151] ± [0.871, 0.664, 0.479, 0.662]
fric_raw: mean=40.0 max=363409
settle: [1.12, 2.29, 2.84, 1.00] (>2: 29.3%)
char_c: [0.3458, -2.1159, 4.3382, -3.5587]
refine: mean=6.49e-04 max=1.14e-03
fric_cv: [4.1030, 0.7286, 1.8343, 3.0191]
types: gaus=0.026 unif=0.012 unif=0.032 pois=0.010 pink=0.005 brow=0.006 salt=0.122 spar=0.018 bloc=0.012 grad=0.015 chec=0.010 mixe=0.013 stru=0.018 cauc=0.080 expo=0.024 lapl=0.045
💾 /content/version2_v2_conduit_proto_2_checkpoints/best.pt (29.4MB, ep1, MSE=0.028021)
Processing Files (1 / 1)      : 100%
 30.8MB / 30.8MB, 25.7MB/s
New Data Upload               : 100%
 30.8MB / 30.8MB, 25.7MB/s
  ...oto_2_checkpoints/best.pt: 100%
 30.8MB / 30.8MB
Processing Files (1 / 1)      : 100%
 30.8MB / 30.8MB,  0.00B/s
New Data Upload               :
  0.00B /  0.00B,  0.00B/s
  ...oto_2_checkpoints/best.pt: 100%
 30.8MB / 30.8MB
No files have been modified since last commit. Skipping to prevent empty commit.
WARNING:huggingface_hub.hf_api:No files have been modified since last commit. Skipping to prevent empty commit.
☁️ Pushed ep1
Ep 2/50: 100%|████████████████████| 5000/5000 [06:13<00:00, 13.40it/s, mse=0.0323 cv=1.000]
ep 2 | recon=0.0832 val=0.0327 | er=3.76 Sd=0.1165 cv=1.000 | 373s
S: [2.707, 2.329, 1.400, 1.113]
S_std: [0.0788, 0.0877, 0.1419, 0.1255]
log_fric: [2.375, 3.785, 2.958, 2.175] ± [0.883, 0.508, 0.450, 0.681]
fric_raw: mean=32.1 max=104752
settle: [1.15, 2.26, 3.01, 1.00] (>2: 30.0%)
char_c: [0.2042, -1.5304, 3.7516, -3.3900]
refine: mean=6.46e-04 max=1.12e-03
fric_cv: [4.2588, 0.8009, 1.4941, 3.3563]
types: gaus=0.032 unif=0.011 unif=0.043 pois=0.008 pink=0.002 brow=0.002 salt=0.157 spar=0.020 bloc=0.007 grad=0.011 chec=0.005 mixe=0.012 stru=0.019 cauc=0.107 expo=0.029 lapl=0.060
Ep 3/50: 20%|████                | 1019/5000 [01:16<04:56, 13.45it/s, mse=0.0328 cv=0.906]
```
# V2 Redux - full decoder overhaul
Cascade bottlenecking didn't cut it; the decoder still bypassed the specifications.
This next variation is going to be a bit excessive in terms of conduit adjudication.
Every single layer of the encoder is going to be a full encoder/decoder overhaul.
```
ENCODER (bottom → up):
Level 0: 256 patches → MLP(384) → M(48×4) → SVD+conduit₀ → 256 tokens
Level 1: group 2×2 → 64 cells → attend(4) → MLP(128) → M(16×4) → SVD+conduit₁ → 64 tokens
Level 2: group 2×2 → 16 blocks → attend(4) → MLP(128) → M(16×4) → SVD+conduit₂ → 16 tokens
Level 3: group 2×2 → 4 groups → attend(4) → MLP(128) → M(16×4) → SVD+conduit₃ → 4 tokens
Top: cross-attention over 4 final tokens
SPECTRAL TOKEN (propagates between levels):
[S(4), log_friction(4), settle(4), char_coeffs(4)] = 16 values
S carries gradients. Conduit is detached. Difficulty trickles UP.
DECODER (top → down, with conduit skips):
Level 3': 4 tokens → expand × 4 → inject conduit → attend → 16 tokens
Level 2': 16 tokens → expand × 4 → inject conduit → attend → 64 tokens
Level 1': 64 tokens → expand × 4 → inject conduit → attend → 256 tokens
Level 0': 256 tokens + stored (U₀, S₀, Vt₀, friction₀, settle₀, char_c₀) → MLP → pixels
CONDUIT AT EACH SCALE:
Level 0: friction from pixel-level Gram decomposition (how hard were patches?)
Level 1: friction from cell-level Gram decomposition (how hard were 2×2 interactions?)
Level 2: friction from block-level decomposition (how hard were meso-structures?)
Level 3: friction from global decomposition (how hard was the overall composition?)
```
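The token counts in the encoder pyramid (256 → 64 → 16 → 4) come from repeated 2×2 grouping. The following is a minimal NumPy sketch of that grouping only, assuming the tokens are laid out row-major on a square grid. `group_2x2` is a hypothetical helper, and the mean over each group is a stand-in for the attend(4) → MLP → SVD+conduit stage, which is not reproduced here.

```python
import numpy as np

def group_2x2(tokens, side):
    """Regroup a (side*side, d) row-major token grid into (side/2)**2 groups of 4 tokens."""
    d = tokens.shape[-1]
    grid = tokens.reshape(side // 2, 2, side // 2, 2, d)
    grid = grid.transpose(0, 2, 1, 3, 4)  # (H/2, W/2, 2, 2, d)
    return grid.reshape((side // 2) ** 2, 4, d)

# 256 level-0 tokens, each a 16-value spectral token (random stand-in data).
tokens = np.random.default_rng(0).normal(size=(256, 16))
side = 16
counts = [tokens.shape[0]]
while side > 2:
    groups = group_2x2(tokens, side)  # (n/4, 4, 16)
    tokens = groups.mean(axis=1)      # placeholder for attend(4) -> MLP -> SVD+conduit
    side //= 2
    counts.append(tokens.shape[0])

print(counts)  # [256, 64, 16, 4]
```

Each group holds the four spatial neighbours of one 2×2 cell, so every level summarizes a region four times larger than the one below it.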
It's a bit excessive, but it may be required. Everything has to have a little impurity; otherwise it will not deviate.
It's no coincidence that so many of these structures lined up.
This MAY have removed too much SVD encoding at the baseline, but we'll see.
# V2 is blobby!
Time to go direct: I'm going to train the whole model with SVD-related paradigms internally rather than trying to feed the model SVD.
You can call this decoder an inverse cascade decoder.
# Deblobbing, the blob.
As the SVAE v2's official structure dictates, the decoder must account for the newly introduced elements to decode correctly.
This is the first experiment, and it is currently proving that yes, the decoder can in fact learn to decode them.
I've dubbed this noise variation of freckles SVAE-Cadence, named for the difficulty that the decoder attenuation structure needs to be aware of
before the decoder can understand the orchestra's song.
Each of the new EIGH elements is specifically related to HOW WELL the model performed in the SVD calculation. This includes
how many iterations were required, how smooth the final structure was, and several other pieces.
```
THE DECODER RECEIVES:
S[4] - magnitudes
Vt[4×4] - orientations (sign-canonicalized)
friction[4] - conditioning per mode
settle[4] - convergence per mode
char_coeffs[4] - polynomial invariants
extraction_order[4] - spectral hierarchy
refinement_residual[1] - orthogonalization quality
release_residual[1] - round-trip fidelity
THE DECODER DOES NOT RECEIVE:
M_hat = U @ diag(S) @ Vt - this is WITHHELD
THE DECODER MUST RECONSTRUCT PATCHES FROM THE
DECOMPOSED SPECTRAL REPRESENTATION + CONDUIT.
IT CANNOT SHORTCUT. EVERY ELEMENT IS LOAD-BEARING.
```
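The list above says `Vt` arrives sign-canonicalized. An SVD is only defined up to a sign flip per mode, so some fixed convention is needed before the factors can be fed to a decoder. The sketch below shows one common convention, assumed here rather than taken from the repo: flip each mode so the largest-magnitude entry of its `Vt` row is positive, and flip the matching column of `U` so the product is unchanged. `canonicalize_signs` is a hypothetical helper.

```python
import numpy as np

def canonicalize_signs(U, Vt):
    """Flip each mode so the largest-magnitude entry of its Vt row is positive."""
    idx = np.argmax(np.abs(Vt), axis=1)
    signs = np.sign(Vt[np.arange(Vt.shape[0]), idx])
    signs[signs == 0] = 1.0  # guard against an all-zero row
    # Flipping column k of U and row k of Vt together leaves U @ diag(S) @ Vt intact.
    return U * signs[None, :], Vt * signs[:, None]

rng = np.random.default_rng(0)
M = rng.normal(size=(48, 4))  # stand-in for one encoded patch matrix
U, S, Vt = np.linalg.svd(M, full_matrices=False)
U_c, Vt_c = canonicalize_signs(U, Vt)

# The reconstruction is unaffected by the sign choice.
assert np.allclose(U_c @ np.diag(S) @ Vt_c, M)
```

With a fixed sign convention, two nearly identical patches produce nearly identical `Vt` rows instead of rows that differ by an arbitrary factor of -1.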
With the new structured EIGH derived components, we now have a conduit for elemental extraction based on difficulty.
```
ENCODER (identical to v1, can copy weights from Fresnel):
patch(48) → MLP(384) → residual blocks × 4 → M(48×4) → normalize
SVD + CONDUIT (always active):
M → G = M^T M → FLEighConduit(G) → S, U, Vt, packet
CROSS-ATTENTION (identical to v1, can copy weights):
S → SpectralCrossAttention × 2 → S_coordinated
CONDUIT DECODER (NEW - the forcing function):
For each mode k=0,1,2,3:
bundle_k = [U[:,k](48), S[k](1), Vt[k,:](4), friction[k](1),
settle[k](1), char_coeff[k](1), order[k](1)]
→ ModeProcessor(57 → 384) → mode_hidden_k
Fuse: [mode_0, mode_1, mode_2, mode_3, refine_res, release_res]
→ Linear(1538 → 384) → residual blocks × 4 → patch(48)
```
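To make the shapes in the diagram concrete, here is a rough NumPy sketch of the `M → G = M^T M → eigh` path and of the per-mode bundle assembly. It uses plain `np.linalg.eigh` as a stand-in for `FLEighConduit`, whose internals are not shown here, and the conduit values (`friction`, `settle`, `char_coeff`) are zero placeholders. The Frobenius normalization is also an assumption, since the diagram only says "normalize".

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(48, 4))  # stand-in for the encoder output M(48x4)
M = M / np.linalg.norm(M)     # assumed normalization (Frobenius norm)

G = M.T @ M                   # (4, 4) Gram matrix
evals, V = np.linalg.eigh(G)  # eigenvalues in ascending order
order = np.argsort(evals)[::-1]
evals, V = evals[order], V[:, order]

S = np.sqrt(np.clip(evals, 0.0, None))  # singular values of M
Vt = V.T
U = (M @ V) / S               # (48, 4); assumes no zero singular value

# Placeholder conduit values; the real ones come from FLEighConduit.
friction = np.zeros(4)
settle = np.zeros(4)
char_coeff = np.zeros(4)
mode_order = np.arange(4.0)

bundles = [
    np.concatenate([U[:, k], [S[k]], Vt[k, :], [friction[k]],
                    [settle[k]], [char_coeff[k]], [mode_order[k]]])
    for k in range(4)
]
bundle_dim = bundles[0].shape[0]  # 48 + 1 + 4 + 1 + 1 + 1 + 1 = 57
fuse_dim = 4 * 384 + 2            # four mode_hidden vectors + two residual scalars = 1538
print(bundle_dim, fuse_dim)
```

The two printed sizes match the `ModeProcessor(57 → 384)` input and the `Linear(1538 → 384)` fuse input in the diagram.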
This is SVAE-Cadence learning noise. Already capable.
https://huggingface.co/AbstractPhil/geolip-conduit-experiments/blob/main/svae_cadence.py
The code to train Cadence is included as per usual.


I'll let it cook for a while.
# Thoughts
This is a repo dedicated to a series of experiments specifically meant to introduce direct learning complexity associations
with the deconstruction of SVD and eigendecomposition.
The structure is highly complex in order to create an Omega solver whose learning can be transferred, frame by frame, to an adjacent solver.
This will likely not work on the first try, or the second, or the 50th, but there is a prototype that I will be testing to a T.
The three-AI conversation helped me get a starting point, but it provided less help than I expected. It's often better to just stick with
one assistant, as the echo of the three tends to drown out positive or useful opinions from one or the other unless a judge intervenes in
every single exchange.
The trio ended up forming a bit of an echo-frame, which may work, but I will likely need to revamp the whole thing 2-3 more times before it can be extracted.