---
license: mit
---

# V2 good news

The decoder is differentiating. The features will be useful downstream.
```python
CFG = dict(
    # Architecture (inherited from Fresnel v50)
    V=16, D=4, ps=4, hidden=384, depth=4, n_cross=2,
    stage_hidden=128, stage_V=64,

    # Training
    img_size=64,
    batch_size=256,
    lr=3e-4,
    epochs=50,
    ds_size=1280000,
    val_size=10000,

    # CV soft hand
    target_cv=0.2915,
    cv_weight=0.3,
    boost=0.5,
    sigma=0.15,

    # Checkpointing
    save_every=5,
    val_per_type_every=5,
)
```
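The four "CV soft hand" values work together as one regularizer. A minimal sketch of how such a soft coefficient-of-variation penalty could be wired up; the function name and exact weighting scheme are my assumptions, not the repo's code:

```python
import numpy as np

def cv_soft_hand(S, target_cv=0.2915, cv_weight=0.3, boost=0.5, sigma=0.15):
    """Hypothetical sketch: nudge the coefficient of variation (CV) of the
    singular values S toward target_cv. The Gaussian window of width sigma
    relaxes ("boosts") the penalty for samples already near the target."""
    S = np.asarray(S, dtype=float)
    cv = S.std(axis=-1) / (S.mean(axis=-1) + 1e-8)   # per-sample CV
    gap = cv - target_cv
    window = np.exp(-0.5 * (gap / sigma) ** 2)        # ~1 near the target
    return cv_weight * np.mean(gap ** 2 * (1.0 - boost * window))
```

In training this would run on the batch of S vectors alongside the reconstruction loss; the defaults mirror the CFG values above.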

```

======================================================================
SVAE v2 CONDUIT TRAINER - version2_v2_conduit_proto_2
======================================================================

Fresh PatchSVAEv2 from random init
Total params: 2,729,731

Dataset: 16 noise types, 1,280,000 samples/epoch
Image size: 64×64
Batch size: 256

Initial conduit profile:
  S: [2.512, 2.120, 1.776, 1.402]
  S_std: [0.1320, 0.1130, 0.1230, 0.1728]
  log_fric: [2.651, 4.560, 3.362, 2.203] ± [1.131, 1.041, 0.715, 0.704]
  fric_raw: mean=75.9 max=103510
  settle: [1.24, 2.26, 2.42, 1.00] (>2: 22.9%)
  char_c: [0.5088, -2.6330, 4.8154, -3.6899]
  refine: mean=6.46e-04 max=1.12e-03
  fric_cv: [4.4544, 1.8980, 2.1396, 3.3616]

Initial MSE (random decoder): 2.0875
======================================================================
Ep 1/50: 100%|████████████████████| 5000/5000 [06:11<00:00, 13.44it/s, mse=0.0257 cv=1.027]

ep 1 | recon=0.0848 val=0.0280 BEST | er=3.84 Sd=0.0954 cv=1.027 | 372s
  S: [2.609, 2.196, 1.677, 1.220]
  S_std: [0.1011, 0.1239, 0.1332, 0.1699]
  log_fric: [2.355, 3.985, 3.071, 2.151] ± [0.871, 0.664, 0.479, 0.662]
  fric_raw: mean=40.0 max=363409
  settle: [1.12, 2.29, 2.84, 1.00] (>2: 29.3%)
  char_c: [0.3458, -2.1159, 4.3382, -3.5587]
  refine: mean=6.49e-04 max=1.14e-03
  fric_cv: [4.1030, 0.7286, 1.8343, 3.0191]
  types: gaus=0.026 unif=0.012 unif=0.032 pois=0.010 pink=0.005 brow=0.006 salt=0.122 spar=0.018 bloc=0.012 grad=0.015 chec=0.010 mixe=0.013 stru=0.018 cauc=0.080 expo=0.024 lapl=0.045
💾 /content/version2_v2_conduit_proto_2_checkpoints/best.pt (29.4MB, ep1, MSE=0.028021)
No files have been modified since last commit. Skipping to prevent empty commit.
☁️ Pushed ep1
Ep 2/50: 100%|████████████████████| 5000/5000 [06:13<00:00, 13.40it/s, mse=0.0323 cv=1.000]

ep 2 | recon=0.0832 val=0.0327 | er=3.76 Sd=0.1165 cv=1.000 | 373s
  S: [2.707, 2.329, 1.400, 1.113]
  S_std: [0.0788, 0.0877, 0.1419, 0.1255]
  log_fric: [2.375, 3.785, 2.958, 2.175] ± [0.883, 0.508, 0.450, 0.681]
  fric_raw: mean=32.1 max=104752
  settle: [1.15, 2.26, 3.01, 1.00] (>2: 30.0%)
  char_c: [0.2042, -1.5304, 3.7516, -3.3900]
  refine: mean=6.46e-04 max=1.12e-03
  fric_cv: [4.2588, 0.8009, 1.4941, 3.3563]
  types: gaus=0.032 unif=0.011 unif=0.043 pois=0.008 pink=0.002 brow=0.002 salt=0.157 spar=0.020 bloc=0.007 grad=0.011 chec=0.005 mixe=0.012 stru=0.019 cauc=0.107 expo=0.029 lapl=0.060
Ep 3/50:  20%|████                | 1019/5000 [01:16<04:56, 13.45it/s, mse=0.0328 cv=0.906]
```

# V2 Redux - full decoder overhaul

Cascade bottlenecking didn't cut it; the decoder still bypassed the spectral specifications.

This next variation is going to be a bit excessive in terms of conduit adjudication.

Every single level of the hierarchy gets its own conduit, which amounts to a full encoder/decoder overhaul.
```
ENCODER (bottom → up):
  Level 0: 256 patches → MLP(384) → M(48×4) → SVD+conduit₀ → 256 tokens
  Level 1: group 2×2 → 64 cells → attend(4) → MLP(128) → M(16×4) → SVD+conduit₁ → 64 tokens
  Level 2: group 2×2 → 16 blocks → attend(4) → MLP(128) → M(16×4) → SVD+conduit₂ → 16 tokens
  Level 3: group 2×2 → 4 groups → attend(4) → MLP(128) → M(16×4) → SVD+conduit₃ → 4 tokens
  Top: cross-attention over 4 final tokens

SPECTRAL TOKEN (propagates between levels):
  [S(4), log_friction(4), settle(4), char_coeffs(4)] = 16 values
  S carries gradients. Conduit is detached. Difficulty trickles UP.

DECODER (top → down, with conduit skips):
  Level 3': 4 tokens → expand × 4 → inject conduit₃ → attend → 16 tokens
  Level 2': 16 tokens → expand × 4 → inject conduit₂ → attend → 64 tokens
  Level 1': 64 tokens → expand × 4 → inject conduit₁ → attend → 256 tokens
  Level 0': 256 tokens + stored (U₀, S₀, Vt₀, friction₀, settle₀, char_c₀) → MLP → pixels

CONDUIT AT EACH SCALE:
  Level 0: friction from pixel-level Gram decomposition (how hard were patches?)
  Level 1: friction from cell-level Gram decomposition (how hard were 2×2 interactions?)
  Level 2: friction from block-level decomposition (how hard were meso-structures?)
  Level 3: friction from global decomposition (how hard was the overall composition?)
```
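The token arithmetic in the hierarchy above is easy to sanity-check. A small sketch, using the CFG values (img_size=64, ps=4), of how each 2×2 grouping divides the token count by four:

```python
def level_token_counts(img_size=64, patch=4, levels=4):
    """Tokens per encoder level: level 0 has (img_size/patch)^2 patch tokens,
    and each 2x2 spatial grouping divides the count by 4."""
    tokens = (img_size // patch) ** 2
    counts = []
    for _ in range(levels):
        counts.append(tokens)
        tokens //= 4  # 2x2 grouping merges 4 tokens into 1
    return counts

print(level_token_counts())  # [256, 64, 16, 4]
```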

It's a bit excessive, but it may be required. Everything has to have a little impurity, otherwise it will not deviate.

It's not coincidental that so many of these structures lined up.

This MAY have removed too much SVD encoding at the baseline, but we'll see.
# V2 is blobby!

Time to go direct: I'm going to train the whole model with SVD-related paradigms internally, rather than trying to feed the model SVD.

You can call this decoder an inverse cascade decoder.
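A toy sketch of what "inverse cascade" means here, assuming each decoder level expands every token into 4 children (the attention and conduit-injection steps are omitted; this is an illustration, not the repo's implementation):

```python
import numpy as np

def inverse_cascade(tokens, levels=3):
    """Walk 4 -> 16 -> 64 -> 256 tokens by expanding each token x4 per level."""
    for _ in range(levels):
        tokens = np.repeat(tokens, 4, axis=0)  # expand x4; the real model attends here
    return tokens

out = inverse_cascade(np.zeros((4, 384)))  # 4 top tokens, hidden width 384
print(out.shape)  # (256, 384)
```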

# Deblobbing, the blob.

As SVAE v2's official structure dictates, the decoder must account for the newly introduced elements in order to decode correctly.

This is the first experiment, and it is currently proving that yes, the decoder can in fact learn to do so.

I've dubbed this freckled noise variation SVAE-Cadence, named for the difficulty cadence the decoder's attenuation structure needs to pick up before it can understand the orchestra's song.

Each of the new EIGH elements is specifically related to HOW WELL the model performed in the SVD calculation: how many iterations were required, how smooth the final structure was, and multiple other pieces.
```
THE DECODER RECEIVES:
  S[4]                    → magnitudes
  Vt[4×4]                 → orientations (sign-canonicalized)
  friction[4]             → conditioning per mode
  settle[4]               → convergence per mode
  char_coeffs[4]          → polynomial invariants
  extraction_order[4]     → spectral hierarchy
  refinement_residual[1]  → orthogonalization quality
  release_residual[1]     → round-trip fidelity

THE DECODER DOES NOT RECEIVE:
  M_hat = U @ diag(S) @ Vt → this is WITHHELD

THE DECODER MUST RECONSTRUCT PATCHES FROM THE
DECOMPOSED SPECTRAL REPRESENTATION + CONDUIT.
IT CANNOT SHORTCUT. EVERY ELEMENT IS LOAD-BEARING.
```
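Counting the listing above, the decoder's per-patch spectral packet is 38 numbers (4 + 16 + 4 + 4 + 4 + 4 + 1 + 1). A hedged sketch of packing it into one vector; `pack_packet` is my name for illustration, not a function from the repo:

```python
import numpy as np

def pack_packet(S, Vt, friction, settle, char_coeffs, order,
                refinement_residual, release_residual):
    """Flatten the spectral packet the decoder receives (M_hat is withheld)."""
    parts = [S, Vt.reshape(-1), friction, settle, char_coeffs, order,
             np.atleast_1d(refinement_residual), np.atleast_1d(release_residual)]
    return np.concatenate(parts)

packet = pack_packet(np.zeros(4), np.zeros((4, 4)), np.zeros(4), np.zeros(4),
                     np.zeros(4), np.zeros(4), 0.0, 0.0)
print(packet.shape)  # (38,)
```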

With the new structured EIGH-derived components, we now have a conduit for elemental extraction based on difficulty.
```
ENCODER (identical to v1, can copy weights from Fresnel):
  patch(48) → MLP(384) → residual blocks × 4 → M(48×4) → normalize

SVD + CONDUIT (always active):
  M → G = M^T M → FLEighConduit(G) → S, U, Vt, packet

CROSS-ATTENTION (identical to v1, can copy weights):
  S → SpectralCrossAttention × 2 → S_coordinated

CONDUIT DECODER (NEW - the forcing function):
  For each mode k=0,1,2,3:
    bundle_k = [U[:,k](48), S[k](1), Vt[k,:](4), friction[k](1),
                settle[k](1), char_coeff[k](1), order[k](1)]
    → ModeProcessor(57 → 384) → mode_hidden_k

  Fuse: [mode_0, mode_1, mode_2, mode_3, refine_res, release_res]
    → Linear(1538 → 384) → residual blocks × 4 → patch(48)
```
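The widths in the conduit-decoder listing can be checked arithmetically; a quick sketch under the shapes listed above:

```python
# Per-mode bundle fed to ModeProcessor: one U column (48), S_k (1),
# one Vt row (4), plus four per-mode conduit scalars.
bundle = 48 + 1 + 4 + 1 + 1 + 1 + 1
print(bundle)  # 57, matching ModeProcessor(57 -> 384)

# Fuse input: 4 mode_hidden vectors of width 384, plus the two residual scalars.
fuse_in = 4 * 384 + 2
print(fuse_in)  # 1538, matching Linear(1538 -> 384)
```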

This is SVAE-Cadence learning noise. Already capable.

https://huggingface.co/AbstractPhil/geolip-conduit-experiments/blob/main/svae_cadence.py

The code to train Cadence is included, as per usual.





I'll let it cook for a while.
# Thoughts

This is a repo dedicated to a series of experiments specifically meant to introduce direct learning of the complexity associated with the deconstruction of SVD and eigendecompositions.

The structure is highly complex in order to create an Omega solver whose learning can be transferred, framewise, to an adjacent solver. This will likely not work at first, or at second, or at 50th, but there is a prototype that I will be testing to the T.

The three-AI conversation helped get a starting point, but they provided less help than I expected. It's often better to just stick with one assistant, as the echo of the three tends to drown out positive or useful opinions from one or the other without having a judge intervene in every single exchange.

The trio ended up forming a bit of an echo-frame, which may work, but I will likely need to revamp the whole thing 2-3 more times before it can be extracted.