================================================================================
FLOW MATCH RELAY — FULL ANALYSIS TOOLKIT
Device: cuda
================================================================================
Loading weights: 100%  169/169 [00:00<00:00, 4115.48it/s, Materializing param=unet.time_emb.3.weight]
Params: 6,746,403 (relay: 76,384, 1.1%)
Relay modules: 2

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
TEST 1: Relay Diagnostics — Drift, Gates, Anchor Geometry
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

unet.mid_block1.relay:
  Patches: 16, Anchors/patch: 16, Patch dim: 16
  Drift (rad): mean=0.096787 std=0.031425 min=0.025191 max=0.204950
  Drift (deg): mean=5.55° max=11.74°
  Gates: mean=0.0594 std=0.0017 min=0.0560 max=0.0619
  Patch 0: anchor_cos mean=0.0443 max=0.6491 min=-0.5109
  Patch 1: anchor_cos mean=0.0329 max=0.6991 min=-0.6696
  Patch 2: anchor_cos mean=-0.0137 max=0.5337 min=-0.6256
  Patch 3: anchor_cos mean=0.0025 max=0.5802 min=-0.5198
  Near 0.29154: 0.0% of anchors within ±0.05
  Per-patch mean drift:
    Patch 0: 0.083389 rad (4.78°)
    Patch 1: 0.105304 rad (6.03°)
    Patch 2: 0.084447 rad (4.84°)
    Patch 3: 0.115432 rad (6.61°)
    Patch 4: 0.103072 rad (5.91°)
    Patch 5: 0.053514 rad (3.07°)
    Patch 6: 0.116086 rad (6.65°)
    Patch 7: 0.074704 rad (4.28°)
    Patch 8: 0.119158 rad (6.83°)
    Patch 9: 0.134659 rad (7.72°)
    Patch 10: 0.112943 rad (6.47°)
    Patch 11: 0.076762 rad (4.40°)
    Patch 12: 0.114448 rad (6.56°)
    Patch 13: 0.109745 rad (6.29°)
    Patch 14: 0.056048 rad (3.21°)
    Patch 15: 0.088878 rad (5.09°)

unet.mid_block2.relay:
  Patches: 16, Anchors/patch: 16, Patch dim: 16
  Drift (rad): mean=0.116423 std=0.031319 min=0.045233 max=0.246318
  Drift (deg): mean=6.67° max=14.11°
  Gates: mean=0.0658 std=0.0019 min=0.0627 max=0.0702
  Patch 0: anchor_cos mean=-0.0104 max=0.5409 min=-0.5740
  Patch 1: anchor_cos mean=0.0241 max=0.5879 min=-0.5389
  Patch 2: anchor_cos mean=0.0198 max=0.6510 min=-0.5917
  Patch 3: anchor_cos mean=0.0060 max=0.5366 min=-0.7307
  Near 0.29154: 0.4% of anchors within ±0.05
  Per-patch mean drift:
    Patch 0: 0.119493 rad (6.85°)
    Patch 1: 0.139048 rad (7.97°)
    Patch 2: 0.096759 rad (5.54°)
    Patch 3: 0.159487 rad (9.14°)
    Patch 4: 0.123855 rad (7.10°)
    Patch 5: 0.094995 rad (5.44°)
    Patch 6: 0.129639 rad (7.43°)
    Patch 7: 0.077257 rad (4.43°)
    Patch 8: 0.127187 rad (7.29°)
    Patch 9: 0.114844 rad (6.58°)
    Patch 10: 0.118358 rad (6.78°)
    Patch 11: 0.097721 rad (5.60°)
    Patch 12: 0.128776 rad (7.38°)
    Patch 13: 0.127976 rad (7.33°)
    Patch 14: 0.086075 rad (4.93°)
    Patch 15: 0.121296 rad (6.95°)

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
TEST 2: Bottleneck Feature Geometry — CV at the relay point
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

CV of bottleneck features at different timesteps:
     t  module                     CV  eff_d   norm
  0.00  unet.mid_block1.relay  0.5503   34.5  51.61
  0.00  unet.mid_block1        0.5316   34.5  51.61
  0.00  unet.mid_attn          0.4850   38.8  51.91
  0.00  unet.mid_block2.relay  0.5030   40.8  63.07
  0.00  unet.mid_block2        0.4623   40.8  63.07
  0.25  unet.mid_block1.relay  0.4621   43.1  52.02
  0.25  unet.mid_block1        0.4761   43.1  52.02
  0.25  unet.mid_attn          0.4429   47.2  52.35
  0.25  unet.mid_block2.relay  0.4353   48.7  63.47
  0.25  unet.mid_block2        0.4243   48.7  63.47
  0.50  unet.mid_block1.relay  0.4567   41.3  51.88
  0.50  unet.mid_block1        0.4775   41.3  51.88
  0.50  unet.mid_attn          0.4195   45.0  52.22
  0.50  unet.mid_block2.relay  0.4303   46.3  63.28
  0.50  unet.mid_block2        0.4266   46.3  63.28
  0.75  unet.mid_block1.relay  0.4972   35.2  51.85
  0.75  unet.mid_block1        0.5265   35.2  51.85
  0.75  unet.mid_attn          0.4821   39.5  52.07
  0.75  unet.mid_block2.relay  0.4743   40.7  63.12
  0.75  unet.mid_block2        0.4939   40.7  63.12
  1.00  unet.mid_block1.relay  0.5709   24.0  58.44
  1.00  unet.mid_block1        0.5983   24.0  58.44
  1.00  unet.mid_attn          0.5804   24.8  58.74
  1.00  unet.mid_block2.relay  0.6099   25.0  70.53
  1.00  unet.mid_block2        0.6065   25.0  70.53

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
TEST 3: Per-Class Anchor Utilization
Which anchors activate for each class?
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Nearest anchor distribution per class (Patch 0):
class   0    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15
plane   0%   0%   0%   0%   0%   0%   0%  22%█  0%   0%   0%   0%   0%   0%  77%█  0%
auto    0%   0%   0%   0%   0%   0%   0%  20%█  0%   0%   0%   0%   0%   0%  80%█  0%
bird    0%   0%   1%   0%   0%   0%   0%  42%█  0%   0%   0%   0%   0%   0%  57%█  0%
cat     0%   0%   0%   0%   0%   0%   0%  40%█  0%   0%   0%   0%   0%   0%  60%█  0%
deer    0%   0%   1%   0%   0%   0%   0%  58%█  0%   0%   0%   0%   0%   0%  41%█  0%
dog     0%   0%   0%   0%   0%   0%   0%  39%█  0%   0%   0%   0%   0%   0%  61%█  0%
frog    0%   0%   0%   0%   0%   0%   0%  58%█  0%   0%   0%   0%   0%   0%  42%█  0%
horse   0%   0%   0%   0%   0%   0%   0%  57%█  0%   0%   0%   0%   0%   0%  43%█  0%
ship    0%   0%   0%   0%   0%   0%   0%   2%   0%   0%   0%   0%   0%   0%  98%█  0%
truck   0%   0%   0%   0%   0%   0%   0%  54%█  0%   0%   0%   0%   0%   0%  46%█  0%

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
TEST 4: Gate Dynamics — do relay gates respond to timestep?
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Note: gates are learned parameters, not t-dependent.
Measuring relay output magnitude at different t instead.
     t  relay_Δ_norm  relay_Δ_cos  input_norm  output_norm
  0.00      116.3409   0.01136649      321.42       423.60
  0.10      116.6277   0.01213849      323.16       424.56
  0.25      116.7582   0.01237500      329.56       430.23
  0.50      117.6399   0.01344323      331.23       431.26
  0.75      118.0994   0.01383245      334.28       433.97
  0.90      118.1683   0.01333457      350.58       449.37
  1.00      119.9183   0.01262599      384.39       482.87

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
TEST 5: Generation Quality — Per-Class Diversity
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

class  intra_cos  intra_std      CV   norm
plane     0.8748     0.0821  0.6407  33.17
auto      0.8073     0.0672  0.3739  27.07
bird      0.8779     0.0694  0.6898  27.62
cat       0.7880     0.0993  0.5233  26.13
deer      0.8695     0.0630  0.5286  26.79
dog       0.8332     0.0703  0.4365  27.46
frog      0.8332     0.0711  0.5007  25.48
horse     0.8313     0.0810  0.4687  29.01
ship      0.8920     0.0519  0.5069  32.44
truck     0.8311     0.0585  0.3973  29.58

✓ Saved per-class grids to analysis/

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
TEST 6: Velocity Field — how does v_pred behave across t?
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

     t  v_norm   v_std  v·target  v_cos_t
  0.05   53.05  0.9593    0.8501   0.3426
  0.10   55.90  1.0099    0.8958   0.2447
  0.25   58.23  1.0513    0.9352   0.1561
  0.50   58.96  1.0641    0.9486   0.1258
  0.75   58.85  1.0615    0.9411   0.1444
  0.90   57.49  1.0360    0.9244   0.1876
  0.95   56.50  1.0187    0.9126   0.2168

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
TEST 7: Ablation — Relay ON vs OFF during generation
Disable relay gates, measure generation difference
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Relay ON  — mean pixel: 0.4528 std: 0.2227
Relay OFF — mean pixel: 0.4524 std: 0.2232
Pixel diff: 0.002449
Cosine sim: 0.999977
Max pixel Δ: 0.074433
✓ Saved analysis/relay_ablation.png (top=ON, bottom=OFF)

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
TEST 8: Anchor Constellation Structure
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

unet.mid_block1.relay:
  Home↔Current cos: mean=0.994831 min=0.979071
  Patch 0 anchor spread: mean_cos=0.0443 max_cos=0.6491 min_cos=-0.5109
  Patch 1 anchor spread: mean_cos=0.0329 max_cos=0.6991 min_cos=-0.6696
  Patch 2 anchor spread: mean_cos=-0.0137 max_cos=0.5337 min_cos=-0.6256
  Patch 3 anchor spread: mean_cos=0.0025 max_cos=0.5802 min_cos=-0.5198
  Patch 0 anchor eff_dim: 11.3 / 16
  Patch 1 anchor eff_dim: 11.9 / 16
  Patch 2 anchor eff_dim: 11.7 / 16
  Patch 3 anchor eff_dim: 11.5 / 16

unet.mid_block2.relay:
  Home↔Current cos: mean=0.992746 min=0.969817
  Patch 0 anchor spread: mean_cos=-0.0104 max_cos=0.5409 min_cos=-0.5740
  Patch 1 anchor spread: mean_cos=0.0241 max_cos=0.5879 min_cos=-0.5389
  Patch 2 anchor spread: mean_cos=0.0198 max_cos=0.6510 min_cos=-0.5917
  Patch 3 anchor spread: mean_cos=0.0060 max_cos=0.5366 min_cos=-0.7307
  Patch 0 anchor eff_dim: 11.8 / 16
  Patch 1 anchor eff_dim: 11.9 / 16
  Patch 2 anchor eff_dim: 12.0 / 16
  Patch 3 anchor eff_dim: 11.4 / 16

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
TEST 9: Sampling Trajectory — CV through ODE steps
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

  step     t  x_norm   x_std  CV_pixel
     0  1.00   54.32  0.9801    0.0110
     1  0.98   53.21  0.9602    0.0111
     5  0.90   48.88  0.8819    0.0123
    10  0.80   43.61  0.7869    0.0112
    20  0.60   33.95  0.6123    0.0241
    30  0.40   26.37  0.4764    0.0707
    40  0.20   22.88  0.4184    0.1773
    49  0.02   24.85  0.4587    0.2497

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
TEST 10: Inter-Class vs Intra-Class Separation
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Inter-class cosine similarity matrix:
       plan  auto  bird   cat  deer   dog  frog  hors  ship  truc
plane   1.0  0.99  0.99  0.98  0.98  0.97  0.98  0.99  1.00  0.98
auto   0.99   1.0  0.99  0.98  0.98  0.97  0.98  0.99  0.99  0.99
bird   0.99  0.99   1.0  0.99  0.99  0.99  0.99  1.00  0.99  0.98
cat    0.98  0.98  0.99   1.0  0.99  1.00  0.99  0.99  0.97  0.96
deer   0.98  0.98  0.99  0.99   1.0  0.99  1.00  0.99  0.97  0.96
dog    0.97  0.97  0.99  1.00  0.99   1.0  0.99  0.98  0.96  0.95
frog   0.98  0.98  0.99  0.99  1.00  0.99   1.0  0.99  0.97  0.97
horse  0.99  0.99  1.00  0.99  0.99  0.98  0.99   1.0  0.98  0.98
ship   1.00  0.99  0.99  0.97  0.97  0.96  0.97  0.98   1.0  0.99
truck  0.98  0.99  0.98  0.96  0.96  0.95  0.97  0.98  0.99   1.0

Intra-class cos: 0.8438 ± 0.0318
Inter-class cos: 0.8305 ± 0.0230
Separation ratio: 1.02×

================================================================================
ANALYSIS COMPLETE
================================================================================

Files saved to analysis/:
  - class_*.png: per-class generated samples
  - all_classes.png: 4 samples per class, 10 columns
  - relay_ablation.png: relay ON (top) vs OFF (bottom)

Key metrics to look for:
  1. Anchor drift → did any converge near 0.29154?
  2. Gate values → did they learn to open from init (0.047)?
  3. Per-class anchor utilization → class-specific routing?
  4. Relay ablation → does turning off the relay change generation?
  5. Intra/inter-class ratio → > 1.0 means classes are separable
  6. Velocity cosine → higher = better flow matching
  7. CV through ODE → how does geometry evolve during generation?
================================================================================
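
Key metric 5 compares intra-class and inter-class cosine similarity. The sketch below shows one way such a ratio could be computed; it is not the toolkit's own code. The function names (`cosine`, `separation_ratio`), the toy data, and the choice to average over sample pairs are all assumptions made for illustration. The only figures taken from the run are the two logged means, whose quotient matches the reported 1.02×.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def separation_ratio(samples_by_class):
    """Return (mean intra-class cos, mean inter-class cos, intra / inter).

    samples_by_class maps a class name to a list of feature vectors.
    Assumption: both means are taken over all sample pairs; the toolkit
    may aggregate differently (for example, per class first).
    """
    labeled = [(c, v) for c, vs in samples_by_class.items() for v in vs]
    intra, inter = [], []
    for (ca, va), (cb, vb) in combinations(labeled, 2):
        (intra if ca == cb else inter).append(cosine(va, vb))
    mean_intra = sum(intra) / len(intra)
    mean_inter = sum(inter) / len(inter)
    return mean_intra, mean_inter, mean_intra / mean_inter


# Hypothetical toy data, not taken from the run above.
toy = {
    "a": [[1.0, 0.0], [1.0, 0.2]],
    "b": [[0.0, 1.0], [0.2, 1.0]],
}
intra, inter, ratio = separation_ratio(toy)
print(f"intra={intra:.4f} inter={inter:.4f} ratio={ratio:.2f}x")

# Quotient of the two means logged in TEST 10.
reported_ratio = 0.8438 / 0.8305
print(f"{reported_ratio:.2f}x")  # prints 1.02x
```

A ratio only slightly above 1.0, as in this run, means that samples within a class are barely more similar to each other than to samples from other classes.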