PINN SPD Experiments v4 — Causal Rank-One Weight Ablations

Testing whether SVD rank-one components of trained Physics-Informed Neural Networks (PINNs) decompose along physically meaningful axes, recoverable without supervision.

Repository: https://huggingface.co/b0sungk1m/pinn-spd-experiments


Research Question

Do SPD rank-one components of trained PINNs decompose along physically meaningful axes, recoverable without supervision?

Hypotheses

| ID | Hypothesis | Verdict |
|----|------------|---------|
| H1 (Fourier) | Dominant SPD components align with Fourier modes in proportion to solution energy | Partially Supported |
| H2 (Shock) | Shock-localized components appear exclusively in final layers | Inconclusive |
| H3 (Inverse) | Physics-enforcing components concentrate in early layers; data-fitting in late layers | Supported |

Methodology (v4)

Core: Causal Rank-One Weight Ablation

For each SVD component σᵢ · uᵢ vᵢᵀ of every weight matrix:

  1. Zero the component (reconstruct weight with σᵢ = 0)
  2. Measure Δ output MSE against ground truth
  3. The perturbation Δu = u_ablated − u_baseline is the component's spatial signature
  4. Analyze this signature for physical structure (Fourier modes, data-vs-physics loss)
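The ablation loop above can be sketched as follows. This is an illustrative sketch, not the repo's code: the helper `ablate_component` and the toy layer are assumptions introduced here.

```python
import torch

def ablate_component(W: torch.Tensor, i: int) -> torch.Tensor:
    """Reconstruct W with the i-th singular value zeroed (step 1: sigma_i = 0)."""
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    S = S.clone()
    S[i] = 0.0
    return U @ torch.diag(S) @ Vh

# Toy usage: score every rank-one component of one linear layer.
layer = torch.nn.Linear(4, 4, bias=False)
x = torch.randn(8, 4)
baseline = layer(x)
with torch.no_grad():
    for i in range(4):
        W_abl = ablate_component(layer.weight, i)
        delta = x @ W_abl.T - baseline       # step 3: Δu, the spatial signature
        mse = delta.pow(2).mean().item()     # step 2: Δ output MSE proxy
```

In the experiments the perturbed output is compared against ground truth rather than the baseline prediction; the structure of the loop is the same.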

v4 Improvements over v3

| Issue | v3 | v4 |
|-------|----|----|
| E1 baseline discriminability | Parabola ≈ sin(πx) on [0,1] (0.999 similarity) | Documented as non-discriminative; single-mode ablation added |
| E1 mode-specific alignment | No single-mode test | 4 single-mode PINNs (k = 1..4 forcing) test which k each dominant component tracks |
| E2 convergence | Shallow network, 8k epochs | ReduceLROnPlateau, 20k epochs, 5 seeds, L2 < 0.05 convergence gate |
| E3 training artifact | No forward-only control | Forward-only Burgers PINN shows the depth gradient is inverse-specific |

Results

H1 (Fourier): Partially Supported

PDE: Poisson u_xx = Σₖ₌₁⁴ sin(kπx), x ∈ [0,1], u(0)=u(1)=0

| Metric | Result |
|--------|--------|
| L2 error (3 seeds) | 0.0011 ± 0.0007 |
| Multi-mode top component → k=1 alignment | 0.96 |
| Spectral bias (k=1 dominates all modes) | ✅ Confirmed |
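A minimal sketch of the E1 physics loss for u_xx = Σₖ₌₁⁴ sin(kπx), using the [1,32,32,32,32,1] Tanh network from the architecture table. The boundary-condition loss and training loop are omitted; the code is illustrative, not the repo's implementation.

```python
import math
import torch

def poisson_residual(net: torch.nn.Module, x: torch.Tensor) -> torch.Tensor:
    """PDE residual u_xx - forcing at collocation points x."""
    x = x.requires_grad_(True)
    u = net(x)
    u_x = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x.sum(), x, create_graph=True)[0]
    forcing = sum(torch.sin(k * math.pi * x) for k in range(1, 5))
    return u_xx - forcing

net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)
x = torch.rand(300, 1)                  # 300 collocation points, as in E1
loss = poisson_residual(net, x).pow(2).mean()
```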

Single-Mode Ablation (key new result)

| k (forcing) | Best-aligned mode | Sim to k=1 | Sim to k=2 | Best-aligned mode ≠ k=1 (sim) |
|-------------|-------------------|------------|------------|-------------------------------|
| 1 | k=1 | 0.91 | 0.20 | |
| 2 | k=1 | 0.76 | 0.42 | k=2 (0.42) |
| 3 | k=1 | 0.68 | 0.40 | k=2 (0.40) |
| 4 | k=2 | 0.51 | 0.56 | k=2 (0.56) |

Interpretation: The dominant SVD component preferentially tracks k=1 regardless of forcing mode, consistent with spectral bias in neural networks. Only k=4 forcing produces a component that better tracks k=2. This means H1 alignment is real but dominated by spectral bias rather than pure energy-proportional decomposition.
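The alignment numbers in the table are cosine similarities between a component's spatial signature and the candidate Fourier modes. A sketch of that metric, with a synthetic `delta_u` standing in for the ablation-derived signature:

```python
import math
import torch

torch.manual_seed(0)

def mode_alignment(delta_u: torch.Tensor, x: torch.Tensor, k_max: int = 4):
    """|cosine similarity| between a spatial signature and sin(kπx), k = 1..k_max."""
    sims = []
    for k in range(1, k_max + 1):
        mode = torch.sin(k * math.pi * x)
        cos = torch.nn.functional.cosine_similarity(delta_u, mode, dim=0)
        sims.append(abs(cos.item()))
    return sims

# Synthetic signature: mostly k=1 plus noise. In the experiments delta_u
# comes from the rank-one ablation, not from a closed form.
x = torch.linspace(0, 1, 200)
delta_u = torch.sin(math.pi * x) + 0.1 * torch.randn(200)
sims = mode_alignment(delta_u, x)
best_k = 1 + max(range(len(sims)), key=lambda i: sims[i])
```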


H2 (Shock): Inconclusive

PDE: Burgers u_t + u·u_x = ν·u_xx, ν = 0.01/π
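The Burgers residual can be sketched as below; the network and collocation points are placeholders, and the training loop (ReduceLROnPlateau, 20k epochs) is omitted.

```python
import math
import torch

NU = 0.01 / math.pi  # viscosity from the PDE above

def burgers_residual(net: torch.nn.Module, tx: torch.Tensor) -> torch.Tensor:
    """Residual of u_t + u·u_x − ν·u_xx at collocation points tx = (t, x)."""
    tx = tx.requires_grad_(True)
    u = net(tx)
    g = torch.autograd.grad(u.sum(), tx, create_graph=True)[0]
    u_t, u_x = g[:, :1], g[:, 1:]
    u_xx = torch.autograd.grad(u_x.sum(), tx, create_graph=True)[0][:, 1:]
    return u_t + u * u_x - NU * u_xx

net = torch.nn.Sequential(torch.nn.Linear(2, 32), torch.nn.Tanh(),
                          torch.nn.Linear(32, 1))
r = burgers_residual(net, torch.rand(5, 2))
```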

| Seed | L2 error | Converged (L2 < 0.05) |
|------|----------|-----------------------|
| 42 | 0.089 | ❌ |
| 43 | 0.072 | ❌ |
| 44 | 0.062 | ❌ |
| 45 | 0.078 | ❌ |
| 46 | 0.095 | ❌ |

Interpretation: None of the 5 Burgers seeds converged below L2<0.05. The 4-layer Tanh MLP [2,32,32,32,32,1] is insufficient for Burgers shock dynamics even with ReduceLROnPlateau and 20k epochs. This is a genuine negative finding about architecture capacity, not a methodology failure.

Fix for future work: Use deeper/wider networks (8+ layers, 128 neurons), sinusoidal activations (SIREN), or Fourier feature embeddings.
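Of the suggested fixes, Fourier feature embeddings are the cheapest to try. A minimal sketch with a fixed Gaussian frequency matrix `B`; this is a proposal for future work, not code that ships with the repo, and the class name and scale are assumptions.

```python
import math
import torch

class FourierFeatures(torch.nn.Module):
    """Map x -> [sin(2πxB), cos(2πxB)] with fixed random frequencies B."""
    def __init__(self, in_dim: int, n_freq: int = 16, scale: float = 10.0):
        super().__init__()
        # Frequencies are sampled once and never trained.
        self.register_buffer("B", torch.randn(in_dim, n_freq) * scale)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        proj = 2 * math.pi * x @ self.B
        return torch.cat([torch.sin(proj), torch.cos(proj)], dim=-1)

embed = FourierFeatures(2)              # (t, x) input for Burgers
net = torch.nn.Sequential(embed,
                          torch.nn.Linear(32, 64), torch.nn.Tanh(),
                          torch.nn.Linear(64, 1))
u = net(torch.rand(8, 2))
```

The high-frequency features give the first linear layer direct access to oscillatory basis functions, which is the standard remedy for the spectral bias that also shows up in H1.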


H3 (Inverse): Supported ✅

PDE: Inverse heat equation u_t = ν·u_xx with learnable ν
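The inverse setup parameterizes the unknown ν as exp(log ν) to keep it positive, per "Learnable log(ν)" in the architecture table. A sketch, with class and method names invented here for illustration:

```python
import torch

class InverseHeat(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Sequential(torch.nn.Linear(2, 32), torch.nn.Tanh(),
                                       torch.nn.Linear(32, 1))
        # Learnable log(ν): optimized jointly with the network weights.
        self.log_nu = torch.nn.Parameter(torch.tensor(-5.0))

    def nu(self) -> torch.Tensor:
        return self.log_nu.exp()

    def residual(self, tx: torch.Tensor) -> torch.Tensor:
        """Residual of u_t − ν·u_xx at collocation points tx = (t, x)."""
        tx = tx.requires_grad_(True)
        u = self.net(tx)
        g = torch.autograd.grad(u.sum(), tx, create_graph=True)[0]
        u_t, u_x = g[:, :1], g[:, 1:]
        u_xx = torch.autograd.grad(u_x.sum(), tx, create_graph=True)[0][:, 1:]
        return u_t - self.nu() * u_xx
```

The total loss is the sum of this physics residual and a data-fitting MSE on observed u values; the two loss types are what the H3 component classification separates.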

| Metric | Result |
|--------|--------|
| True ν | 0.00318 |
| Learned ν | 0.00260 ± 0.00013 (~18% error) |
| Mean physics-type depth | 0.40 |
| Mean data-type depth | 1.79 |
| Depth gap | 1.39 layers |

| Layer | Physics fraction | Data fraction |
|-------|------------------|---------------|
| L0 | 0.50 | 0.50 |
| L1 | 0.33 | 0.67 |
| L2 | 0.00 | 1.00 |
| L3 | 0.00 | 1.00 |

Critical Validation: Forward-only Control

| Setup | Physics depth | Data depth | Gap |
|-------|---------------|------------|-----|
| Inverse heat PINN | 0.40 | 1.79 | 1.39 |
| Forward-only Burgers PINN | 1.33 | 0.67 | −0.67 |

Interpretation: In the inverse PINN, physics-type components concentrate in early layers and data-type components in late layers (depth gap = 1.39). In the forward-only PINN, physics-type components are spread across all layers (67–100% per layer, mean depth 1.33). This indicates the depth gradient is specific to inverse problems, where the data loss and physics loss compete differently, rather than a universal training artifact.
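The depth statistics in the table reduce to a mean layer index per component type. A sketch of that metric; the labels here are made up for illustration, while in the experiments they come from attributing each ablated component to the physics or data loss.

```python
def mean_depth(component_layers, labels, target):
    """Mean layer index over components whose label matches `target`."""
    depths = [layer for layer, lab in zip(component_layers, labels)
              if lab == target]
    return sum(depths) / len(depths)

# Illustrative classification of 8 components across 4 layers.
layers = [0, 0, 1, 1, 2, 2, 3, 3]
labels = ["phys", "phys", "phys", "data", "data", "data", "data", "data"]
gap = mean_depth(layers, labels, "data") - mean_depth(layers, labels, "phys")
```

A positive gap means data-type components sit deeper than physics-type ones, which is the H3 signature; the forward-only control's negative gap is the absence of that signature.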


Files

| File | Description |
|------|-------------|
| `pinn_spd_experiments_v4.py` | Main experiments (v4, current) |
| `results_v4.json` | Full numerical results (v4, current) |
| `burgers_fd_solver.py` | Reference finite-difference Burgers solver |
| `pinn_spd_experiments_v3.py` | v3 experiments (for reference) |
| `results_v3.json` | v3 results |
| `all_results_v3.png` | v3 plots |
| `pinn_spd_experiments_v2.py` | v2 experiments (for reference) |
| `results_v2.json` | v2 results |
| `pinn_spd_experiments.py` | v1 (deprecated) |

Running

```shell
pip install torch numpy scipy matplotlib
python pinn_spd_experiments_v4.py
```

Architecture

| Experiment | Network | Collocation points | Epochs | Special |
|------------|---------|--------------------|--------|---------|
| E1 Poisson | [1,32,32,32,32,1] | 300 | 2000 | Adam, lr=5e-3, 3 seeds |
| E1-ablation | [1,32,32,32,32,1] | 300 | 2000 | Single-mode forcing, k=1..4 |
| E2 Burgers | [2,32,32,32,32,1] | 2500 | 20000 | ReduceLROnPlateau, 5 seeds |
| E3 Inverse Heat | [2,32,32,32,32,1] | 2300 | 3000 | Learnable log(ν), 3 seeds |
| E3-validation | [2,32,32,32,32,1] | 2500 | 8000 | Forward-only, fixed ν |
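The layer lists in the Network column can be turned into a Tanh MLP with a small helper like the following; this is a sketch matching the table, not necessarily the repo's constructor.

```python
import torch

def make_mlp(sizes):
    """Build a Tanh MLP from a layer-size list, e.g. [2,32,32,32,32,1]."""
    layers = []
    for i, (n_in, n_out) in enumerate(zip(sizes[:-1], sizes[1:])):
        layers.append(torch.nn.Linear(n_in, n_out))
        if i < len(sizes) - 2:          # no activation after the output layer
            layers.append(torch.nn.Tanh())
    return torch.nn.Sequential(*layers)

net = make_mlp([2, 32, 32, 32, 32, 1])  # the E2/E3 (t, x) -> u network
```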

Key Takeaways

  1. H1 is real but dominated by spectral bias. The top SVD component preferentially tracks the smoothest mode (k=1), not necessarily the forced mode. Only k=4 forcing shifts the dominant alignment to k=2.

  2. H2 requires architecture upgrading. The 4-layer Tanh MLP cannot capture Burgers shocks. This is an honest negative result: the method works, but the PINN itself does not converge.

  3. H3 is the strongest result. The inverse PINN shows a clear depth stratification (physics early, data late) that is absent in the forward-only control. The control experiment is critical — without it, one could argue the depth gradient is just a training artifact.

  4. The forward-only control is the most important methodological contribution: it isolates the inverse problem's dual-loss structure as the cause of depth stratification, not general training dynamics.


License

MIT
