# PIC-Flow

A physics-embedded flow-matching neural surrogate that replaces FDTD for full-field electromagnetic prediction of silicon photonic devices. Given a permittivity map ε(x,y), a source-port mask, and a free-space wavelength λ, PIC-Flow generates the complex field E_z in a single multi-step ODE integration, typically well under a second on a single A100 versus seconds to minutes for CPU FDTD.
This repo hosts the FM + phase + residual checkpoint from epoch 300 (the headline model from the paper). All training code, dataset-generation tooling, and inference notebooks live in the GitHub repo: Rizzo-Integrated-Photonic-Systems-Lab/PIC-Flow.
## Files

| Path | Description |
|---|---|
| `checkpoints/phase_residual_300.pt` | FM+phase+residual U-Net, epoch 300, ~1 GB. |
## Quick usage

```shell
pip install huggingface_hub torch numpy
```

```python
from huggingface_hub import hf_hub_download
import torch

ckpt_path = hf_hub_download(
    "RizzoLab/PIC-Flow",
    "checkpoints/phase_residual_300.pt",
)
ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=False)
# ckpt["state_dict"] -> model weights (real-valued U-Net, 63.3M params)
# ckpt["stats"]      -> field/permittivity normalization stats
# ckpt["args"]       -> training hyperparameters
```
End-to-end inference (load model, build conditioning, run the flow-matching sampler) is covered by `tools/predict_parametric_device.py` and `notebooks/03_inference.ipynb` in the GitHub repo.
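After downloading, a quick sanity check is to count the parameters in `ckpt["state_dict"]`, which should total roughly 63.3M. A minimal sketch of the counting helper, demonstrated here on a small stand-in module rather than the actual checkpoint:

```python
import torch.nn as nn

def count_params(state_dict):
    """Sum of scalar entries across all tensors in a state dict."""
    return sum(t.numel() for t in state_dict.values())

# Stand-in module; on the real file, count_params(ckpt["state_dict"])
# should come out near 63.3M.
demo = nn.Conv2d(4, 8, kernel_size=3)
n = count_params(demo.state_dict())  # 4*8*3*3 weights + 8 biases = 296
```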
## Model

- Architecture: real-valued U-Net, 63.3M parameters. Real and imaginary E_z components enter as separate input channels; the permittivity and source-mask maps are visible at every layer; the flow-matching integration time t and the wavelength λ enter as scalar conditioning inputs.
- Generative framework: conditional flow matching (Lipman et al., 2023). Inference integrates a learned velocity field from Gaussian noise to a physically valid E_z using Euler or Heun ODE steps.
- Physics constraint: masked Helmholtz residual loss L_res (PML, source, and dielectric-interface pixels excluded), with a per-sample compliance metric ρ_R = sqrt(L_res) × 100%.
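The residual loss in the last bullet can be illustrated with a discrete Helmholtz operator. A self-contained sketch using a five-point Laplacian on a periodic toy grid (not the paper's exact implementation): a plane wave in uniform permittivity satisfies the Helmholtz equation, so its masked residual is nearly zero, while a field with the wrong wavenumber is penalized.

```python
import numpy as np

def masked_residual(E, eps, k0, dx, mask):
    """Mean-squared Helmholtz residual |∇²E + k0² ε E|² over mask pixels."""
    lap = (np.roll(E, 1, 0) + np.roll(E, -1, 0)
           + np.roll(E, 1, 1) + np.roll(E, -1, 1) - 4 * E) / dx**2
    r = lap + k0**2 * eps * E
    return float(np.mean(np.abs(r[mask]) ** 2))

# Plane wave E = exp(i k0 x) in uniform eps = 1, one period across the grid.
N, dx = 200, 0.05
k0 = 2 * np.pi / (N * dx)
E = np.exp(1j * k0 * np.arange(N) * dx)[:, None] * np.ones((1, N))
res = masked_residual(E, np.ones((N, N)), k0, dx, np.ones((N, N), bool))
```

On this toy grid the corresponding compliance metric sqrt(res) × 100% is effectively 0%; only the discretization error of the Laplacian survives.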
## Training data
- 22,500 Meep FDTD simulations at λ = 1.55 µm
- Three device families: 2×2 MMIs, Y-branches, directional couplers (7,500 each)
- Latin-hypercube parameter sweeps over geometric variables per family
- 18,000 / 2,250 / 2,250 train / val / test split
Training: 300 epochs on 12 NVIDIA V100 GPUs, identical hyperparameters across the three ablation runs (FM only, FM+phase, FM+phase+residual).
## Performance
On the held-out test split (200-step Heun sampler):
| Device family | ρ_R |
|---|---|
| 2×2 MMI | 2.7% |
| Y-branch | 2.5% |
| Directional coupler | 2.2% |
Out-of-distribution (same checkpoint, geometries never seen during training):
| Device | ρ_R |
|---|---|
| Aggressive Euler S-bend (tight R, large offset) | 12% |
| Short, steep taper | 4.0% |
| Long, wide taper | 3.6% |
| Cascaded 1×3 Y-branch (new device class) | 9.1% |
Wall clock on a single NVIDIA A100 (fp16 autocast, vs. 16-thread Meep FDTD on the same node):
| Sampler | Wall time | Speedup | ρ_R |
|---|---|---|---|
| FDTD (reference) | 5.61 s | 1.0× | n/a |
| Euler 100 step | 2.19 s | 2.6× | 1.9% |
| Euler 20 step | 440 ms | 12.7× | 3.0% |
| Euler 5 step | 110 ms | 50.6× | 5.5% |
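The Euler rows above correspond to fixed-step integration of the learned velocity field from noise at t = 0 to the field at t = 1. A minimal sketch of such a sampler, with a toy velocity function standing in for the conditioned U-Net (the real model also conditions on ε, the source mask, and λ):

```python
import torch

@torch.no_grad()
def euler_sample(velocity, x0, n_steps):
    """Integrate dx/dt = v(x, t) from t=0 (noise) to t=1 with fixed Euler steps."""
    x, dt = x0, 1.0 / n_steps
    for i in range(n_steps):
        t = torch.full((x.shape[0],), i * dt)  # per-sample time conditioning
        x = x + dt * velocity(x, t)
    return x

# Toy linear velocity v(x, t) = -x; 100 steps contract x by (1 - 1/100)^100.
x1 = euler_sample(lambda x, t: -x, torch.ones(1, 2, 64, 64), n_steps=100)
```

Fewer steps trade accuracy for speed, which is exactly the Euler-100 vs Euler-5 trade-off in the table.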
## Citation

```bibtex
@article{Quaratiello2026PICFlow,
  author  = {Joseph Quaratiello and Anthony Rizzo},
  title   = {A Physics-Embedded Flow-Matching Model for Electromagnetic Prediction
             of Silicon Photonic Devices},
  journal = {arXiv},
  year    = {2026}
}
```
## License
MIT. See LICENSE in the GitHub repo.