---
license: apache-2.0
tags:
- wind-simulation
- cfd
- video-diffusion
- urban-design
- physics-informed
- surrogate-model
---

# WinDiNet: Pretrained Video Models as Differentiable Physics Simulators for Urban Wind Flows

[arXiv](https://arxiv.org/abs/2603.21210)
[Code](https://github.com/rbischof/windinet)
[Project page](https://rbischof.github.io/windinet_web/)

**WinDiNet** repurposes a 2-billion-parameter video diffusion transformer ([LTX-Video](https://github.com/Lightricks/LTX-Video)) as a fast, differentiable surrogate for computational fluid dynamics (CFD) simulations of urban wind patterns. Fine-tuned on 10,000 CFD simulations across procedurally generated building layouts, it generates complete **112-frame wind field rollouts in under one second**, over 2,000x faster than the ground truth CFD solver.

- **Physics-informed VAE decoder**: Fine-tuned with incompressibility and wall boundary losses for physically consistent velocity field reconstruction
- **Scalar conditioning**: Fourier-feature-encoded inlet speed and domain size replace text prompts, enabling precise physical parametrisation
- **Differentiable end-to-end**: Enables gradient-based inverse design of urban building layouts for pedestrian wind comfort
- **State-of-the-art accuracy**: Outperforms specialised neural operators (FNO, OFormer, Poseidon, U-Net) on vRMSE, spectral divergence, and Wasserstein distance

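The scalar-conditioning idea above can be sketched with a small Fourier feature encoder. This is an illustrative reconstruction only: the function name, band count, and frequency range are our assumptions, not the model's actual hyperparameters.

```python
import numpy as np

def fourier_features(x, num_bands=8, max_freq=100.0):
    """Encode a scalar (e.g. inlet speed in m/s) as sin/cos features over a
    geometric ladder of frequencies. Hyperparameters here are illustrative."""
    freqs = max_freq ** (np.arange(num_bands) / (num_bands - 1))  # 1 ... max_freq
    angles = 2.0 * np.pi * freqs * x
    return np.concatenate([np.sin(angles), np.cos(angles)])

feat = fourier_features(10.0)  # inlet_speed_mps = 10.0 -> 16-dim embedding
```

Embeddings of this form let the transformer resolve both small and large differences in a conditioning scalar, which plain scalar inputs struggle with.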

## Model Weights

This repository contains three checkpoint files:

| File | Description | Parameters | Size |
|------|-------------|------------|------|
| `dit.safetensors` | Fine-tuned diffusion transformer | 1.92B | 7.7 GB |
| `scalar_embedding.safetensors` | Fourier feature scalar conditioning module | 4.3M | 17 MB |
| `vae_decoder.safetensors` | Physics-informed VAE decoder | 553M | 2.2 GB |

### Download

Checkpoints are downloaded automatically when using the `windinet` package. For manual download:

```bash
# Using Hugging Face CLI
huggingface-cli download rabischof/windinet --local-dir checkpoints/

# Or individual files
huggingface-cli download rabischof/windinet dit.safetensors
huggingface-cli download rabischof/windinet scalar_embedding.safetensors
huggingface-cli download rabischof/windinet vae_decoder.safetensors
```

## Installation

```bash
git clone https://github.com/rbischof/windinet.git
cd windinet
pip install -e .
```

## Inference

Each input sample is a building footprint PNG (black = building, white = fluid) paired with a JSON file specifying inlet conditions:

```json
{"inlet_speed_mps": 10.0, "field_size_m": 1400}
```
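Reading such a sidecar takes only a few lines; the helper below is our own sketch (its name and validation are not part of the windinet package):

```python
import json
from pathlib import Path

def load_conditions(json_path):
    """Read a per-sample inlet-condition file and return (speed, size)."""
    cond = json.loads(Path(json_path).read_text())
    missing = {"inlet_speed_mps", "field_size_m"} - cond.keys()
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return float(cond["inlet_speed_mps"]), float(cond["field_size_m"])

Path("sample.json").write_text('{"inlet_speed_mps": 10.0, "field_size_m": 1400}')
print(load_conditions("sample.json"))  # (10.0, 1400.0)
```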

Run inference:

```bash
python scripts/inference.py configs/inference.yaml \
    --input_dir examples/footprints/ \
    --out_dir predictions/
```

Outputs per sample: a `.npz` file (u/v velocity fields in m/s, float16) and an `.mp4` (wind magnitude video).
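The `.npz` outputs can be inspected with NumPy. A sketch with a synthetic file (the key names `u` and `v` are an assumption; check `np.load(path).files` on a real prediction):

```python
import numpy as np

# Synthetic stand-in for one prediction: 112 frames of 256x256 u/v fields.
u = np.zeros((112, 256, 256), dtype=np.float16)
v = np.ones((112, 256, 256), dtype=np.float16)
np.savez("pred.npz", u=u, v=v)

data = np.load("pred.npz")
# Upcast to float32 before combining to avoid float16 precision loss.
speed = np.hypot(data["u"].astype(np.float32), data["v"].astype(np.float32))
print(speed.shape, float(speed.max()))  # (112, 256, 256) 1.0
```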

## Training

WinDiNet training has two stages:

**Stage 1: VAE decoder fine-tuning** with physics-informed losses (incompressibility + wall boundary conditions):

```bash
python scripts/finetune_vae.py configs/finetune_vae.yaml
```
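As an illustration of the incompressibility term (not WinDiNet's exact training loss), a 2-D penalty can be written as the mean squared divergence of the predicted velocity field:

```python
import numpy as np

def incompressibility_loss(u, v, dx=1.0):
    """Mean squared divergence of a 2-D velocity field (central differences).
    Illustrative sketch of a physics-informed penalty, not the exact loss."""
    du_dx = np.gradient(u, dx, axis=1)  # du/dx along columns
    dv_dy = np.gradient(v, dx, axis=0)  # dv/dy along rows
    return float(np.mean((du_dx + dv_dy) ** 2))

# A uniform flow is divergence-free, so its penalty vanishes.
print(incompressibility_loss(np.ones((64, 64)), np.zeros((64, 64))))  # 0.0
```

Minimising such a term during decoder fine-tuning pushes reconstructed fields toward mass conservation.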

**Stage 2: Diffusion transformer training** with scalar conditioning:

```bash
python scripts/train.py configs/windinet_scalar.yaml
```

See the [GitHub repository](https://github.com/rbischof/windinet) for dataset preparation and configuration details.

## Inverse Design

WinDiNet serves as a differentiable surrogate for gradient-based optimisation of building layouts:

```bash
python scripts/inverse_design.py configs/inverse_opt.yaml
```

The optimiser adjusts building positions to minimise a Pedestrian Wind Comfort (PWC) loss. The framework is extensible to custom objectives and building parametrisations; see `inverse/objective.py` and `inverse/footprint.py`.
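The loop itself is ordinary gradient descent through a differentiable objective. A toy sketch with a quadratic stand-in for the PWC loss (the real pipeline backpropagates through WinDiNet; every name here is hypothetical):

```python
def pwc_proxy(x):
    """Toy stand-in for a Pedestrian Wind Comfort loss over a 1-D building position."""
    return (x - 3.0) ** 2

def pwc_grad(x):
    """Analytic gradient of the toy objective (autodiff in the real pipeline)."""
    return 2.0 * (x - 3.0)

x = 0.0                      # initial building position
for _ in range(200):
    x -= 0.05 * pwc_grad(x)  # gradient descent step

print(round(x, 3))  # 3.0 (the comfort-optimal position)
```

Because the surrogate runs in under a second, each optimisation step is cheap enough to make such gradient loops practical at design time.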

## Citation

```bibtex
@article{perini2025windinet,
  title={Pretrained Video Models as Differentiable Physics Simulators for Urban Wind Flows},
  author={Perini, Janne and Bischof, Rafael and Arar, Moab and Duran, Ay{\c{c}}a and Kraus, Michael A. and Mishra, Siddhartha and Bickel, Bernd},
  journal={arXiv preprint arXiv:2603.21210},
  year={2026}
}
```

## Details

- **License**: Apache 2.0
- **Base model**: [LTX-Video 2B v0.9.6](https://huggingface.co/Lightricks/LTX-Video)
- **Training data**: 10,000 CFD simulations (256x256, 112 frames each)
- **arXiv**: [2603.21210](https://arxiv.org/abs/2603.21210)
- **Authors**: Janne Perini\*, Rafael Bischof\*, Moab Arar, Ayca Duran, Michael A. Kraus, Siddhartha Mishra, Bernd Bickel (\* equal contribution)
|