---
license: apache-2.0
tags:
- wind-simulation
- cfd
- video-diffusion
- urban-design
- physics-informed
- surrogate-model
---

# WinDiNet: Pretrained Video Models as Differentiable Physics Simulators for Urban Wind Flows

[arXiv](https://arxiv.org/abs/2603.21210)
[Code](https://github.com/rbischof/windinet)
[Project page](https://rbischof.github.io/windinet_web/)

**WinDiNet** repurposes a 2-billion-parameter video diffusion transformer ([LTX-Video](https://github.com/Lightricks/LTX-Video)) as a fast, differentiable surrogate for computational fluid dynamics (CFD) simulations of urban wind patterns. Fine-tuned on 10,000 CFD simulations across procedurally generated building layouts, it generates complete **112-frame wind field rollouts in under one second**, over 2,000x faster than the ground truth CFD solver.

- **Physics-informed VAE decoder**: Fine-tuned with incompressibility and wall boundary losses for physically consistent velocity field reconstruction
- **Scalar conditioning**: Fourier-feature-encoded inlet speed and domain size replace text prompts, enabling precise physical parametrisation
- **Differentiable end-to-end**: Enables gradient-based inverse design of urban building layouts for pedestrian wind comfort
- **State-of-the-art accuracy**: Outperforms specialised neural operators (FNO, OFormer, Poseidon, U-Net) on vRMSE, spectral divergence, and Wasserstein distance

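The scalar-conditioning idea above can be sketched with a small Fourier feature encoder. This is an illustrative reconstruction only: the function name, band count, and frequency range are our assumptions, not the model's actual hyperparameters.

```python
import numpy as np

def fourier_features(x, num_bands=8, max_freq=100.0):
    """Encode a scalar (e.g. inlet speed in m/s) as sin/cos features over a
    geometric ladder of frequencies. Hyperparameters here are illustrative."""
    freqs = max_freq ** (np.arange(num_bands) / (num_bands - 1))  # 1 ... max_freq
    angles = 2.0 * np.pi * freqs * x
    return np.concatenate([np.sin(angles), np.cos(angles)])

feat = fourier_features(10.0)  # inlet_speed_mps = 10.0 -> 16-dim embedding
```

Embeddings of this form let the transformer resolve both small and large differences in a conditioning scalar, which plain scalar inputs struggle with.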

## Model Weights

This repository contains three checkpoint files:

| File | Description | Parameters | Size |
|------|-------------|------------|------|
| `dit.safetensors` | Fine-tuned diffusion transformer | 1.92B | 7.7 GB |
| `scalar_embedding.safetensors` | Fourier feature scalar conditioning module | 4.3M | 17 MB |
| `vae_decoder.safetensors` | Physics-informed VAE decoder | 553M | 2.2 GB |

### Download

Checkpoints are downloaded automatically when using the `windinet` package. For manual download:

```bash
# Using Hugging Face CLI
huggingface-cli download rabischof/windinet --local-dir checkpoints/

# Or individual files
huggingface-cli download rabischof/windinet dit.safetensors
huggingface-cli download rabischof/windinet scalar_embedding.safetensors
huggingface-cli download rabischof/windinet vae_decoder.safetensors
```

## Installation

```bash
git clone https://github.com/rbischof/windinet.git
cd windinet
pip install -e .
```

## Inference

Each input sample is a building footprint PNG (black = building, white = fluid) paired with a JSON file specifying inlet conditions:

```json
{"inlet_speed_mps": 10.0, "field_size_m": 1400}
```
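Reading such a sidecar takes only a few lines; the helper below is our own sketch (its name and validation are not part of the windinet package):

```python
import json
from pathlib import Path

def load_conditions(json_path):
    """Read a per-sample inlet-condition file and return (speed, size)."""
    cond = json.loads(Path(json_path).read_text())
    missing = {"inlet_speed_mps", "field_size_m"} - cond.keys()
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return float(cond["inlet_speed_mps"]), float(cond["field_size_m"])

Path("sample.json").write_text('{"inlet_speed_mps": 10.0, "field_size_m": 1400}')
print(load_conditions("sample.json"))  # (10.0, 1400.0)
```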

Run inference:

```bash
python scripts/inference.py configs/inference.yaml \
    --input_dir examples/footprints/ \
    --out_dir predictions/
```

Outputs per sample: a `.npz` file (u/v velocity fields in m/s, float16) and an `.mp4` (wind magnitude video).
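The `.npz` outputs can be inspected with NumPy. A sketch with a synthetic file (the key names `u` and `v` are an assumption; check `np.load(path).files` on a real prediction):

```python
import numpy as np

# Synthetic stand-in for one prediction: 112 frames of 256x256 u/v fields.
u = np.zeros((112, 256, 256), dtype=np.float16)
v = np.ones((112, 256, 256), dtype=np.float16)
np.savez("pred.npz", u=u, v=v)

data = np.load("pred.npz")
# Upcast to float32 before combining to avoid float16 precision loss.
speed = np.hypot(data["u"].astype(np.float32), data["v"].astype(np.float32))
print(speed.shape, float(speed.max()))  # (112, 256, 256) 1.0
```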

## Training

WinDiNet training has two stages:

**Stage 1: VAE decoder fine-tuning** with physics-informed losses (incompressibility + wall boundary conditions):

```bash
python scripts/finetune_vae.py configs/finetune_vae.yaml
```
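As an illustration of the incompressibility term (not WinDiNet's exact training loss), a 2-D penalty can be written as the mean squared divergence of the predicted velocity field:

```python
import numpy as np

def incompressibility_loss(u, v, dx=1.0):
    """Mean squared divergence of a 2-D velocity field (central differences).
    Illustrative sketch of a physics-informed penalty, not the exact loss."""
    du_dx = np.gradient(u, dx, axis=1)  # du/dx along columns
    dv_dy = np.gradient(v, dx, axis=0)  # dv/dy along rows
    return float(np.mean((du_dx + dv_dy) ** 2))

# A uniform flow is divergence-free, so its penalty vanishes.
print(incompressibility_loss(np.ones((64, 64)), np.zeros((64, 64))))  # 0.0
```

Minimising such a term during decoder fine-tuning pushes reconstructed fields toward mass conservation.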

**Stage 2: Diffusion transformer training** with scalar conditioning:

```bash
python scripts/train.py configs/windinet_scalar.yaml
```

See the [GitHub repository](https://github.com/rbischof/windinet) for dataset preparation and configuration details.

## Inverse Design

WinDiNet serves as a differentiable surrogate for gradient-based optimisation of building layouts:

```bash
python scripts/inverse_design.py configs/inverse_opt.yaml
```

The optimiser adjusts building positions to minimise a Pedestrian Wind Comfort (PWC) loss. The framework is extensible to custom objectives and building parametrisations; see `inverse/objective.py` and `inverse/footprint.py`.
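The loop itself is ordinary gradient descent through a differentiable objective. A toy sketch with a quadratic stand-in for the PWC loss (the real pipeline backpropagates through WinDiNet; every name here is hypothetical):

```python
def pwc_proxy(x):
    """Toy stand-in for a Pedestrian Wind Comfort loss over a 1-D building position."""
    return (x - 3.0) ** 2

def pwc_grad(x):
    """Analytic gradient of the toy objective (autodiff in the real pipeline)."""
    return 2.0 * (x - 3.0)

x = 0.0                      # initial building position
for _ in range(200):
    x -= 0.05 * pwc_grad(x)  # gradient descent step

print(round(x, 3))  # 3.0 (the comfort-optimal position)
```

Because the surrogate runs in under a second, each optimisation step is cheap enough to make such gradient loops practical at design time.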

## Citation

```bibtex
@article{perini2025windinet,
  title={Pretrained Video Models as Differentiable Physics Simulators for Urban Wind Flows},
  author={Perini, Janne and Bischof, Rafael and Arar, Moab and Duran, Ay{\c{c}}a and Kraus, Michael A. and Mishra, Siddhartha and Bickel, Bernd},
  journal={arXiv preprint arXiv:2603.21210},
  year={2026}
}
```

## Details

- **License**: Apache 2.0
- **Base model**: [LTX-Video 2B v0.9.6](https://huggingface.co/Lightricks/LTX-Video)
- **Training data**: 10,000 CFD simulations (256x256, 112 frames each)
- **arXiv**: [2603.21210](https://arxiv.org/abs/2603.21210)
- **Authors**: Janne Perini\*, Rafael Bischof\*, Moab Arar, Ayca Duran, Michael A. Kraus, Siddhartha Mishra, Bernd Bickel (\* equal contribution)
|