# PID: Physics-Informed Diffusion Model for Infrared Image Generation
<img src="PID.png" alt="PID" style="zoom:50%;" />
## Update
* 2025/05: The paper was accepted by Pattern Recognition: https://doi.org/10.1016/j.patcog.2025.111816
* arXiv version: [2407.09299](https://arxiv.org/abs/2407.09299)
* We have released our code.
## Environment
We recommend creating the conda environment from `environment.yaml`.
```bash
conda env create --file=environment.yaml
```
## Datasets
Download the **KAIST** dataset from https://github.com/SoonminHwang/rgbt-ped-detection
Download the **FLIRv1** dataset from https://www.flir.com/oem/adas/adas-dataset-form/
We adopt the official dataset split in our experiments.
## Checkpoint
The VQGAN checkpoint can be downloaded from https://ommer-lab.com/files/latent-diffusion/vq-f8.zip (other GAN models can be downloaded from https://github.com/CompVis/latent-diffusion).
The TeVNet and PID checkpoints can be found on [Hugging Face](https://huggingface.co/FerrisMao/PID).
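As a convenience, the Hugging Face repository linked above can also be fetched from Python. The following is a minimal sketch, assuming the `huggingface_hub` package is installed (`pip install huggingface_hub`). The helper name `download_checkpoints` and the target directory `checkpoints/PID` are illustrative choices, not paths required by this repository, so check the config files for where the checkpoints are actually expected.

```python
def download_checkpoints(local_dir="checkpoints/PID", repo_id="FerrisMao/PID"):
    """Download every file in the Hugging Face model repo into `local_dir`.

    `local_dir` is a hypothetical location chosen for this example; move or
    symlink the files to wherever your configs point.
    """
    # Imported lazily so the function can be defined without the package.
    from huggingface_hub import snapshot_download

    # snapshot_download returns the local path of the downloaded snapshot.
    return snapshot_download(repo_id=repo_id, local_dir=local_dir)
```

Calling `download_checkpoints()` requires network access and returns the local folder containing the downloaded files.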
## Evaluation
Use the shell script below to run evaluation. `indir` is the input directory of visible RGB images, `outdir` is the output directory for the translated infrared images, and `config` is the chosen config in `configs/latent-diffusion/config.yaml`. We provide some RGB images in `dataset/KAIST` for quick evaluation.
```sh
bash run_test_kaist512_vqf8.sh
```
## Train
### Dataset preparation
Prepare paired RGB and infrared images in two separate directories, with each pair sharing the same file name.
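Before training, it can help to confirm that the two directories really are paired. The sketch below is not part of this repository. It assumes that "same name" means an identical file name including the extension, and that images sit directly inside each directory. The helper names and the extension list are illustrative.

```python
from pathlib import Path

# Extensions treated as images. This list is an assumption; adjust it to your data.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".bmp"}


def image_names(directory):
    """Return the set of image file names found directly inside `directory`."""
    return {
        p.name
        for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS
    }


def find_unpaired(rgb_dir, ir_dir):
    """Return (names only in rgb_dir, names only in ir_dir), each sorted."""
    rgb = image_names(rgb_dir)
    ir = image_names(ir_dir)
    return sorted(rgb - ir), sorted(ir - rgb)
```

For example, `only_rgb, only_ir = find_unpaired("path/to/rgb", "path/to/infrared")` (both paths are placeholders) should return two empty lists when every image has a counterpart.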
### Stage 1: Train TeVNet
```bash
cd TeVNet
bash shell/train.sh
```
### Stage 2: Train PID
To accelerate training, we recommend using our pretrained model.
```bash
bash shell/run_train_kaist512_vqf8.sh
```
## Acknowledgements
Our code is built upon [LDM](https://github.com/CompVis/latent-diffusion) and [HADAR](https://github.com/FanglinBao/HADAR). We thank the authors for their excellent work.
## Citation
If you find this work helpful in your research, please consider citing our paper:
```
@article{mao2026pid,
title={PID: physics-informed diffusion model for infrared image generation},
author={Mao, Fangyuan and Mei, Jilin and Lu, Shun and Liu, Fuyang and Chen, Liang and Zhao, Fangzhou and Hu, Yu},
journal={Pattern Recognition},
volume={169},
pages={111816},
year={2026},
publisher={Elsevier}
}
```
If you have any questions, feel free to contact maofangyuan23s@ict.ac.cn.