+---
+license: apache-2.0
+tags:
+  - medical-imaging
+  - ct-generation
+  - flow-matching
+  - diffusion
+  - text-to-3d
+  - auto-regressive
+---
+# CTFlow: Video-Inspired Latent Flow Matching for 3D CT Synthesis
+**ICCV 2025 Workshop on Vision-Language Models for 3D Understanding (VLM3D)**
+
+[[Paper]](https://openaccess.thecvf.com/content/ICCV2025W/VLM3D/papers/Wang_CTFlow_Video-Inspired_Latent_Flow_Matching_for_3D_CT_Synthesis_ICCVW_2025_paper.pdf) | [[GitHub]](https://github.com/WongJiayi/CTFlow)
+
+---
+## Overview
+CTFlow is a **0.5B-parameter latent flow matching transformer** that generates whole 3D CT volumes conditioned on clinical reports.
+
+Key ideas:
+- Uses the **FLUX A-VAE** as the latent space encoder/decoder
+- Encodes clinical reports with the **CT-CLIP text encoder**
+- Generates CT volumes **auto-regressively, block by block**, which keeps memory use tractable while maintaining coherence across slices (the counterpart of temporal coherence in video)
+- Trained on **CT-RATE**, a large-scale dataset of 3D CT volumes paired with clinical reports
+---
+## Checkpoint
+This repository contains the pretrained **STDiT-L2** checkpoint (512M parameters, trained for 680,000 steps):
+```
+checkpoint-680000/
+└── denoiser_ema/     ← use this for inference
+```
+---
+## Usage
+See the [GitHub repository](https://github.com/WongJiayi/CTFlow) for full installation instructions, training configs, and inference scripts.
+
+**Quick inference:**
+```bash
+git clone https://github.com/WongJiayi/CTFlow
+cd CTFlow
+python auto_regressive_generate/main.py \
+    --config /path/to/config.yaml \
+    --ckpt /path/to/checkpoint-680000/denoiser_ema \
+    --embedding /path/to/ct_embedding.pt \
+    --output output_frames/ \
+    --type full-body
+```
+---
+## Citation
+```bibtex
+@InProceedings{Wang_2025_ICCVW,
+    author    = {Wang, Jiayi and Reynaud, Hadrien and Erick, Franciskus Xaverius and Kainz, Bernhard},
+    title     = {CTFlow: Video-Inspired Latent Flow Matching for 3D CT Synthesis},
+    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
+    year      = {2025},
+}
+```