---
license: apache-2.0
tags:
- medical-imaging
- ct-generation
- flow-matching
- diffusion
- text-to-3d
- auto-regressive
---

# CTFlow: Video-Inspired Latent Flow Matching for 3D CT Synthesis

ICCV 2025 Workshop on Vision-Language Models for 3D Understanding (VLM3D)

[[Paper]](https://openaccess.thecvf.com/content/ICCV2025W/VLM3D/papers/Wang_CTFlow_Video-Inspired_Latent_Flow_Matching_for_3D_CT_Synthesis_ICCVW_2025_paper.pdf) \| [[GitHub]](https://github.com/WongJiayi/CTFlow)

---

## Overview

CTFlow is a 0.5B-parameter latent flow matching transformer that generates entire 3D CT volumes conditioned on clinical reports.

Key ideas:
- Uses the A-VAE from FLUX as the encoder/decoder that defines the latent space
- Encodes clinical reports with the CT-CLIP text encoder
- Generates CT volumes auto-regressively, block by block, which keeps memory use tractable while maintaining coherence across slices (the counterpart of temporal coherence in video)
- Trained on CT-RATE, a large-scale dataset of 3D CT volumes paired with clinical reports
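
The block-by-block scheme above can be illustrated with a toy sketch. It is not the CTFlow implementation: `toy_velocity` is a hypothetical stand-in for the trained denoiser, latents are plain lists of floats, the way the text embedding and the previous block are mixed into `context` is invented for illustration, and placing noise at t = 0 is an assumed convention. It only shows the control flow: the first block is sampled from text alone, and each later block is sampled from the text plus the previously generated block, so only one block is processed at a time.

```python
import random


def toy_velocity(x, t, context):
    """Hypothetical stand-in for the denoiser (NOT the CTFlow network).

    Returns the velocity of the straight-line path from x to `context`.
    """
    return [(context - xi) / (1.0 - t) for xi in x]


def sample_block(block_len, context, num_steps, rng):
    """Integrate the flow ODE with Euler steps, starting from Gaussian noise."""
    x = [rng.gauss(0.0, 1.0) for _ in range(block_len)]
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t runs over 0, dt, ..., 1 - dt, so 1 - t is never zero
        v = toy_velocity(x, t, context)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def generate_volume(text_emb, num_blocks, block_len, num_steps=8, seed=0):
    """Generate `num_blocks` blocks one after another and concatenate them."""
    rng = random.Random(seed)
    text_ctx = sum(text_emb) / len(text_emb)
    volume, prev = [], None
    for _ in range(num_blocks):
        if prev is None:
            context = text_ctx  # first block: text only
        else:
            # later blocks: text plus the previous block (illustrative mixing rule)
            context = 0.5 * text_ctx + 0.5 * (sum(prev) / len(prev))
        prev = sample_block(block_len, context, num_steps, rng)
        volume.extend(prev)
    return volume
```

Because each step only needs the text conditioning and the most recent block, peak memory in this sketch depends on `block_len` rather than on the length of the whole volume.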

---

## Checkpoint

This repository contains the pretrained STDiT-L2 checkpoint (512M parameters, trained for 680,000 steps):

```
checkpoint-680000/
└── denoiser_ema/ ← use this for inference
```
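
One way to fetch only the inference weights is sketched below with `huggingface_hub.snapshot_download`. The repository id `EnyaWoooo/CTFlow` is an assumption based on this model page's owner and name, and `ctflow_ckpt` is a hypothetical local directory; the subfolder name comes from the layout above.

```python
from pathlib import Path

REPO_ID = "EnyaWoooo/CTFlow"  # assumption: owner/name of this model page
EMA_SUBDIR = "checkpoint-680000/denoiser_ema"  # folder shown in the layout above


def ema_checkpoint_dir(local_root):
    """Return the expected local path of the EMA weights under `local_root`."""
    return Path(local_root) / EMA_SUBDIR


def download_ema_checkpoint(local_root="ctflow_ckpt"):
    """Download only the denoiser_ema folder and return its local path."""
    # Imported here so the path helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=[f"{EMA_SUBDIR}/*"],  # skip everything else in the repo
        local_dir=local_root,
    )
    return ema_checkpoint_dir(local_root)
```

The path returned by `download_ema_checkpoint()` is what the `--ckpt` argument of the inference script would point to.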

---

## Usage

See the [GitHub repository](https://github.com/WongJiayi/CTFlow) for full installation instructions, training configs, and inference scripts.

Quick inference:

```bash
git clone https://github.com/WongJiayi/CTFlow
cd CTFlow

python auto_regressive_generate/main.py \
  --config /path/to/config.yaml \
  --ckpt /path/to/checkpoint-680000/denoiser_ema \
  --embedding /path/to/ct_embedding.pt \
  --output output_frames/ \
  --type full-body
```
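
To run the same command over several report embeddings, a small wrapper can assemble the argument list for each one. This is a sketch, not part of the CTFlow repository: the helper names and the one-output-folder-per-embedding layout are assumptions, while the script path and flags are copied from the command above.

```python
from pathlib import Path


def build_generate_command(config, ckpt, embedding, output, volume_type="full-body"):
    """Build the argument list for auto_regressive_generate/main.py."""
    return [
        "python", "auto_regressive_generate/main.py",
        "--config", str(config),
        "--ckpt", str(ckpt),
        "--embedding", str(embedding),
        "--output", str(output),
        "--type", volume_type,
    ]


def commands_for_embeddings(config, ckpt, embeddings, output_root):
    """Return one command per embedding, each with its own output folder."""
    commands = []
    for emb in embeddings:
        # Hypothetical layout: <output_root>/<embedding file name without suffix>
        out_dir = (Path(output_root) / Path(emb).stem).as_posix()
        commands.append(build_generate_command(config, ckpt, emb, out_dir))
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True, cwd="CTFlow")`, assuming the repository was cloned into `CTFlow` as shown above.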

---

## Citation

```bibtex
@InProceedings{Wang_2025_ICCVW,
  author    = {Wang, Jiayi and Reynaud, Hadrien and Erick, Franciskus Xaverius and Kainz, Bernhard},
  title     = {CTFlow: Video-Inspired Latent Flow Matching for 3D CT Synthesis},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
  year      = {2025},
}
```