PrITTI: Primitive-based Generation of Controllable and Editable 3D Semantic Urban Scenes

This repository hosts the pre-trained checkpoints for PrITTI (CVPR 2026), a latent-diffusion framework for controllable and editable 3D semantic urban scene generation.

Existing approaches to 3D semantic urban scene generation predominantly rely on voxel-based representations. In contrast, PrITTI advocates for a primitive-based paradigm where urban scenes are represented using compact, semantically meaningful 3D elements that are easy to manipulate and compose. PrITTI achieves state-of-the-art 3D scene generation quality with lower memory requirements and faster inference than voxel-based methods.

Released Checkpoints

The checkpoints below were trained on KITTI-360.

File	Size	Description
`lvae.ckpt`	1.1 GB	Layout Variational Autoencoder, trained for 300 epochs (`epoch=299, step=580200`).
`ldm_b/`	773 MB	DiT-B Latent Diffusion Model in `diffusers`-pipeline format (`model_index.json` + `transformer/` + `decoder/` + `scheduler/`).

Quick Start

Full environment setup, preprocessing, training, inference, and evaluation instructions live in the official GitHub repository. The snippet below downloads both checkpoints into the locations the code expects:

# Set these before downloading (also documented in the main README).
# PRITTI_EXP_ROOT must already be set in your environment.
export LVAE_TIMESTAMP="2025.06.03.17.23.30"
export LVAE_EPOCH="299"
export LVAE_STEP="580200"

# LVAE checkpoint: download, then rename to the epoch/step filename the code expects
LVAE_DIR="$PRITTI_EXP_ROOT/exp/training_lvae_model/training_lvae_model/$LVAE_TIMESTAMP/checkpoints"
mkdir -p "$LVAE_DIR"
huggingface-cli download raniatze/pritti-checkpoints lvae.ckpt --local-dir "$LVAE_DIR"
mv "$LVAE_DIR/lvae.ckpt" "$LVAE_DIR/epoch=$LVAE_EPOCH-step=$LVAE_STEP.ckpt"

# LDM (DiT-B) checkpoint: download the ldm_b/ folder, then rename it to checkpoint/
LDM_DIR="$PRITTI_EXP_ROOT/exp/training_dit_model/training_dit_b_model/training_dit_b_model/$LVAE_TIMESTAMP"
mkdir -p "$LDM_DIR"
huggingface-cli download raniatze/pritti-checkpoints --include "ldm_b/*" --local-dir "$LDM_DIR"
mv "$LDM_DIR/ldm_b" "$LDM_DIR/checkpoint"

Once downloaded, follow the Inference section of the main README to reconstruct and generate scenes.
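
Before running inference, it can help to confirm that the files landed where the snippet above puts them. The following is a minimal sketch, not part of the PrITTI codebase: `check_pritti_layout` is a hypothetical helper, and the entries it checks (`model_index.json`, `transformer/`, `decoder/`, `scheduler/`) are taken from the checkpoint table above. It only tests that the paths exist and says nothing about whether the downloads are complete or uncorrupted.

```shell
# Hypothetical helper (an assumption, not provided by the PrITTI repository):
# checks that the expected checkpoint files and directories exist.
#   $1: path to the renamed LVAE checkpoint file
#   $2: path to the LDM "checkpoint" directory
check_pritti_layout() {
    lvae_ckpt=$1
    ldm_ckpt=$2
    status=0

    # The LVAE checkpoint is a single file.
    if [ ! -f "$lvae_ckpt" ]; then
        echo "missing file: $lvae_ckpt"
        status=1
    fi

    # The LDM checkpoint is a diffusers-style folder with an index file...
    if [ ! -f "$ldm_ckpt/model_index.json" ]; then
        echo "missing file: $ldm_ckpt/model_index.json"
        status=1
    fi

    # ...and one sub-directory per pipeline component.
    for sub in transformer decoder scheduler; do
        if [ ! -d "$ldm_ckpt/$sub" ]; then
            echo "missing directory: $ldm_ckpt/$sub"
            status=1
        fi
    done

    return "$status"
}
```

With the variables from the download snippet still set, a call such as `check_pritti_layout "$LVAE_DIR/epoch=$LVAE_EPOCH-step=$LVAE_STEP.ckpt" "$LDM_DIR/checkpoint" && echo "layout OK"` prints each missing path, or `layout OK` when everything is in place.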

License

Released under CC BY-NC 4.0 — free for academic and non-commercial research use. See LICENSE for full terms.

Citation

If you find PrITTI useful, please cite:

@inproceedings{Tze2026PrITTI,
    author    = {Tze, Christina Ourania and Dauner, Daniel and Liao, Yiyi and Tsishkou, Dzmitry and Geiger, Andreas},
    title     = {PrITTI: Primitive-based Generation of Controllable and Editable 3D Semantic Scenes},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year      = {2026},
}

Paper

PrITTI: Primitive-based Generation of Controllable and Editable 3D Semantic Urban Scenes (arXiv:2506.19117, published Jun 23, 2025)