license: cc-by-nc-4.0
tags:
  - virtual-try-on
  - diffusers
  - stable-diffusion
  - image-to-image
datasets:
  - VITON-HD
  - DressCode
base_model:
  - stable-diffusion-v1-5/stable-diffusion-inpainting
pipeline_tag: image-to-image
language:
  - en
library_name: diffusers

Re-CatVTON

Official model weights for "Rethinking Garment Conditioning in Diffusion-based Virtual Try-On".

📄 Paper: Re-CatVTON (arXiv:2511.18775)
💻 Code: GitHub

Available Checkpoints

| Dataset | Subfolder | Resolution |
|---|---|---|
| VITON-HD | VITON-HD/checkpoint-16000/unet | 512×384 |
| DressCode | DressCode/checkpoint-32000/unet | 512×384 |
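
To avoid typos when switching between checkpoints, the table above can be mirrored in a small lookup. This is only a sketch: `CHECKPOINTS` and `checkpoint_info` are hypothetical names, not part of the repository, and reading "512×384" as height 512 by width 384 is an assumption, since the README does not state the order.

```python
# Hypothetical lookup table mirroring the checkpoint table above.
# Assumption: "512×384" is read as height 512, width 384 (portrait);
# the README does not state which dimension comes first.
CHECKPOINTS = {
    "VITON-HD": {"subfolder": "VITON-HD/checkpoint-16000/unet", "height": 512, "width": 384},
    "DressCode": {"subfolder": "DressCode/checkpoint-32000/unet", "height": 512, "width": 384},
}


def checkpoint_info(dataset: str) -> dict:
    """Return the subfolder and resolution for a dataset name, or raise a clear error."""
    try:
        return CHECKPOINTS[dataset]
    except KeyError:
        known = ", ".join(sorted(CHECKPOINTS))
        raise ValueError(f"Unknown dataset {dataset!r}; expected one of: {known}") from None


info = checkpoint_info("DressCode")
print(info["subfolder"])  # prints DressCode/checkpoint-32000/unet
```

The returned `subfolder` string can then be passed as the `subfolder` argument of `UNet2DConditionModel.from_pretrained`, as in the Usage section.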

Usage

import torch
from diffusers import AutoencoderKL, UNet2DConditionModel, DDPMScheduler
from model.pipeline import RECATVTONPipeline
from model.attn_processor import SkipAttnProcessor
from model.utils import init_adapter

device = "cuda"
dtype = torch.bfloat16

# Load components
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse").to(device, dtype)

# Load the UNet for one dataset (VITON-HD shown here)
unet = UNet2DConditionModel.from_pretrained(
    "levinna/Re-CatVTON",
    subfolder="VITON-HD/checkpoint-16000/unet"  # or "DressCode/checkpoint-32000/unet"
).to(device, dtype)

scheduler = DDPMScheduler.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-inpainting",  # or use the Re-CatVTON scheduler config
    subfolder="scheduler"
)

# Initialize attention processors (disable cross-attention)
init_adapter(unet, cross_attn_cls=SkipAttnProcessor)

# Create pipeline
pipeline = RECATVTONPipeline(vae=vae, unet=unet, scheduler=scheduler)
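
The `init_adapter` call above is described as disabling cross-attention. The repository's implementation is not reproduced here; the following is only a rough illustration of the idea. In diffusers UNets, self-attention layers are named `attn1` and cross-attention layers `attn2`, and `unet.attn_processors` is keyed by names ending in `attn1.processor` or `attn2.processor`. The sketch assigns a pass-through processor to the cross-attention names only. `NoOpProcessor`, `DefaultProcessor`, and `build_processor_map` are hypothetical names, and the code works on plain strings so it runs without loading a model.

```python
class NoOpProcessor:
    """Hypothetical stand-in for a 'skip' processor: returns its input unchanged."""

    def __call__(self, attn, hidden_states, encoder_hidden_states=None, **kwargs):
        return hidden_states


class DefaultProcessor:
    """Hypothetical placeholder for whatever processor self-attention keeps."""


def build_processor_map(names, default_factory, cross_attn_factory):
    """Assign the cross-attention factory to `attn2` layers and the default to the rest."""
    processors = {}
    for name in names:
        is_cross_attn = name.endswith("attn2.processor")
        processors[name] = cross_attn_factory() if is_cross_attn else default_factory()
    return processors


# Example names in the format used as keys of diffusers' `unet.attn_processors`.
example_names = [
    "down_blocks.0.attentions.0.transformer_blocks.0.attn1.processor",
    "down_blocks.0.attentions.0.transformer_blocks.0.attn2.processor",
]
processor_map = build_processor_map(
    example_names,
    default_factory=DefaultProcessor,
    cross_attn_factory=NoOpProcessor,
)
skipped = [name for name, proc in processor_map.items() if isinstance(proc, NoOpProcessor)]
print(skipped)
```

With a real model, a dictionary of this shape, holding actual diffusers attention processors, is what `unet.set_attn_processor(...)` accepts. Whether `init_adapter` works this way internally is an assumption.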

See the official GitHub repository for more detailed instructions.

License

This model is licensed under CC BY-NC 4.0 because the datasets used (VITON-HD, DressCode) are restricted to non-commercial use.

Citation

@article{na2025rethinking,
  title={Rethinking Garment Conditioning in Diffusion-based Virtual Try-On},
  author={Na, Kihyun and Choi, Jinyoung and Kim, Injung},
  journal={arXiv preprint arXiv:2511.18775},
  year={2025}
}