---
license: cc-by-nc-4.0
tags:
- virtual-try-on
- diffusers
- stable-diffusion
- image-to-image
datasets:
- VITON-HD
- DressCode
base_model:
- stable-diffusion-v1-5/stable-diffusion-inpainting
pipeline_tag: image-to-image
language:
- en
library_name: diffusers
---
# Re-CatVTON
Official model weights for **"Rethinking Garment Conditioning in Diffusion-based Virtual Try-On"**.

📄 **Paper**: [Re-CatVTON](https://arxiv.org/abs/2511.18775)

💻 **Code**: [GitHub](https://github.com/Levinna/Re-CatVTON)
## Available Checkpoints
| Dataset | Subfolder | Resolution |
|---------|-----------|------------|
| VITON-HD | `VITON-HD/checkpoint-16000/unet` | 512×384 |
| DressCode | `DressCode/checkpoint-32000/unet` | 512×384 |
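
As a convenience, the table above can be expressed as a small lookup. This is a minimal sketch: `CHECKPOINT_SUBFOLDERS` and `checkpoint_subfolder` are hypothetical names introduced here for illustration and are not part of the Re-CatVTON code base.

```python
# Hypothetical helper (not part of the Re-CatVTON repository): maps a dataset
# name to the UNet subfolder listed in the checkpoint table.
CHECKPOINT_SUBFOLDERS = {
    "VITON-HD": "VITON-HD/checkpoint-16000/unet",
    "DressCode": "DressCode/checkpoint-32000/unet",
}


def checkpoint_subfolder(dataset: str) -> str:
    """Return the UNet subfolder for the given dataset name."""
    try:
        return CHECKPOINT_SUBFOLDERS[dataset]
    except KeyError:
        known = ", ".join(sorted(CHECKPOINT_SUBFOLDERS))
        raise ValueError(
            f"Unknown dataset {dataset!r}; expected one of: {known}"
        ) from None
```

The returned string can be passed as the `subfolder` argument of `UNet2DConditionModel.from_pretrained("levinna/Re-CatVTON", ...)`.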
## Usage
```python
import torch
from diffusers import AutoencoderKL, UNet2DConditionModel, DDPMScheduler
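# The `model` package is assumed to come from the Re-CatVTON GitHub repository;
# run this from a checkout of that repository so the imports resolve.
from model.pipeline import RECATVTONPipeline
from model.attn_processor import SkipAttnProcessor
from model.utils import init_adapter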
device = "cuda"
dtype = torch.bfloat16
# Load components
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse").to(device, dtype)
# Choose one:
unet = UNet2DConditionModel.from_pretrained(
    "levinna/Re-CatVTON",
    subfolder="VITON-HD/checkpoint-16000/unet",  # or "DressCode/checkpoint-32000/unet"
).to(device, dtype)
scheduler = DDPMScheduler.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-inpainting",  # or use the Re-CatVTON scheduler config
    subfolder="scheduler",
)
# Initialize attention processors (disable cross-attention)
init_adapter(unet, cross_attn_cls=SkipAttnProcessor)
# Create pipeline
pipeline = RECATVTONPipeline(vae=vae, unet=unet, scheduler=scheduler)
```
For more detailed instructions, see the official [GitHub repository](https://github.com/Levinna/Re-CatVTON).
## License
This model is licensed under CC BY-NC 4.0 because it relies on non-commercial datasets (VITON-HD, DressCode).
- **Model Weights**: [CC-BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/)
- **Code**: [CC-BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/)
## Citation
```bibtex
@article{na2025rethinking,
title={Rethinking Garment Conditioning in Diffusion-based Virtual Try-On},
author={Na, Kihyun and Choi, Jinyoung and Kim, Injung},
journal={arXiv preprint arXiv:2511.18775},
year={2025}
}
```