Update README.md
README.md
CHANGED
|
@@ -1,3 +1,96 @@
|
|
| 1 |
-
---
|
| 2 |
-
license: cc-by-nc-sa-4.0
|
| 3 |
-
---
|
| 1 |
+
---
|
| 2 |
+
license: cc-by-nc-sa-4.0
|
| 3 |
+
---
|
| 4 |
+
|
| 5 |
+
|
| 6 |
+
|
| 7 |
+
# <center> 🐈 CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models
|
| 8 |
+
|
| 9 |
+
<p align="center">
|
| 10 |
+
<a href="https://github.com/Zheng-Chong/CatVTON">
|
| 11 |
+
<img src='https://img.shields.io/badge/arXiv-Paper(soon)-red?style=flat&logo=arXiv&logoColor=red' alt='arxiv'>
|
| 12 |
+
</a>
|
| 13 |
+
<a href="http://120.76.142.206:8888">
|
| 14 |
+
<img src='https://img.shields.io/badge/Demo-Gradio-orange?style=flat&logo=Gradio&logoColor=red' alt='Demo'>
|
| 15 |
+
</a>
|
| 16 |
+
<a href='https://huggingface.co/zhengchong/CatVTON'>
|
| 17 |
+
<img src='https://img.shields.io/badge/Hugging Face-ckpts-orange?style=flat&logo=HuggingFace&logoColor=orange' alt='huggingface'>
|
| 18 |
+
</a>
|
| 19 |
+
<a href="https://github.com/Zheng-Chong/CatVTON">
|
| 20 |
+
<img src='https://img.shields.io/badge/GitHub-Repo-blue?style=flat&logo=GitHub' alt='GitHub'>
|
| 21 |
+
</a>
|
| 22 |
+
<a href="https://github.com/Zheng-Chong/CatVTON/LICENCE"><img src='https://img.shields.io/badge/License-CC BY--NC--SA--4.0-lightgreen?style=flat&logo=Lisence' alt='License'>
|
| 23 |
+
</a>
|
| 24 |
+
</p>
|
| 25 |
+
|
| 26 |
+
<div align="center">
|
| 27 |
+
<img src="resource/img/teaser.jpg" width="100%" height="100%"/>
|
| 28 |
+
</div>
|
| 29 |
+
|
| 30 |
+
<!-- This repository is the official implementation of ***CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models***. -->
|
| 31 |
+
|
| 32 |
+
**CatVTON** is a simple and efficient virtual try-on diffusion model with ***1) Lightweight Network (899.06M parameters in total)***, ***2) Parameter-Efficient Training (49.57M trainable parameters)*** and ***3) Simplified Inference (< 8G VRAM for 1024x768 resolution)***.
|
| 33 |
+
|
| 34 |
+
|
| 35 |
+
## Updates
|
| 36 |
+
- **`2024/7/21`**: Our **Inference Code** and [**🤗Weights**](https://huggingface.co/zhengchong/CatVTON) are released.
|
| 37 |
+
- **`2024/7/11`**: [**Online Demo**](http://120.76.142.206:8888) is released.
|
| 38 |
+
|
| 39 |
+
|
| 40 |
+
|
| 41 |
+
## Inference
|
| 42 |
+
### Data Preparation
|
| 43 |
+
Before inference, you need to download the [VITON-HD](https://github.com/shadow2496/VITON-HD) or [DressCode](https://github.com/aimagelab/dress-code) dataset.
|
| 44 |
+
Once the datasets are downloaded, the folder structure should look like this:
|
| 45 |
+
```
|
| 46 |
+
├── VITON-HD
|
| 47 |
+
│   ├── test_pairs_unpaired.txt
|
| 48 |
+
│   ├── test
|
| 49 |
+
│   │   ├── image
|
| 50 |
+
│   │   │   ├── [000006_00.jpg | 000008_00.jpg | ...]
|
| 51 |
+
│   │   ├── cloth
|
| 52 |
+
│   │   │   ├── [000006_00.jpg | 000008_00.jpg | ...]
|
| 53 |
+
│   │   ├── agnostic-mask
|
| 54 |
+
│   │   │   ├── [000006_00_mask.png | 000008_00.png | ...]
|
| 55 |
+
...
|
| 56 |
+
```
|
| 57 |
+
For the DressCode dataset, we provide [our preprocessed agnostic masks](https://drive.google.com/drive/folders/1uT88nYQl0n5qHz6zngb9WxGlX4ArAbVX?usp=share_link). Download them and place them in the `agnostic_masks` folder under each category.
|
| 58 |
+
```
|
| 59 |
+
├── DressCode
|
| 60 |
+
│   ├── test_pairs_paired.txt
|
| 61 |
+
│   ├── test_pairs_unpaired.txt
|
| 62 |
+
│   ├── [dresses | lower_body | upper_body]
|
| 63 |
+
│   │   ├── test_pairs_paired.txt
|
| 64 |
+
│   │   ├── test_pairs_unpaired.txt
|
| 65 |
+
│   │   ├── images
|
| 66 |
+
│   │   │   ├── [013563_0.jpg | 013563_1.jpg | 013564_0.jpg | 013564_1.jpg | ...]
|
| 67 |
+
│   │   ├── agnostic_masks
|
| 68 |
+
│   │   │   ├── [013563_0.png | 013564_0.png | ...]
|
| 69 |
+
...
|
| 70 |
+
```
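
Before running inference, it can help to confirm that the dataset root matches the trees above. The following is a minimal sketch, not part of the released code: the helper `missing_entries` and the layout constants are hypothetical names introduced here, and the expected paths are copied from the folder structures shown in this section.

```python
from pathlib import Path

# Expected relative paths, taken from the folder trees in this README.
VITONHD_LAYOUT = [
    "test_pairs_unpaired.txt",
    "test/image",
    "test/cloth",
    "test/agnostic-mask",
]

DRESSCODE_CATEGORIES = ["dresses", "lower_body", "upper_body"]
DRESSCODE_LAYOUT = ["test_pairs_paired.txt", "test_pairs_unpaired.txt"] + [
    f"{category}/{entry}"
    for category in DRESSCODE_CATEGORIES
    for entry in (
        "test_pairs_paired.txt",
        "test_pairs_unpaired.txt",
        "images",
        "agnostic_masks",
    )
]


def missing_entries(root, layout):
    """Return the entries of `layout` that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in layout if not (root / rel).exists()]
```

An empty list from `missing_entries("<path>/VITON-HD", VITONHD_LAYOUT)` means every expected file and folder is present; any returned entries are the ones still to be downloaded or moved into place.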
|
| 71 |
+
|
| 72 |
+
### Inference on VITON-HD/DressCode
|
| 73 |
+
To run inference on the DressCode or VITON-HD dataset, run the following command. Checkpoints will be downloaded automatically from HuggingFace.
|
| 74 |
+
|
| 75 |
+
```shell
|
| 76 |
+
CUDA_VISIBLE_DEVICES=0 python inference.py \
|
| 77 |
+
--dataset [dresscode | vitonhd] \
|
| 78 |
+
--data_root_path <path> \
|
| 79 |
+
--output_dir <path> \
|
| 80 |
+
--dataloader_num_workers 8 \
|
| 81 |
+
--batch_size 8 \
|
| 82 |
+
--seed 555 \
|
| 83 |
+
--mixed_precision [no | fp16 | bf16] \
|
| 84 |
+
--allow_tf32 \
|
| 85 |
+
--repaint \
|
| 86 |
+
--eval_pair
|
| 87 |
+
```
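
When launching several runs from a script, the same command can be assembled in Python. This is only a sketch under stated assumptions: `build_inference_command` is a hypothetical helper, not part of the repository, and it uses only the flags and choices shown in the command above. The defaults for workers, batch size and seed mirror the example values.

```python
DATASETS = ("dresscode", "vitonhd")
PRECISIONS = ("no", "fp16", "bf16")


def build_inference_command(dataset, data_root_path, output_dir, *,
                            mixed_precision, num_workers=8, batch_size=8,
                            seed=555, allow_tf32=True, repaint=True,
                            eval_pair=True):
    """Build the argument list for inference.py (hypothetical helper)."""
    if dataset not in DATASETS:
        raise ValueError(f"dataset must be one of {DATASETS}")
    if mixed_precision not in PRECISIONS:
        raise ValueError(f"mixed_precision must be one of {PRECISIONS}")

    cmd = [
        "python", "inference.py",
        "--dataset", dataset,
        "--data_root_path", str(data_root_path),
        "--output_dir", str(output_dir),
        "--dataloader_num_workers", str(num_workers),
        "--batch_size", str(batch_size),
        "--seed", str(seed),
        "--mixed_precision", mixed_precision,
    ]
    # Boolean switches are appended only when enabled.
    for flag, enabled in (("--allow_tf32", allow_tf32),
                          ("--repaint", repaint),
                          ("--eval_pair", eval_pair)):
        if enabled:
            cmd.append(flag)
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, env={**os.environ, "CUDA_VISIBLE_DEVICES": "0"}, check=True)` to reproduce the shell invocation above.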
|
| 88 |
+
|
| 89 |
+
|
| 90 |
+
## Acknowledgement
|
| 91 |
+
Our code is built on [Diffusers](https://github.com/huggingface/diffusers). We use [SCHP](https://github.com/GoGoDuck912/Self-Correction-Human-Parsing/tree/master) and [DensePose](https://github.com/facebookresearch/DensePose) to automatically generate masks in our [Gradio](https://github.com/gradio-app/gradio) App. Thanks to all the contributors!
|
| 92 |
+
<!-- ## Citation
|
| 93 |
+
|
| 94 |
+
|
| 95 |
+
```
|
| 96 |
+
``` -->
|