zhengchong committed on
Commit a2794fc · verified · 1 Parent(s): 2c508e5

Update README.md

  1. README.md +96 -3
README.md CHANGED
@@ -1,3 +1,96 @@
1
- ---
2
- license: cc-by-nc-sa-4.0
3
- ---
1
+ ---
2
+ license: cc-by-nc-sa-4.0
3
+ ---
4
+
5
+
6
+
7
+ # <center> 🐈 CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models
8
+
9
+ <p align="center">
10
+ <a href="https://github.com/Zheng-Chong/CatVTON">
11
+ <img src='https://img.shields.io/badge/arXiv-Paper(soon)-red?style=flat&logo=arXiv&logoColor=red' alt='arxiv'>
12
+ </a>
13
+ <a href="http://120.76.142.206:8888">
14
+ <img src='https://img.shields.io/badge/Demo-Gradio-orange?style=flat&logo=Gradio&logoColor=red' alt='Demo'>
15
+ </a>
16
+ <a href='https://huggingface.co/zhengchong/CatVTON'>
17
+ <img src='https://img.shields.io/badge/Hugging Face-ckpts-orange?style=flat&logo=HuggingFace&logoColor=orange' alt='huggingface'>
18
+ </a>
19
+ <a href="https://github.com/Zheng-Chong/CatVTON">
20
+ <img src='https://img.shields.io/badge/GitHub-Repo-blue?style=flat&logo=GitHub' alt='GitHub'>
21
+ </a>
22
+ <a href="https://github.com/Zheng-Chong/CatVTON/LICENCE"><img src='https://img.shields.io/badge/License-CC BY--NC--SA--4.0-lightgreen?style=flat&logo=Lisence' alt='License'>
23
+ </a>
24
+ </p>
25
+
26
+ <div align="center">
27
+ <img src="resource/img/teaser.jpg" width="100%" height="100%"/>
28
+ </div>
29
+
30
+ <!-- This repository is the official implementation of ***CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models***. -->
31
+
32
+ **CatVTON** is a simple and efficient virtual try-on diffusion model with ***1) Lightweight Network (899.06M parameters in total)***, ***2) Parameter-Efficient Training (49.57M trainable parameters)*** and ***3) Simplified Inference (< 8G VRAM for 1024×768 resolution)***.
33
+
34
+
35
+ ## Updates
36
+ - **`2024/7/21`**: Our **Inference Code** and [**🤗 Weights**](https://huggingface.co/zhengchong/CatVTON) are released.
37
+ - **`2024/7/11`**: [**Online Demo**](http://120.76.142.206:8888) is released.
38
+
39
+
40
+
41
+ ## Inference
42
+ ### Data Preparation
43
+ Before inference, you need to download the [VITON-HD](https://github.com/shadow2496/VITON-HD) or [DressCode](https://github.com/aimagelab/dress-code) dataset.
44
+ Once the datasets are downloaded, the folder structures should look as follows:
45
+ ```
46
+ ├── VITON-HD
47
+ │   ├── test_pairs_unpaired.txt
48
+ │   ├── test
49
+ │   │   ├── image
50
+ │   │   │   ├── [000006_00.jpg | 000008_00.jpg | ...]
51
+ │   │   ├── cloth
52
+ │   │   │   ├── [000006_00.jpg | 000008_00.jpg | ...]
53
+ │   │   ├── agnostic-mask
54
+ │   │   │   ├── [000006_00_mask.png | 000008_00_mask.png | ...]
55
+ ...
56
+ ```
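
Before launching inference it can help to confirm that the VITON-HD folder matches the tree above. The following is a minimal sketch, not part of the CatVTON code base: the function name `missing_vitonhd_entries` and the constant `EXPECTED_VITONHD` are hypothetical, and the list of expected entries is copied from the tree shown here rather than from `inference.py`.

```python
from pathlib import Path

# Entries taken from the VITON-HD tree above, relative to the dataset root.
# This list is an assumption based on that listing, not read from inference.py.
EXPECTED_VITONHD = [
    "test_pairs_unpaired.txt",
    "test/image",
    "test/cloth",
    "test/agnostic-mask",
]


def missing_vitonhd_entries(root):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED_VITONHD if not (root / rel).exists()]
```

Calling `missing_vitonhd_entries("path/to/VITON-HD")` returns an empty list when every expected file and folder is present, and otherwise the relative paths that still need to be created or downloaded.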
57
+ For the DressCode dataset, we provide [our preprocessed agnostic masks](https://drive.google.com/drive/folders/1uT88nYQl0n5qHz6zngb9WxGlX4ArAbVX?usp=share_link). Download them and place them in the `agnostic_masks` folder under each category.
58
+ ```
59
+ ├── DressCode
60
+ │   ├── test_pairs_paired.txt
61
+ │   ├── test_pairs_unpaired.txt
62
+ │   ├── [dresses | lower_body | upper_body]
63
+ │   │   ├── test_pairs_paired.txt
64
+ │   │   ├── test_pairs_unpaired.txt
65
+ │   │   ├── images
66
+ │   │   │   ├── [013563_0.jpg | 013563_1.jpg | 013564_0.jpg | 013564_1.jpg | ...]
67
+ │   │   ├── agnostic_masks
68
+ │   │   │   ├── [013563_0.png | 013564_0.png | ...]
69
+ ...
70
+ ```
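
After copying the preprocessed masks into place, a quick check can confirm that each category is complete. The sketch below is illustrative only: `images_without_mask` is a hypothetical helper, and it assumes, based on the file names in the tree above, that only the `*_0.jpg` images need a mask and that each mask has the same stem with a `.png` extension.

```python
from pathlib import Path


def images_without_mask(category_dir):
    """List `*_0.jpg` images in `images/` with no same-named PNG in `agnostic_masks/`.

    Assumption (inferred from the listing above, not from the CatVTON code):
    masks exist only for files ending in `_0`.
    """
    category_dir = Path(category_dir)
    masks = category_dir / "agnostic_masks"
    return sorted(
        img.name
        for img in (category_dir / "images").glob("*_0.jpg")
        if not (masks / f"{img.stem}.png").exists()
    )
```

For example, `images_without_mask("path/to/DressCode/upper_body")` returns the names of images that still lack a mask, so an empty list means the category is ready.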
71
+
72
+ ### Inference on VITON-HD/DressCode
73
+ To run inference on the DressCode or VITON-HD dataset, run the following command. Checkpoints will be downloaded automatically from Hugging Face.
74
+
75
+ ```bash
76
+ CUDA_VISIBLE_DEVICES=0 python inference.py \
77
+ --dataset [dresscode | vitonhd] \
78
+ --data_root_path <path> \
79
+ --output_dir <path> \
80
+ --dataloader_num_workers 8 \
81
+ --batch_size 8 \
82
+ --seed 555 \
83
+ --mixed_precision [no | fp16 | bf16] \
84
+ --allow_tf32 \
85
+ --repaint \
86
+ --eval_pair
87
+ ```
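
If the command is launched from a Python script instead of a shell, the same flags can be assembled as an argument list, which avoids line-continuation mistakes. This is a rough sketch under stated assumptions: `build_inference_command` and `run_inference` are hypothetical helpers, the flag names and example values are copied from the command above, and nothing here has been checked against the argument parser in `inference.py`.

```python
import os
import subprocess

# Allowed values as listed in the command above.
DATASETS = ("dresscode", "vitonhd")
PRECISIONS = ("no", "fp16", "bf16")


def build_inference_command(dataset, data_root_path, output_dir, mixed_precision="no"):
    """Assemble the argument list for inference.py from the flags shown above."""
    if dataset not in DATASETS:
        raise ValueError(f"dataset must be one of {DATASETS}")
    if mixed_precision not in PRECISIONS:
        raise ValueError(f"mixed_precision must be one of {PRECISIONS}")
    return [
        "python", "inference.py",
        "--dataset", dataset,
        "--data_root_path", str(data_root_path),
        "--output_dir", str(output_dir),
        "--dataloader_num_workers", "8",
        "--batch_size", "8",
        "--seed", "555",
        "--mixed_precision", mixed_precision,
        "--allow_tf32",
        "--repaint",
        "--eval_pair",
    ]


def run_inference(cmd, gpu="0"):
    """Run the command with CUDA_VISIBLE_DEVICES set, as in the shell example."""
    env = {**os.environ, "CUDA_VISIBLE_DEVICES": gpu}
    return subprocess.run(cmd, env=env, check=True)
```

A call such as `run_inference(build_inference_command("vitonhd", "data/VITON-HD", "output"))` would then mirror the shell invocation; the two paths in this call are placeholders.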
88
+
89
+
90
+ ## Acknowledgement
91
+ Our code is modified from [Diffusers](https://github.com/huggingface/diffusers). We use [SCHP](https://github.com/GoGoDuck912/Self-Correction-Human-Parsing/tree/master) and [DensePose](https://github.com/facebookresearch/DensePose) to automatically generate masks in our [Gradio](https://github.com/gradio-app/gradio) app. Thanks to all the contributors!
92
+ <!-- ## Citation
93
+
94
+
95
+ ```
96
+ ``` -->