---
license: mit
language:
- en
tags:
- pytorch
- virtual-try-on
- viton
- image-to-image
- fashion
pipeline_tag: image-to-image
datasets:
- NguyenDinhHieu/VITON-Extends-DB
---
# VITON-Extends — Model weights & inference bundle
**Enhancing Pose Adaptability in Virtual Try-On Systems**
| Author | Affiliation | ORCID |
|--------|-------------|--------|
| **Nguyen Dinh Hieu** | FPT University, Hanoi, Vietnam | [0009-0002-6683-8036](https://orcid.org/0009-0002-6683-8036) |
| **Tran Minh Khuong** | FPT University, Hanoi, Vietnam | — |
| **Phan Duy Hung** | FPT University, Hanoi, Vietnam | [0000-0002-6033-6484](https://orcid.org/0000-0002-6033-6484) |
**Contact:** [hieundhe180318@fpt.edu.vn](mailto:hieundhe180318@fpt.edu.vn), [khuongtmhe180089@fpt.edu.vn](mailto:khuongtmhe180089@fpt.edu.vn), [hungpd2@fe.edu.vn](mailto:hungpd2@fe.edu.vn)
---
## What is in this Hub repository?
This **model** repo ships two top-level folders:
| Folder on Hub | Role |
|----------------|------|
| **`VITON-Extends_test/`** | Code and assets to run **inference / demo** (`test.py`, networks, options, etc.). |
| **`VITON-Extends-Train/`** | **Training-side** bundle: the scripts, configs, and/or checkpoints packaged under that directory. |
Download the full repo (or a subset with `allow_patterns`) and point your local paths to these folders as in the [GitHub README](https://github.com/nguyendinhhieu1309/VITON-Extends).
**Dataset (images)** is hosted separately: **[NguyenDinhHieu/VITON-Extends-DB](https://huggingface.co/datasets/NguyenDinhHieu/VITON-Extends-DB)** (see that dataset card for `Train.zip` / `Test.zip` and extraction).
---
## Abstract
Garment fitting in virtual try-on often fails under **complex poses**, **occlusions**, and **misalignment** between person and garment. VITON-Extends improves **pose adaptability** and **garment warping** with a **global appearance flow** model, **StyleGAN-style** global modulation, and a **local flow refinement** stage. On the VITON benchmark, the method performs well, particularly on challenging poses.
**Paper:** [Springer LNCS (IUKM 2025), DOI 10.1007/978-981-96-4606-7_21](https://doi.org/10.1007/978-981-96-4606-7_21)
**Code:** [github.com/nguyendinhhieu1309/VITON-Extends](https://github.com/nguyendinhhieu1309/VITON-Extends)
---
## Quick download (Python)
```python
from huggingface_hub import snapshot_download
path = snapshot_download(
    repo_id="NguyenDinhHieu/VITON-Extends",
    local_dir="./VITON-Extends_hf",
)
# Then use ./VITON-Extends_hf/VITON-Extends_test/ and ./VITON-Extends_hf/VITON-Extends-Train/
```
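If only one of the two folders is needed, the download can be narrowed with `allow_patterns`. The following is a minimal sketch, assuming the folder names listed earlier in this card; the helpers `folder_patterns` and `download_folder` are illustrative names, not part of the repository, and the nested example path is made up to show how the glob behaves.

```python
from fnmatch import fnmatch


def folder_patterns(folder: str) -> list[str]:
    """Glob patterns that select every file under one top-level folder."""
    return [f"{folder.rstrip('/')}/*"]


def download_folder(folder: str, local_dir: str = "./VITON-Extends_hf") -> str:
    """Download a single top-level folder of the model repo (needs network access)."""
    # Imported lazily so folder_patterns() works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="NguyenDinhHieu/VITON-Extends",
        local_dir=local_dir,
        allow_patterns=folder_patterns(folder),
    )


# With fnmatch-style globbing, "*" also spans sub-directories.
# The path below is a hypothetical example, not a file known to exist in the repo.
example_path = "VITON-Extends_test/some_dir/some_file.py"
matches = fnmatch(example_path, folder_patterns("VITON-Extends_test")[0])

# Usage (fetches only the inference bundle):
# path = download_folder("VITON-Extends_test")
```

Restricting the download this way avoids pulling training-side files when only the demo is wanted.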
---
## Environment (reference)
Versions below match the **paper / reference** setup; your local `VITON-Extends_*` trees may ship their own `requirements.txt`—prefer those for exact pins.
| Component | Reference version |
|-----------|-------------------|
| PyTorch | 2.2.1+cu118 (example) |
| TorchVision | 0.17.1+cu118 |
| CuPy | 13.3.0 |
| OpenCV | 4.10.0 |
| Python | 3.12 (or as in project env) |
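To see how a local environment compares with the reference table, a small check such as the one below may help. It is a sketch under stated assumptions: the distribution names (`cupy-cuda11x`, `opencv-python`) are guesses at how the packages are typically installed and may differ in your setup, and the helper names are illustrative.

```python
from importlib import metadata
from typing import Callable, Dict, Optional

# Reference versions from the table above. Distribution names are assumptions:
# CuPy wheels are usually published per CUDA version, and OpenCV is commonly
# installed as "opencv-python".
REFERENCE = {
    "torch": "2.2.1",
    "torchvision": "0.17.1",
    "cupy-cuda11x": "13.3.0",
    "opencv-python": "4.10.0",
}


def base_version(version: str) -> str:
    """Drop a local build tag such as '+cu118'."""
    return version.split("+", 1)[0]


def installed_version(dist: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


def matches_reference(found: str, wanted: str) -> bool:
    """True if the leading version components of `found` equal those of `wanted`."""
    wanted_parts = wanted.split(".")
    return base_version(found).split(".")[: len(wanted_parts)] == wanted_parts


def compare(
    reference: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, str]:
    """Map each distribution to 'ok', 'missing', or 'differs (found X)'."""
    report = {}
    for dist, wanted in reference.items():
        found = lookup(dist)
        if found is None:
            report[dist] = "missing"
        elif matches_reference(found, wanted):
            report[dist] = "ok"
        else:
            report[dist] = f"differs (found {found})"
    return report


if __name__ == "__main__":
    for dist, status in compare(REFERENCE).items():
        print(f"{dist:>15}: {status}")
```

A "differs" entry is not necessarily a problem; a `requirements.txt` shipped with the code, if present, remains the better source for exact pins.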
---
## Training & testing (outline)
1. **Data:** Use **[VITON-Extends-DB](https://huggingface.co/datasets/NguyenDinhHieu/VITON-Extends-DB)** — unzip `Train.zip` / `Test.zip`, set `dataroot` to `train/` or `test/` as in the dataset card.
2. **Checkpoints:** Place the warping and generation weights where the GitHub repo expects them (e.g. under `checkpoints/VITON-Extends/`).
3. **Train:** Run the shell scripts under `scripts/` from the GitHub repository (parser-based then parser-free stages).
4. **Test:** From the downloaded **`VITON-Extends_test/`** tree, follow the repo instructions, e.g. `python test.py --name demo --resize_or_crop None --batchSize 1 --gpu_ids 0`
For **FID** and extra assets, see links in the [GitHub README](https://github.com/nguyendinhhieu1309/VITON-Extends).
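Steps 1 and 4 of the outline can be sketched in Python as follows. This is an illustration only: it assumes `Test.zip` sits at the root of the dataset repository, that the model repo was downloaded to `./VITON-Extends_hf`, and that `test.py` accepts exactly the flags quoted in step 4. The function names are hypothetical, and setting `dataroot` is left to the dataset card and GitHub instructions.

```python
import subprocess
import sys
import zipfile
from pathlib import Path
from typing import List


def fetch_and_extract(zip_name: str, out_dir: str = "./data") -> Path:
    """Download one archive from the dataset repo and unpack it (needs network access)."""
    # Imported lazily: only this function requires huggingface_hub.
    from huggingface_hub import hf_hub_download

    # Assumption: the archive is stored at the dataset repo root under this name.
    archive = hf_hub_download(
        repo_id="NguyenDinhHieu/VITON-Extends-DB",
        filename=zip_name,
        repo_type="dataset",
    )
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
    return target


def build_test_command(
    name: str = "demo", gpu_ids: str = "0", batch_size: int = 1
) -> List[str]:
    """Build the test invocation quoted in step 4 as an argument list."""
    return [
        sys.executable, "test.py",
        "--name", name,
        "--resize_or_crop", "None",
        "--batchSize", str(batch_size),
        "--gpu_ids", gpu_ids,
    ]


def run_test(test_dir: str = "./VITON-Extends_hf/VITON-Extends_test") -> None:
    """Run test.py from inside the downloaded inference folder."""
    subprocess.run(build_test_command(), cwd=test_dir, check=True)


# Usage:
# fetch_and_extract("Test.zip")  # then set dataroot as described in the dataset card
# run_test()
```

Passing the command as a list and setting `cwd` keeps relative paths inside `test.py` resolving against the inference folder, which is what running the command by hand from that directory would do.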
---
## Results (qualitative)
![VITON-Extends results](https://github.com/user-attachments/assets/b9a9b46f-753e-485b-b9ad-156b7b588324)
---
## Citation
```bibtex
@inproceedings{hieu2025vitonextends,
title = {Enhancing Pose Adaptability in Virtual Try-On Systems},
author = {Hieu, Nguyen Dinh and Khuong, Tran Minh and Hung, Phan Duy},
booktitle = {Integrated Uncertainty in Knowledge Modelling and Decision Making (IUKM 2025)},
series = {Lecture Notes in Computer Science},
volume = {15585},
publisher = {Springer},
address = {Singapore},
year = {2025},
doi = {10.1007/978-981-96-4606-7_21}
}
```
---
## Acknowledgements
Built on virtual try-on and flow-based clothed-person generation ideas; base code lineage includes **ClothFlow**. Full credits appear in the [GitHub repository](https://github.com/nguyendinhhieu1309/VITON-Extends).