| --- |
| license: mit |
| language: |
| - en |
| tags: |
| - pytorch |
| - virtual-try-on |
| - viton |
| - image-to-image |
| - fashion |
| pipeline_tag: image-to-image |
| datasets: |
| - NguyenDinhHieu/VITON-Extends-DB |
| --- |
| |
| # VITON-Extends — Model weights & inference bundle |
|
|
| **Enhancing Pose Adaptability in Virtual Try-On Systems** |
|
|
| | Author | Affiliation | ORCID | |
| |--------|-------------|--------| |
| | **Nguyen Dinh Hieu** | FPT University, Hanoi, Vietnam | [0009-0002-6683-8036](https://orcid.org/0009-0002-6683-8036) | |
| | **Tran Minh Khuong** | FPT University, Hanoi, Vietnam | — | |
| | **Phan Duy Hung** | FPT University, Hanoi, Vietnam | [0000-0002-6033-6484](https://orcid.org/0000-0002-6033-6484) | |
|
|
| **Contact:** [hieundhe180318@fpt.edu.vn](mailto:hieundhe180318@fpt.edu.vn), [khuongtmhe180089@fpt.edu.vn](mailto:khuongtmhe180089@fpt.edu.vn), [hungpd2@fe.edu.vn](mailto:hungpd2@fe.edu.vn) |
|
|
| --- |
|
|
| ## What is in this Hub repository? |
|
|
| This **model** repo ships two top-level folders: |
|
|
| | Folder on Hub | Role | |
| |----------------|------| |
| | **`VITON-Extends_test/`** | Code and assets to run **inference / demo** (`test.py`, networks, options, etc.). | |
| | **`VITON-Extends-Train/`** | **Training-side** bundle (scripts, configs, and/or checkpoints, depending on what is packaged under that directory). | |
| |
| Download the full repo (or a subset with `allow_patterns`) and point your local paths to these folders as in the [GitHub README](https://github.com/nguyendinhhieu1309/VITON-Extends). |
| |
| The **dataset (images)** is hosted separately at **[NguyenDinhHieu/VITON-Extends-DB](https://huggingface.co/datasets/NguyenDinhHieu/VITON-Extends-DB)**; see that dataset card for `Train.zip` / `Test.zip` and extraction instructions. |
|
|
| --- |
|
|
| ## Abstract |
|
|
| Garment fitting in virtual try-on often fails under **complex poses**, **occlusions**, and **misalignment** between person and garment. VITON-Extends improves **pose adaptability** and **garment warping** by combining a **global appearance flow** model, **StyleGAN-style** global modulation, and a **local flow refinement** stage. The paper reports strong results on the VITON benchmark, especially for challenging poses. |
|
|
| **Paper:** [Springer LNCS (IUKM 2025), DOI 10.1007/978-981-96-4606-7_21](https://doi.org/10.1007/978-981-96-4606-7_21) |
| **Code:** [github.com/nguyendinhhieu1309/VITON-Extends](https://github.com/nguyendinhhieu1309/VITON-Extends) |
|
|
| --- |
|
|
| ## Quick download (Python) |
|
|
| ```python |
| from huggingface_hub import snapshot_download |
| |
| path = snapshot_download( |
| repo_id="NguyenDinhHieu/VITON-Extends", |
| local_dir="./VITON-Extends_hf", |
| ) |
| # Then use ./VITON-Extends_hf/VITON-Extends_test/ and ./VITON-Extends_hf/VITON-Extends-Train/ |
| ``` |
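
To fetch only one of the two folders, the `allow_patterns` argument of `snapshot_download` can restrict the download. The following is a minimal sketch, assuming the folder names listed in this card; the helper names `patterns_for` and `download_part` are hypothetical and not part of the repository.

```python
from pathlib import Path

REPO_ID = "NguyenDinhHieu/VITON-Extends"
# Top-level folder names as listed in this card.
FOLDERS = {"test": "VITON-Extends_test", "train": "VITON-Extends-Train"}


def patterns_for(part: str) -> list[str]:
    """Return the glob patterns that select one top-level folder of the repo."""
    return [f"{FOLDERS[part]}/*"]


def download_part(part: str, local_dir: str = "./VITON-Extends_hf") -> Path:
    """Download only the chosen folder and return its local path."""
    # Imported lazily so patterns_for() works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    root = snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=patterns_for(part),
        local_dir=local_dir,
    )
    return Path(root) / FOLDERS[part]
```

Calling `download_part("test")` should then leave the inference code under `./VITON-Extends_hf/VITON-Extends_test/` without pulling the training bundle.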
|
|
| --- |
|
|
| ## Environment (reference) |
|
|
| The versions below match the **paper / reference** setup. If the downloaded `VITON-Extends_test/` or `VITON-Extends-Train/` tree ships its own `requirements.txt`, prefer that file for exact pins. |
|
|
| | Component | Reference version | |
| |-----------|-------------------| |
| | PyTorch | 2.2.1+cu118 (example) | |
| | TorchVision | 0.17.1+cu118 | |
| | CuPy | 13.3.0 | |
| | OpenCV | 4.10.0 | |
| | Python | 3.12 (or as in project env) | |
|
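
A quick way to see how a local environment compares with the reference table is to query the installed package versions. This is only a sketch: the distribution names used as keys (for example `cupy-cuda11x` and `opencv-python`) are assumptions and may differ depending on how the packages were installed, and the comparison is a loose prefix match rather than a full version parse.

```python
from importlib import metadata

# Reference versions copied from the table above.
# The distribution names (keys) are assumptions; CuPy and OpenCV in
# particular are published under several different package names.
REFERENCE = {
    "torch": "2.2.1",
    "torchvision": "0.17.1",
    "cupy-cuda11x": "13.3.0",
    "opencv-python": "4.10.0",
}


def installed_version(dist):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


def compare(reference, lookup=installed_version):
    """Return {dist: (wanted, found, ok)}; ok means found starts with wanted."""
    report = {}
    for dist, wanted in reference.items():
        found = lookup(dist)
        # Prefix match so local tags such as "+cu118" are tolerated.
        ok = found is not None and found.startswith(wanted)
        report[dist] = (wanted, found, ok)
    return report


for dist, (wanted, found, ok) in compare(REFERENCE).items():
    status = "ok" if ok else "differs"
    print(f"{dist:16s} wanted {wanted:8s} found {found or 'not installed':16s} {status}")
```

A "differs" line does not necessarily mean the code will fail; it only flags a departure from the reference setup.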
|
| --- |
|
|
| ## Training & testing (outline) |
|
|
| 1. **Data:** Use **[VITON-Extends-DB](https://huggingface.co/datasets/NguyenDinhHieu/VITON-Extends-DB)** — unzip `Train.zip` / `Test.zip`, set `dataroot` to `train/` or `test/` as in the dataset card. |
| 2. **Checkpoints:** Place warping / generation weights where the GitHub repo expects (e.g. under `checkpoints/VITON-Extends/`). |
| 3. **Train:** Run the shell scripts under `scripts/` from the GitHub repository (parser-based then parser-free stages). |
| 4. **Test:** From the downloaded **`VITON-Extends_test/`** tree, follow repo instructions, e.g. |
| `python test.py --name demo --resize_or_crop None --batchSize 1 --gpu_ids 0` |
| |
| For **FID** and extra assets, see links in the [GitHub README](https://github.com/nguyendinhhieu1309/VITON-Extends). |
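
Step 4 above can also be scripted. The sketch below wraps the quoted `test.py` command in a small helper and runs it from inside the inference folder; the function names and the default `test_dir` path (taken from the quick-download example) are assumptions, and only the flags shown in step 4 are used.

```python
import subprocess
import sys
from pathlib import Path


def build_test_command(name="demo", batch_size=1, gpu_ids="0"):
    """Assemble the test.py invocation quoted in step 4 as an argument list."""
    return [
        sys.executable, "test.py",
        "--name", name,
        "--resize_or_crop", "None",
        "--batchSize", str(batch_size),
        "--gpu_ids", str(gpu_ids),
    ]


def run_test(test_dir="./VITON-Extends_hf/VITON-Extends_test", **kwargs):
    """Run test.py with the inference folder as working directory."""
    test_dir = Path(test_dir)
    if not (test_dir / "test.py").is_file():
        raise FileNotFoundError(f"test.py not found in {test_dir}")
    # check=True raises CalledProcessError if test.py exits with an error.
    return subprocess.run(build_test_command(**kwargs), cwd=test_dir, check=True)
```

Running with `cwd` set to the inference folder keeps any relative paths inside `test.py` resolving the same way as when the command is typed in that directory.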
|
|
| --- |
|
|
| ## Results (qualitative) |
|
|
|  |
|
|
| --- |
|
|
| ## Citation |
|
|
| ```bibtex |
| @inproceedings{hieu2025vitonextends, |
| title = {Enhancing Pose Adaptability in Virtual Try-On Systems}, |
| author = {Hieu, Nguyen Dinh and Khuong, Tran Minh and Hung, Phan Duy}, |
| booktitle = {Integrated Uncertainty in Knowledge Modelling and Decision Making (IUKM 2025)}, |
| series = {Lecture Notes in Computer Science}, |
| volume = {15585}, |
| publisher = {Springer}, |
| address = {Singapore}, |
| year = {2025}, |
| doi = {10.1007/978-981-96-4606-7_21} |
| } |
| ``` |
|
|
| --- |
|
|
| ## Acknowledgements |
|
|
| Built on virtual try-on and flow-based clothed-person generation ideas; base code lineage includes **ClothFlow**. Full credits appear in the [GitHub repository](https://github.com/nguyendinhhieu1309/VITON-Extends). |