	---
	license: mit
	language:
	- en
	tags:
	- pytorch
	- virtual-try-on
	- viton
	- image-to-image
	- fashion
	pipeline_tag: image-to-image
	datasets:
	- NguyenDinhHieu/VITON-Extends-DB
	---

	# VITON-Extends — Model weights & inference bundle

	Enhancing Pose Adaptability in Virtual Try-On Systems

	| Author | Affiliation | ORCID |
	|--------|-------------|-------|
	| Nguyen Dinh Hieu | FPT University, Hanoi, Vietnam | [0009-0002-6683-8036](https://orcid.org/0009-0002-6683-8036) |
	| Tran Minh Khuong | FPT University, Hanoi, Vietnam | n/a |
	| Phan Duy Hung | FPT University, Hanoi, Vietnam | [0000-0002-6033-6484](https://orcid.org/0000-0002-6033-6484) |

	Contact: [hieundhe180318@fpt.edu.vn](mailto:hieundhe180318@fpt.edu.vn), [khuongtmhe180089@fpt.edu.vn](mailto:khuongtmhe180089@fpt.edu.vn), [hungpd2@fe.edu.vn](mailto:hungpd2@fe.edu.vn)

	---

	## What is in this Hub repository?

	This model repo ships two top-level folders:

	| Folder on Hub | Role |
	|---------------|------|
	| `VITON-Extends_test/` | Code and assets to run inference / demo (`test.py`, networks, options, etc.). |
	| `VITON-Extends-Train/` | Training-side bundle (scripts, configs, or checkpoints, depending on what was packaged under that directory). |

	Download the full repo (or a subset with `allow_patterns`) and point your local paths to these folders as in the [GitHub README](https://github.com/nguyendinhhieu1309/VITON-Extends).

	Dataset (images) is hosted separately: [NguyenDinhHieu/VITON-Extends-DB](https://huggingface.co/datasets/NguyenDinhHieu/VITON-Extends-DB) (see that dataset card for `Train.zip` / `Test.zip` and extraction).
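
Once one of those archives is on disk, extraction can be sketched with the standard library alone. The `extract_archive` helper and the output directory are hypothetical names used for illustration; the dataset card remains the authority on the expected folder layout.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, out_dir):
    """Extract a dataset archive (e.g. Test.zip) and list what it unpacked."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out_dir)
    # Sorted names of the top-level entries now present in out_dir.
    return sorted(p.name for p in out_dir.iterdir())
```

For example, `extract_archive("Test.zip", "./data")` returns the top-level names found under `./data`, which is a quick way to confirm the folder that `dataroot` should point to.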

	---

	## Abstract

	Garment fitting in virtual try-on often fails under complex poses, occlusions, and misalignment between person and garment. VITON-Extends improves pose adaptability and garment warping by combining a global appearance flow model, StyleGAN-style global modulation, and a local flow refinement stage. On the VITON benchmark, the paper reports strong results, particularly for challenging poses.

	Paper: [Springer LNCS (IUKM 2025), DOI 10.1007/978-981-96-4606-7_21](https://doi.org/10.1007/978-981-96-4606-7_21)
	Code: [github.com/nguyendinhhieu1309/VITON-Extends](https://github.com/nguyendinhhieu1309/VITON-Extends)

	---

	## Quick download (Python)

	```python
	from huggingface_hub import snapshot_download

	path = snapshot_download(
	    repo_id="NguyenDinhHieu/VITON-Extends",
	    local_dir="./VITON-Extends_hf",
	)
	# Then use ./VITON-Extends_hf/VITON-Extends_test/ and ./VITON-Extends_hf/VITON-Extends-Train/
	```
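
To fetch only one of the two folders, the `allow_patterns` argument of `snapshot_download` can be used. The following is a minimal sketch: `folder_patterns` and `download_subset` are hypothetical helper names, and the glob form `<folder>/*` assumes the folder sits at the top level of the repo, as described above.

```python
def folder_patterns(*folders):
    """Build allow_patterns globs that select whole top-level folders."""
    return [f"{name.strip('/')}/*" for name in folders]


def download_subset(*folders, local_dir="./VITON-Extends_hf"):
    """Download only the named folders from the model repo."""
    # Imported lazily so folder_patterns works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="NguyenDinhHieu/VITON-Extends",
        local_dir=local_dir,
        allow_patterns=folder_patterns(*folders),
    )
```

Calling `download_subset("VITON-Extends_test")` would then pull just the inference bundle and skip the training folder.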

	---

	## Environment (reference)

	Versions below match the paper / reference setup. If the downloaded `VITON-Extends_test/` or `VITON-Extends-Train/` trees ship their own `requirements.txt`, prefer those for exact pins.

	| Component | Reference version |
	|-----------|-------------------|
	| PyTorch | 2.2.1+cu118 (example) |
	| TorchVision | 0.17.1+cu118 |
	| CuPy | 13.3.0 |
	| OpenCV | 4.10.0 |
	| Python | 3.12 (or as in project env) |
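
A local environment can be compared against the table above with a short script. This is a sketch only: the pip distribution names in `REFERENCE` are assumptions (CuPy in particular is often installed under a CUDA-specific name such as `cupy-cuda11x`), and the comparison is a simple prefix match after dropping any local build suffix like `+cu118`.

```python
from importlib import metadata

# Reference versions from the table. Distribution names are assumptions;
# adjust them to whatever your environment actually installs.
REFERENCE = {
    "torch": "2.2.1",
    "torchvision": "0.17.1",
    "cupy": "13.3.0",
    "opencv-python": "4.10.0",
}


def base_version(version):
    """Drop a local build suffix such as '+cu118'."""
    return version.split("+", 1)[0]


def check_environment(reference=REFERENCE):
    """Return {package: (installed or None, reference, matches)}."""
    report = {}
    for name, wanted in reference.items():
        try:
            installed = base_version(metadata.version(name))
        except metadata.PackageNotFoundError:
            installed = None
        # Prefix match, so "4.10.0.84" counts as matching "4.10.0".
        matches = installed is not None and installed.startswith(wanted)
        report[name] = (installed, wanted, matches)
    return report
```

Printing `check_environment()` gives one tuple per package, with `None` as the installed version for anything that is missing.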

	---

	## Training & testing (outline)

	1. Data: Use [VITON-Extends-DB](https://huggingface.co/datasets/NguyenDinhHieu/VITON-Extends-DB) — unzip `Train.zip` / `Test.zip`, set `dataroot` to `train/` or `test/` as in the dataset card.
	2. Checkpoints: Place warping / generation weights where the GitHub repo expects (e.g. under `checkpoints/VITON-Extends/`).
	3. Train: Run the shell scripts under `scripts/` from the GitHub repository (parser-based then parser-free stages).
	4. Test: From the downloaded `VITON-Extends_test/` tree, follow repo instructions, e.g.
	`python test.py --name demo --resize_or_crop None --batchSize 1 --gpu_ids 0`
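
The test step above can also be driven from Python. The sketch below only wraps the example command from step 4; `build_test_command` and `run_test` are hypothetical helpers, and the default `test_dir` assumes the download location used in the quick-download snippet.

```python
import subprocess
import sys
from pathlib import Path


def build_test_command(name="demo", gpu_ids="0", batch_size=1):
    """Assemble the test.py invocation from the outline as an argument list."""
    return [
        sys.executable, "test.py",
        "--name", name,
        "--resize_or_crop", "None",
        "--batchSize", str(batch_size),
        "--gpu_ids", str(gpu_ids),
    ]


def run_test(test_dir="./VITON-Extends_hf/VITON-Extends_test", **kwargs):
    """Run test.py from inside the downloaded inference folder."""
    test_dir = Path(test_dir)
    # Fail early with a clear message if the folder was not downloaded.
    if not (test_dir / "test.py").is_file():
        raise FileNotFoundError(f"test.py not found in {test_dir}")
    return subprocess.run(build_test_command(**kwargs), cwd=test_dir, check=True)
```

Using `sys.executable` rather than a bare `python` keeps the run inside the same interpreter and environment that launched the script.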

	For FID and extra assets, see links in the [GitHub README](https://github.com/nguyendinhhieu1309/VITON-Extends).

	---

	## Results (qualitative)

	![VITON-Extends results](https://github.com/user-attachments/assets/b9a9b46f-753e-485b-b9ad-156b7b588324)

	---

	## Citation

	```bibtex
	@inproceedings{hieu2025vitonextends,
	  title     = {Enhancing Pose Adaptability in Virtual Try-On Systems},
	  author    = {Hieu, Nguyen Dinh and Khuong, Tran Minh and Hung, Phan Duy},
	  booktitle = {Integrated Uncertainty in Knowledge Modelling and Decision Making (IUKM 2025)},
	  series    = {Lecture Notes in Computer Science},
	  volume    = {15585},
	  publisher = {Springer},
	  address   = {Singapore},
	  year      = {2025},
	  doi       = {10.1007/978-981-96-4606-7_21}
	}
	```

	---

	## Acknowledgements

	Built on virtual try-on and flow-based clothed-person generation ideas; base code lineage includes ClothFlow. Full credits appear in the [GitHub repository](https://github.com/nguyendinhhieu1309/VITON-Extends).