# Multi-Level Conditioning by Pairing Localized Text and Sketch for Fashion Image Generation
This is the official implementation of the LOTS adapter from the paper "Multi-Level Conditioning by Pairing Localized Text and Sketch for Fashion Image Generation", an extension of our prior work presented at ICCV 2025.
To access the Sketchy dataset, refer to the HuggingFace repository.
## Roadmap
- Code release
- Weights release
- Platform release
## Repository Structure
`ckpts` folder
- Contains the pre-trained weights of the LOTS adapter.

`scripts` folder
- Contains all the scripts for training and inference with LOTS on Sketchy.

`src` folder
- Contains all the source code for the classes, models, and dataloaders used in the scripts.
## Installation
Clone the repository:

```shell
git clone https://huggingface.co/zyyyy/lots-extension
cd lots-extension
```
We advise creating a Conda environment as follows:

```shell
conda create -n lots python=3.12
conda activate lots
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt
pip install -e .
```
## Training
We provide the script to train LOTS on our Sketchy dataset in `scripts/lots/train_lots.py`.
For an example of usage, check `run_train.sh`, which contains the default parameters used in our experiments.
## Inference
You can test our pre-trained model with the inference script in `scripts/lots/inference_lots.py`.
For an example, check `run_inference.sh`.
This script generates an image for each item in the test split of Sketchy and saves the images in a structured folder, with each item identified by its unique ID.
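For orientation, the item-ID-based output layout described above can be sketched as follows. This is an illustrative helper only: the `save_path` function and the `<item_id>.png` naming scheme are assumptions, and the actual layout produced by `scripts/lots/inference_lots.py` may differ.

```python
import os
import tempfile


def save_path(output_dir: str, item_id: str, ext: str = "png") -> str:
    """Build the output path for one generated image, keyed by its item ID.

    Hypothetical sketch: the real inference script may use a different
    naming scheme or nested per-item subfolders.
    """
    os.makedirs(output_dir, exist_ok=True)
    return os.path.join(output_dir, f"{item_id}.{ext}")


# Example: one file per test-split item, identified by its unique ID.
out_dir = tempfile.mkdtemp()
path = save_path(out_dir, "item_0421")
print(os.path.basename(path))  # item_0421.png
```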
## Citation

If you find our work useful, please cite:
```bibtex
@article{liu2026multi,
  title={Multi-Level Conditioning by Pairing Localized Text and Sketch for Fashion Image Generation},
  author={Liu, Ziyue and Talon, Davide and Girella, Federico and Ruan, Zanxi and Mondo, Mattia and Bazzani, Loris and Wang, Yiming and Cristani, Marco},
  journal={arXiv preprint arXiv:2602.18309},
  year={2026}
}
```