---
license: apache-2.0
task_categories:
  - image-retrieval
tags:
  - composed-image-retrieval
  - pytorch
  - icassp-2025
---

(ICASSP 2025) MEDIAN: Adaptive Intermediate-grained Aggregation Network for Composed Image Retrieval

Qinlei Huang¹, Zhiwei Chen¹, Zixu Li¹, Chunxiao Wang², Xuemeng Song³, Yupeng Hu¹✉, Liqiang Nie⁴

¹School of Software, Shandong University
²Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Qilu University of Technology (Shandong Academy of Sciences)
³School of Computer Science and Technology, Shandong University
⁴School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen)
✉ Corresponding author

This repository hosts the official pre-trained checkpoints for MEDIAN, a composed image retrieval framework that adaptively aggregates intermediate-grained features and performs target-guided semantic alignment to better compose reference images and modification texts.

📌 Model Information

1. Model Name

MEDIAN (Adaptive Intermediate-grained Aggregation Network for Composed Image Retrieval).

2. Task Type & Applicable Tasks

Task Type: Composed Image Retrieval (CIR).
Applicable Tasks: Retrieving a target image from a gallery based on a reference image together with a modification text.
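
The retrieval step can be pictured as a nearest-neighbour search over gallery embeddings. The sketch below is a minimal pure-Python illustration, assuming the reference image and modification text have already been fused into a single query vector. The function names and toy vectors are hypothetical and are not taken from the MEDIAN code base.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_gallery(query, gallery):
    """Return gallery ids sorted from most to least similar to the query."""
    scores = {image_id: cosine(query, vec) for image_id, vec in gallery.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy composed query (hypothetical output of an image + text composition model).
query = [1.0, 0.0, 1.0]
gallery = {
    "img_a": [0.9, 0.1, 1.1],    # close to the query
    "img_b": [0.0, 1.0, 0.0],    # orthogonal to the query
    "img_c": [-1.0, 0.0, -1.0],  # opposite to the query
}
ranking = rank_gallery(query, gallery)
print(ranking)
```

In a real CIR system the first entries of the ranking are the retrieved candidates for the target image.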

3. Project Introduction

MEDIAN is designed to improve cross-modal composition in CIR by introducing adaptive intermediate-grained aggregation and target-guided semantic alignment. Instead of relying only on local and global granularity, it models local-intermediate-global feature composition to establish more precise correspondences between the reference image and the text query.
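
To make the idea of an intermediate granularity concrete, the toy sketch below pools four local token vectors into two intermediate vectors with softmax weights, alongside a single mean-pooled global vector. This is a generic weighted-pooling illustration under assumed inputs, not MEDIAN's actual aggregation module. The affinity scores are hard-coded here, whereas a real model would learn them.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(local_feats, group_scores):
    """Pool N local vectors into K intermediate vectors.

    group_scores[k][n] is an assumed affinity between intermediate
    slot k and local token n (learned in a real model).
    """
    dim = len(local_feats[0])
    pooled = []
    for scores in group_scores:
        weights = softmax(scores)  # weights over the N local tokens sum to 1
        pooled.append([
            sum(w * feat[d] for w, feat in zip(weights, local_feats))
            for d in range(dim)
        ])
    return pooled


def mean_pool(feats):
    """Average all local vectors into one global vector."""
    return [sum(col) / len(feats) for col in zip(*feats)]


local = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]]   # 4 toy local tokens
scores = [[5.0, 5.0, -5.0, -5.0], [-5.0, -5.0, 5.0, 5.0]]  # 2 intermediate slots
intermediate = aggregate(local, scores)
global_feat = mean_pool(local)
print(intermediate, global_feat)
```

Each intermediate vector summarises a group of related local tokens, so it sits between the per-token (local) and whole-image (global) views.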

4. Training Data Source

According to the project README, MEDIAN is trained and evaluated on three standard CIR datasets, and one checkpoint is provided per dataset:

CIRR
FashionIQ
Shoes

5. Hosted Weights

This repository currently includes the following checkpoint files:

CIRR.pth — MEDIAN checkpoint for CIRR
FashionIQ.pt — MEDIAN checkpoint for FashionIQ
Shoes.pt — MEDIAN checkpoint for Shoes
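
A small helper can map a dataset name to the checkpoint file listed above. The sketch below is illustrative only: the helper names and the `./checkpoints` directory are assumptions, and the internal layout of the checkpoint object (plain state dict or wrapped dictionary) is not documented here, so inspect it after loading.

```python
from pathlib import Path

# File names as listed above; note that CIRR uses .pth while the others use .pt.
CHECKPOINTS = {
    "cirr": "CIRR.pth",
    "fashioniq": "FashionIQ.pt",
    "shoes": "Shoes.pt",
}


def checkpoint_path(dataset, checkpoint_dir):
    """Return the expected checkpoint path for a dataset name (case-insensitive)."""
    key = dataset.lower()
    if key not in CHECKPOINTS:
        raise ValueError(
            f"unknown dataset {dataset!r}; expected one of {sorted(CHECKPOINTS)}"
        )
    return Path(checkpoint_dir) / CHECKPOINTS[key]


def load_checkpoint(path):
    """Load a checkpoint onto the CPU; the stored object's layout is not assumed."""
    import torch  # imported lazily so the path helper works without PyTorch

    return torch.load(path, map_location="cpu")


print(checkpoint_path("FashionIQ", "./checkpoints"))
```

Loading with `map_location="cpu"` avoids requiring a GPU just to inspect the file.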

🚀 Usage & Basic Inference

These checkpoints are intended to be used with the official MEDIAN GitHub repository.

Step 1: Prepare the Environment

Set up the environment following the project README:

git clone https://github.com/iLearn-Lab/ICASSP25-MEDIAN
cd ICASSP25-MEDIAN
conda create -n pair python=3.8.10
conda activate pair
pip install torch==2.0.0 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt

Step 2: Prepare Data and Weights

The original project README documents support for three datasets: CIRR, FashionIQ, and Shoes.

Prepare the dataset you need, place the matching checkpoint file in a checkpoint directory of your choice, and provide the dataset paths when training or evaluating.

Step 3: Training

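The project README documents the following training command. Choose one value for --dataset (the braces list the alternatives) and fill in the path of the dataset you are using: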

python3 train.py \
  --model_dir ./checkpoints/MEDIAN \
  --dataset {cirr,fashioniq,shoes} \
  --cirr_path "" \
  --fashioniq_path "" \
  --shoes_path ""
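
The command above can also be assembled programmatically, which makes the one-dataset-at-a-time choice explicit. The sketch below uses only the flags shown in the command. It assumes that passing just the path flag for the selected dataset is sufficient, which is not confirmed by the README, and `/data/cirr` is a placeholder path.

```python
import shlex

# Path flag that belongs to each dataset, taken from the documented command.
DATASET_PATH_FLAGS = {
    "cirr": "--cirr_path",
    "fashioniq": "--fashioniq_path",
    "shoes": "--shoes_path",
}


def build_train_command(dataset, dataset_path, model_dir="./checkpoints/MEDIAN"):
    """Build the argv list for train.py with exactly one dataset selected."""
    if dataset not in DATASET_PATH_FLAGS:
        raise ValueError(f"dataset must be one of {sorted(DATASET_PATH_FLAGS)}")
    return [
        "python3", "train.py",
        "--model_dir", model_dir,
        "--dataset", dataset,
        DATASET_PATH_FLAGS[dataset], dataset_path,
    ]


cmd = build_train_command("cirr", "/data/cirr")  # placeholder dataset path
print(shlex.join(cmd))
# To launch from the repository root: subprocess.run(cmd, check=True)
```

Building the command as a list avoids shell brace expansion, which would otherwise turn `{cirr,fashioniq,shoes}` into three separate arguments.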

Step 4: Testing / Evaluation

For CIRR test submission generation, the documented command is:

python src/cirr_test_submission.py model_path

Replace model_path with the path to the checkpoint file, for example:

python src/cirr_test_submission.py /path/to/CIRR.pth
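
Since the script takes the checkpoint path as a positional argument, it can help to verify that the file exists before launching. The wrapper below is a hypothetical convenience, not part of the official repository. It only builds the documented command and performs the existence check.

```python
from pathlib import Path


def build_cirr_submission_command(model_path, must_exist=True):
    """Build the argv list for src/cirr_test_submission.py."""
    path = Path(model_path)
    if must_exist and not path.is_file():
        raise FileNotFoundError(f"checkpoint not found: {path}")
    return ["python", "src/cirr_test_submission.py", str(path)]


# must_exist=False only so this demo runs without a real checkpoint on disk.
submission_cmd = build_cirr_submission_command("/path/to/CIRR.pth", must_exist=False)
print(submission_cmd)
# To launch from the repository root: subprocess.run(submission_cmd, check=True)
```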

⚠️ Limitations & Notes

These checkpoints are intended for academic research and for reproducing the MEDIAN results reported in the ICASSP 2025 paper.
Dataset preparation is required before training or evaluation, and the supported datasets documented by the project are CIRR, FashionIQ, and Shoes.
The usage commands above are adapted from the official project README. Please refer to the GitHub repository if you need the full training and evaluation workflow.

📝 Citation

If you find this work or these checkpoints useful in your research, please consider citing:

@inproceedings{MEDIAN,
  title={MEDIAN: Adaptive Intermediate-grained Aggregation Network for Composed Image Retrieval},
  author={Huang, Qinlei and Chen, Zhiwei and Li, Zixu and Wang, Chunxiao and Song, Xuemeng and Hu, Yupeng and Nie, Liqiang},
  booktitle={Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing},
  pages={1--5},
  year={2025},
  organization={IEEE}
}