(ICASSP 2025) MEDIAN: Adaptive Intermediate-grained Aggregation Network for Composed Image Retrieval
1School of Software, Shandong University
2Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Qilu University of Technology (Shandong Academy of Sciences)
3School of Computer Science and Technology, Shandong University
4School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen)
†Corresponding author
This repository hosts the official pre-trained checkpoints for MEDIAN, a composed image retrieval framework that adaptively aggregates intermediate-grained features and performs target-guided semantic alignment to better compose reference images and modification texts.
Model Information
1. Model Name
MEDIAN (Adaptive Intermediate-grained Aggregation Network for Composed Image Retrieval).
2. Task Type & Applicable Tasks
- Task Type: Composed Image Retrieval (CIR).
- Applicable Tasks: Retrieving a target image from a gallery based on a reference image together with a modification text.
3. Project Introduction
MEDIAN is designed to improve cross-modal composition in CIR by introducing adaptive intermediate-grained aggregation and target-guided semantic alignment. Instead of relying only on local and global granularity, it models local-intermediate-global feature composition to establish more precise correspondences between the reference image and the text query.
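To make the local-intermediate-global idea concrete, here is a minimal NumPy sketch of attention-based pooling of local patch tokens into a few intermediate-grained features. This is a generic illustration, not the paper's architecture: the query vectors (random here) stand in for learned parameters, and the shapes and group count are arbitrary assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def aggregate_intermediate(local_feats, n_groups=4, seed=0):
    """Pool local token features into n_groups intermediate-grained features
    via attention weights (random queries stand in for learned ones)."""
    n_tokens, dim = local_feats.shape
    queries = np.random.default_rng(seed).standard_normal((n_groups, dim))
    attn = softmax(queries @ local_feats.T / np.sqrt(dim))   # (groups, tokens)
    return attn @ local_feats                                # (groups, dim)

# Toy example: 49 patch tokens (e.g. a 7x7 grid) of dimension 8.
local = np.random.default_rng(1).standard_normal((49, 8))
inter = aggregate_intermediate(local)          # intermediate granularity
global_feat = local.mean(axis=0)               # global granularity
composed = np.concatenate([inter.mean(axis=0), global_feat])
```

In the actual model, the composed representation would be matched against gallery image features; the snippet only shows how a middle granularity can sit between per-token and pooled-global features.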
4. Training Data Source
According to the project README, MEDIAN is evaluated on three standard CIR datasets:
- CIRR
- FashionIQ
- Shoes
5. Hosted Weights
This repository currently includes the following checkpoint files:
- `CIRR.pth`: MEDIAN checkpoint for CIRR
- `FashionIQ.pt`: MEDIAN checkpoint for FashionIQ
- `Shoes.pt`: MEDIAN checkpoint for Shoes
Usage & Basic Inference
These checkpoints are intended to be used with the official MEDIAN GitHub repository.
Step 1: Prepare the Environment
Set up the environment following the project README:
git clone https://github.com/iLearn-Lab/ICASSP25-MEDIAN
cd ICASSP25-MEDIAN
conda create -n pair python=3.8.10
conda activate pair
pip install torch==2.0.0 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
Step 2: Prepare Data and Weights
The original project README documents support for the following datasets:
- CIRR
- FashionIQ
- Shoes
Place the corresponding checkpoint file in your preferred checkpoint directory and provide the dataset paths when training or evaluating.
Step 3: Training
The project README documents the following training command:
python3 train.py \
--model_dir ./checkpoints/MEDIAN \
--dataset {cirr,fashioniq,shoes} \
--cirr_path "" \
--fashioniq_path "" \
--shoes_path ""
Step 4: Testing / Evaluation
For CIRR test submission generation, the documented command is:
python src/cirr_test_submission.py model_path
Example checkpoint path:
model_path = /path/to/CIRR.pth
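For scripting, the submission command can be wrapped in a small helper that checks the checkpoint extension before invoking the script. The validation logic is an assumption for illustration; only the command itself comes from the README.

```python
from pathlib import Path

def submission_cmd(model_path):
    """Build the documented CIRR test-submission invocation (not executed here)."""
    p = Path(model_path)
    if p.suffix not in {".pth", ".pt"}:
        raise ValueError("expected a .pth or .pt checkpoint file")
    return ["python", "src/cirr_test_submission.py", str(p)]

cmd = submission_cmd("/path/to/CIRR.pth")
```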
⚠️ Limitations & Notes
- These checkpoints are intended for academic research and for reproducing the MEDIAN results reported in the ICASSP 2025 paper.
- Dataset preparation is required before training or evaluation, and the supported datasets documented by the project are CIRR, FashionIQ, and Shoes.
- The usage commands above are adapted from the official project README. Please refer to the GitHub repository if you need the full training and evaluation workflow.
Citation
If you find this work or these checkpoints useful in your research, please consider citing:
@inproceedings{MEDIAN,
title={MEDIAN: Adaptive Intermediate-grained Aggregation Network for Composed Image Retrieval},
author={Huang, Qinlei and Chen, Zhiwei and Li, Zixu and Wang, Chunxiao and Song, Xuemeng and Hu, Yupeng and Nie, Liqiang},
booktitle={Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing},
pages={1--5},
year={2025},
organization={IEEE}
}