WindLikeo committed
Commit c7edeec · verified · 1 Parent(s): 0eeafc1

Update README.md

  1. README.md +140 -3
README.md CHANGED
@@ -1,3 +1,140 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: apache-2.0
3
+ task_categories:
4
+ - image-retrieval
5
+ tags:
6
+ - composed-image-retrieval
7
+ - pytorch
8
+ - icassp-2025
9
+ ---
10
+
11
+ <div align="center">
12
+ <h1>(ICASSP 2025) MEDIAN: Adaptive Intermediate-grained Aggregation Network for Composed Image Retrieval</h1>
13
+ <div>
14
+ <a target="_blank" href="https://windlikeo.github.io/HQL.github.io/">Qinlei Huang</a><sup>1</sup>,
15
+ <a target="_blank" href="https://zivchen-ty.github.io">Zhiwei Chen</a><sup>1</sup>,
16
+ <a target="_blank" href="https://lee-zixu.github.io">Zixu Li</a><sup>1</sup>,
17
+ Chunxiao Wang<sup>2</sup>,
18
+ <a target="_blank" href="https://xuemengsong.github.io">Xuemeng Song</a><sup>3</sup>,
19
+ <a target="_blank" href="https://faculty.sdu.edu.cn/huyupeng1/zh_CN/index.htm">Yupeng Hu</a><sup>1&#9993;</sup>,
20
+ <a target="_blank" href="https://liqiangnie.github.io/index.html">Liqiang Nie</a><sup>4</sup>
21
+ </div>
22
+ <sup>1</sup>School of Software, Shandong University<br>
23
+ <sup>2</sup>Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Qilu University of Technology (Shandong Academy of Sciences)<br>
24
+ <sup>3</sup>School of Computer Science and Technology, Shandong University<br>
25
+ <sup>4</sup>School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen)<br>
26
+ <sup>&#9993;</sup>Corresponding author
27
+ <br/>
28
+ <p>
29
+ <a href="https://ieeexplore.ieee.org/document/10890642"><img alt="Paper" src="https://img.shields.io/badge/Paper-IEEE-green.svg?style=flat-square"></a>
30
+ <a href="https://windlikeo.github.io/MEDIAN.github.io/"><img alt="Project Page" src="https://img.shields.io/badge/Website-orange"></a>
31
+ <a href="https://github.com/iLearn-Lab/ICASSP25-MEDIAN"><img alt="GitHub" src="https://img.shields.io/badge/GitHub-Repository-black?style=flat-square&logo=github"></a>
32
+ </p>
33
+ </div>
34
+
35
+ This repository hosts the official pre-trained checkpoints for **MEDIAN**, a composed image retrieval framework that adaptively aggregates intermediate-grained features and performs target-guided semantic alignment to better compose reference images and modification texts.
36
+
37
+ ---
38
+
39
+ ## 📌 Model Information
40
+
41
+ ### 1. Model Name
42
+ **MEDIAN** (Adaptive Intermediate-grained Aggregation Network for Composed Image Retrieval).
43
+
44
+ ### 2. Task Type & Applicable Tasks
45
+ - **Task Type:** Composed Image Retrieval (CIR).
46
+ - **Applicable Tasks:** Retrieving a target image from a gallery based on a reference image together with a modification text.
47
+
48
+ ### 3. Project Introduction
49
+ MEDIAN is designed to improve cross-modal composition in CIR by introducing adaptive intermediate-grained aggregation and target-guided semantic alignment. Instead of relying only on local and global granularity, it models **local-intermediate-global** feature composition to establish more precise correspondences between the reference image and the text query.
50
+
51
+ ### 4. Training Data Source
52
+ According to the project README, MEDIAN is trained and evaluated on three standard CIR datasets:
53
+
54
+ - **CIRR**
55
+ - **FashionIQ**
56
+ - **Shoes**
57
+
58
+ ### 5. Hosted Weights
59
+ This repository currently includes the following checkpoint files:
60
+
61
+ - `CIRR.pth` — MEDIAN checkpoint for CIRR
62
+ - `FashionIQ.pt` — MEDIAN checkpoint for FashionIQ
63
+ - `Shoes.pt` — MEDIAN checkpoint for Shoes
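
The mapping from dataset to checkpoint file can be sketched in Python as follows. The file names come from the list above; the helper names, the `./checkpoints` default, and the `repo_id` argument are illustrative assumptions, since this README does not state the Hub repository id. `hf_hub_download` is the standard `huggingface_hub` download call.

```python
from pathlib import Path

# Checkpoint file names as listed in this README (note the mixed .pth/.pt extensions).
CHECKPOINT_FILES = {
    "cirr": "CIRR.pth",
    "fashioniq": "FashionIQ.pt",
    "shoes": "Shoes.pt",
}


def checkpoint_filename(dataset: str) -> str:
    """Return the checkpoint file name for a dataset key (case-insensitive)."""
    key = dataset.strip().lower()
    if key not in CHECKPOINT_FILES:
        raise ValueError(
            f"unknown dataset {dataset!r}; expected one of {sorted(CHECKPOINT_FILES)}"
        )
    return CHECKPOINT_FILES[key]


def download_checkpoint(repo_id: str, dataset: str, local_dir: str = "./checkpoints") -> Path:
    """Fetch one checkpoint from the Hub.

    repo_id is hypothetical here: pass the id of the repository hosting these files.
    """
    # Imported lazily so the name lookup above works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    path = hf_hub_download(
        repo_id=repo_id,
        filename=checkpoint_filename(dataset),
        local_dir=local_dir,
    )
    return Path(path)
```

Calling `download_checkpoint(repo_id, "cirr")` with the correct repository id would place `CIRR.pth` under `./checkpoints`.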
64
+
65
+ ---
66
+
67
+ ## 🚀 Usage & Basic Inference
68
+
69
+ These checkpoints are intended to be used with the official [MEDIAN GitHub repository](https://github.com/iLearn-Lab/ICASSP25-MEDIAN).
70
+
71
+ ### Step 1: Prepare the Environment
72
+ Set up the environment following the project README:
73
+
74
+ ```bash
75
+ git clone https://github.com/iLearn-Lab/ICASSP25-MEDIAN
76
+ cd ICASSP25-MEDIAN
77
+ conda create -n pair python=3.8.10
78
+ conda activate pair
79
+ pip install torch==2.0.0 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
80
+ pip install -r requirements.txt
81
+ ```
82
+
83
+ ### Step 2: Prepare Data and Weights
84
+ The original project README documents support for the following datasets:
85
+
86
+ - `CIRR`
87
+ - `FashionIQ`
88
+ - `Shoes`
89
+
90
+ Place the corresponding checkpoint file in your preferred checkpoint directory and provide the dataset paths when training or evaluating.
91
+
92
+ ### Step 3: Training
93
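+ The project README documents the following training command. The braces in `--dataset {cirr,fashioniq,shoes}` list the allowed values: pass exactly one of them (a shell would otherwise expand the braces into three separate arguments) and fill in the path argument for that dataset: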
94
+
95
+ ```bash
96
+ python3 train.py \
97
+ --model_dir ./checkpoints/MEDIAN \
98
+ --dataset {cirr,fashioniq,shoes} \
99
+ --cirr_path "" \
100
+ --fashioniq_path "" \
101
+ --shoes_path ""
102
+ ```
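
A small Python wrapper can assemble this command for one dataset at a time, which avoids pasting the brace notation into a shell. This is a minimal sketch: the flag names are taken from the command above, while the function name and the choice to leave the other two path flags as empty strings are assumptions that mirror the documented invocation rather than verified behaviour of `train.py`.

```python
import shlex
from typing import List

# Allowed values for --dataset, as documented in the training command.
DATASETS = ("cirr", "fashioniq", "shoes")


def build_train_command(
    dataset: str,
    dataset_path: str,
    model_dir: str = "./checkpoints/MEDIAN",
) -> List[str]:
    """Build the argv list for train.py with exactly one dataset selected."""
    if dataset not in DATASETS:
        raise ValueError(f"dataset must be one of {DATASETS}, got {dataset!r}")
    cmd = ["python3", "train.py", "--model_dir", model_dir, "--dataset", dataset]
    for name in DATASETS:
        # Only the selected dataset gets a real path; the others stay empty,
        # as in the documented command (assumption: train.py ignores them).
        cmd += [f"--{name}_path", dataset_path if name == dataset else ""]
    return cmd


if __name__ == "__main__":
    # Print a copy-pasteable command line; run it from the repository root.
    print(shlex.join(build_train_command("cirr", "/path/to/CIRR")))
```

The returned list can also be passed directly to `subprocess.run(..., check=True)` from the repository root.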
103
+
104
+ ### Step 4: Testing / Evaluation
105
+ To generate a CIRR test submission, the documented command is the following, where `model_path` is a placeholder for the path to the checkpoint:
106
+
107
+ ```bash
108
+ python src/cirr_test_submission.py model_path
109
+ ```
110
+
111
+ Example checkpoint path:
112
+
113
+ ```text
114
+ model_path = /path/to/CIRR.pth
115
+ ```
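
The same step can be driven from Python with a check that the checkpoint exists before the script starts. This is a hedged sketch: the script path and its single positional argument come from the command above, while the function names and the `repo_root` parameter are hypothetical.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def build_submission_command(model_path: str) -> List[str]:
    """Return the argv list for the documented CIRR submission script."""
    return [sys.executable, "src/cirr_test_submission.py", model_path]


def run_submission(model_path: str, repo_root: str = ".") -> None:
    """Run the submission script from repo_root after validating the checkpoint."""
    # Resolve to an absolute path so it stays valid when cwd changes to repo_root.
    ckpt = Path(model_path).expanduser().resolve()
    if not ckpt.is_file():
        raise FileNotFoundError(f"checkpoint not found: {ckpt}")
    subprocess.run(build_submission_command(str(ckpt)), cwd=repo_root, check=True)
```

For example, `run_submission("/path/to/CIRR.pth", repo_root="ICASSP25-MEDIAN")` would run the script inside the cloned repository.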
116
+
117
+ ---
118
+
119
+ ## ⚠️ Limitations & Notes
120
+
121
+ - These checkpoints are intended for **academic research** and for reproducing the MEDIAN results reported in the ICASSP 2025 paper.
122
+ - Dataset preparation is required before training or evaluation, and the supported datasets documented by the project are **CIRR**, **FashionIQ**, and **Shoes**.
123
+ - The usage commands above are adapted from the official project README. Please refer to the GitHub repository if you need the full training and evaluation workflow.
124
+
125
+ ---
126
+
127
+ ## 📝 Citation
128
+
129
+ If you find this work or these checkpoints useful in your research, please consider citing:
130
+
131
+ ```bibtex
132
+ @inproceedings{MEDIAN,
133
+ title={MEDIAN: Adaptive Intermediate-grained Aggregation Network for Composed Image Retrieval},
134
+ author={Huang, Qinlei and Chen, Zhiwei and Li, Zixu and Wang, Chunxiao and Song, Xuemeng and Hu, Yupeng and Nie, Liqiang},
135
+ booktitle={Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing},
136
+ pages={1--5},
137
+ year={2025},
138
+ organization={IEEE}
139
+ }
140
+ ```