Air-Know: Arbiter-Calibrated Knowledge-Internalizing Robust Network for Composed Image Retrieval
Zhiheng Fu1 Yupeng Hu1 Qianyun Yang1 Shiqi Zhang1 Zhiwei Chen1 Zixu Li1
1School of Software, Shandong University
These are the official pre-trained model weights and configuration files for Air-Know, a robust framework designed for Composed Image Retrieval (CIR) under Noisy Correspondence Learning (NCL) settings.
- Paper: Accepted by CVPR 2026
- GitHub Repository: ZhihFu/Air-Know
- Project Website: Air-Know Webpage
Model Information
1. Model Name
Air-Know (Arbiter-Calibrated Knowledge-Internalizing Robust Network) Checkpoints.
2. Task Type & Applicable Tasks
- Task Type: Composed Image Retrieval (CIR) / Noisy Correspondence Learning / Vision-Language
- Applicable Tasks: Robust multimodal retrieval that mitigates the impact of Noisy Triplet Correspondence (NTC) in training data while remaining highly competitive in conventional fully supervised (0% noise) settings.
3. Project Introduction
Air-Know is built upon the BLIP-2/LAVIS framework and tackles the noisy correspondence problem in CIR through three primary modules:
- External Prior Arbitration: Leverages an offline multimodal expert to generate reliable arbitration priors, bypassing the often-unreliable "small-loss hypothesis".
- Expert-Knowledge Internalization: Transfers these priors into a lightweight proxy network to structurally prevent the memorization of ambiguous partial matches.
- Dual-Stream Reconciliation: Dynamically integrates the internalized knowledge to provide robust online feedback, guiding the final representation learning.
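As a rough illustration of how the three modules could interact at the loss level (the actual Air-Know objective is defined in the repository; the function name `arbiter_weighted_loss`, the mixing weight `alpha`, and the confidence inputs below are hypothetical), a per-triplet retrieval loss can be down-weighted by a confidence that reconciles the offline arbitration prior with the proxy network's online estimate:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of similarity scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def arbiter_weighted_loss(sim_rows, targets, priors, proxy_scores, alpha=0.5):
    """Per-triplet cross-entropy, down-weighted by a reconciled confidence.

    sim_rows[i]     : similarities of query i to every candidate target
    targets[i]      : index of the annotated target for query i
    priors[i]       : offline arbitration prior in [0, 1] (external expert)
    proxy_scores[i] : online confidence from the internalized proxy network
    alpha           : mixing weight between the two confidence streams
    """
    total = 0.0
    for row, t, p, q in zip(sim_rows, targets, priors, proxy_scores):
        w = alpha * p + (1 - alpha) * q        # dual-stream reconciliation
        ce = -math.log(softmax(row)[t] + 1e-12)
        total += w * ce                        # suspected-noisy triplets contribute less
    return total / len(sim_rows)
```

A triplet both streams flag as noisy (confidence near 0) is effectively excluded from the gradient, while clean triplets train at full weight.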
4. Training Data Source
The model was primarily trained and evaluated on standard CIR datasets under various simulated noise ratios (e.g., 0.0, 0.2, 0.5, 0.8):
- FashionIQ (Fashion Domain)
- CIRR (Open Domain)
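Simulated noise of this kind is typically injected by re-pairing a fraction of the training triplets. The exact corruption protocol is defined in the GitHub repo; the sketch below (function name and permutation scheme are illustrative assumptions) only shows the general idea behind a `noise_ratio` of, say, 0.5:

```python
import random

def inject_triplet_noise(triplets, noise_ratio, seed=0):
    """Simulate Noisy Triplet Correspondence (NTC): permute the target images
    of a `noise_ratio` fraction of (reference, text, target) triplets."""
    rng = random.Random(seed)
    out = list(triplets)
    picked = rng.sample(range(len(out)), int(len(out) * noise_ratio))
    shuffled = picked[:]
    rng.shuffle(shuffled)
    new_targets = [out[j][2] for j in shuffled]  # snapshot before overwriting
    for i, tgt in zip(picked, new_targets):
        ref, text, _ = out[i]
        out[i] = (ref, text, tgt)                # reference and text stay intact
    return out
```

Only the target side is corrupted, so the marginal distribution of targets is unchanged and the noise is purely a correspondence error.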
Usage & Basic Inference
These weights are designed to be used directly with the official Air-Know GitHub repository.
Step 1: Prepare the Environment
Clone the GitHub repository and install the dependencies (evaluated with Python 3.8.10 and PyTorch 2.1.0 on CUDA 12.1+):

```shell
git clone https://github.com/ZhihFu/Air-Know
cd Air-Know
conda create -n airknow python=3.8 -y
conda activate airknow

# Install PyTorch (CUDA 12.1 wheels)
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121

# Install core dependencies
pip install scikit-learn==1.3.2 transformers==4.25.0 salesforce-lavis==1.0.2 timm==0.9.16
```
Step 2: Download Model Weights & Data
Download the checkpoint folders (e.g., cirr_noise0.8 or fashioniq_noise0.8) from this Hugging Face repository and place them in your local checkpoints/ directory.
Ensure you also download and structure the base dataset images (CIRR and FashionIQ) as specified in the GitHub repo's Data Preparation section.
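If you prefer scripting the checkpoint download, the `huggingface_hub` client can fetch a single checkpoint folder. Note that `REPO_ID` below is a placeholder; substitute this model card's actual repository id:

```python
from pathlib import Path

REPO_ID = "your-namespace/Air-Know"  # placeholder -- replace with the real repo id

def download_checkpoint(folder: str = "cirr_noise0.8",
                        local_dir: str = "checkpoints") -> Path:
    """Download one checkpoint folder (e.g. cirr_noise0.8) into ./checkpoints/."""
    from huggingface_hub import snapshot_download
    snapshot_download(repo_id=REPO_ID,
                      allow_patterns=[f"{folder}/*"],  # fetch only this folder
                      local_dir=local_dir)
    return Path(local_dir) / folder

if __name__ == "__main__":
    print(download_checkpoint("cirr_noise0.8"))
```

`allow_patterns` restricts the download to one noise setting, which avoids pulling every checkpoint in the repository.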
Step 3: Run Testing / Inference
To generate prediction files on the CIRR dataset for submission to the CIRR Evaluation Server using a downloaded checkpoint, run:

```shell
python src/cirr_test_submission.py checkpoints/cirr_noise0.8/
```

The script automatically writes a `.json` file based on the best checkpoint in the specified folder.
To train the model under a specific noise ratio (e.g., 0.8), run:

```shell
python train_BLIP2.py \
    --dataset cirr \
    --cirr_path "/path/to/CIRR/" \
    --model_dir "./checkpoints/cirr_noise0.8" \
    --noise_ratio 0.8 \
    --batch_size 256 \
    --num_epochs 20 \
    --lr 2e-5
```
Limitations & Notes
Disclaimer: This framework and its pre-trained weights are strictly intended for academic research purposes.
- The model requires access to the original source datasets (CIRR, FashionIQ) for full evaluation. Users must comply with the original licenses of those respective datasets.
- The `noise_ratio` parameter controls simulated interference during training; performance in wild, unstructured noisy environments may vary.
Citation
If you find our work or these model weights useful in your research, please consider leaving a Star on our GitHub repo and citing our paper:
```bibtex
@InProceedings{Air-Know,
  title     = {Air-Know: Arbiter-Calibrated Knowledge-Internalizing Robust Network for Composed Image Retrieval},
  author    = {Fu, Zhiheng and Hu, Yupeng and Yang, Qianyun and Zhang, Shiqi and Chen, Zhiwei and Li, Zixu},
  booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
  year      = {2026}
}
```