☁️ Air-Know: Arbiter-Calibrated Knowledge-Internalizing Robust Network for Composed Image Retrieval

Zhiheng Fu1  Yupeng Hu1✉  Qianyun Yang1  Shiqi Zhang1  Zhiwei Chen1  Zixu Li1

1School of Software, Shandong University

These are the official pre-trained model weights and configuration files for Air-Know, a robust framework designed for Composed Image Retrieval (CIR) under Noisy Correspondence Learning (NCL) settings.

🔗 Paper: [Accepted by CVPR 2026] 🔗 GitHub Repository: ZhihFu/Air-Know 🔗 Project Website: Air-Know Webpage


📌 Model Information

1. Model Name

Air-Know (Arbiter-Calibrated Knowledge-Internalizing Robust Network) Checkpoints.

2. Task Type & Applicable Tasks

  • Task Type: Composed Image Retrieval (CIR) / Noisy Correspondence Learning / Vision-Language
  • Applicable Tasks: Robust multimodal retrieval that effectively mitigates the impact of Noisy Triplet Correspondence (NTC) in training data, while still maintaining highly competitive performance in traditional fully-supervised (0% noise) environments.

3. Project Introduction

Air-Know is built upon the BLIP-2/LAVIS framework and tackles the noisy correspondence problem in CIR through three primary modules:

  • ⚖️ External Prior Arbitration: Leverages an offline multimodal expert to generate reliable arbitration priors, bypassing the often-unreliable "small-loss hypothesis".
  • 🧠 Expert-Knowledge Internalization: Transfers these priors into a lightweight proxy network to structurally prevent the memorization of ambiguous partial matches.
  • 🔄 Dual-Stream Reconciliation: Dynamically integrates the internalized knowledge to provide robust online feedback, guiding the final representation learning.

4. Training Data Source

The model was primarily trained and evaluated on standard CIR datasets under various simulated noise ratios (e.g., 0.0, 0.2, 0.5, 0.8):

  • FashionIQ (Fashion Domain)
  • CIRR (Open Domain)
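To make the idea of a simulated noise ratio concrete, here is a minimal sketch of one common way to corrupt CIR triplets: a chosen fraction of triplets have their target image swapped with the target of another triplet. This is an illustrative assumption, not the repository's confirmed procedure; the function name `inject_triplet_noise` and the tuple layout `(reference, modification_text, target)` are hypothetical.

```python
import random


def inject_triplet_noise(triplets, noise_ratio, seed=0):
    """Corrupt roughly `noise_ratio` of the triplets by reassigning targets.

    Each triplet is a (reference, modification_text, target) tuple.
    Assumption: shuffle-style corruption, shown for illustration only;
    the official Air-Know code may simulate noise differently.
    Returns the noisy list and the sorted indices that were corrupted.
    """
    if not 0.0 <= noise_ratio <= 1.0:
        raise ValueError("noise_ratio must be in [0, 1]")

    rng = random.Random(seed)
    n = len(triplets)
    num_noisy = int(round(noise_ratio * n))

    # At least two triplets are needed to exchange targets.
    if num_noisy < 2:
        return list(triplets), []

    noisy_idx = rng.sample(range(n), num_noisy)
    targets = [triplets[i][2] for i in noisy_idx]
    rotated = targets[1:] + targets[:1]  # each index receives another triplet's target

    noisy = list(triplets)
    for i, new_target in zip(noisy_idx, rotated):
        reference, text, _ = triplets[i]
        noisy[i] = (reference, text, new_target)
    return noisy, sorted(noisy_idx)
```

With ten triplets and `noise_ratio=0.5`, five triplets end up paired with a target that belongs to a different query, while the other five stay clean.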

🚀 Usage & Basic Inference

These weights are designed to be used directly with the official Air-Know GitHub repository.

Step 1: Prepare the Environment

Clone the GitHub repository and install dependencies (evaluated on Python 3.8.10 and PyTorch 2.1.0 with CUDA 12.1+):

git clone https://github.com/ZhihFu/Air-Know
cd Air-Know
conda create -n airknow python=3.8 -y
conda activate airknow

# Install PyTorch
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121

# Install core dependencies
pip install scikit-learn==1.3.2 transformers==4.25.0 salesforce-lavis==1.0.2 timm==0.9.16

Step 2: Download Model Weights & Data

Download the checkpoint folders (e.g., cirr_noise0.8 or fashioniq_noise0.8) from this Hugging Face repository and place them in your local checkpoints/ directory.

Ensure you also download and structure the base dataset images (CIRR and FashionIQ) as specified in the GitHub repo's Data Preparation section.
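Before running inference, it can help to confirm that the checkpoint folders landed where the scripts expect them. The sketch below uses only the standard library; the helper names `list_checkpoint_folders` and `missing_checkpoints` are hypothetical and not part of the Air-Know repository, and the folder names are the two examples mentioned above.

```python
from pathlib import Path


def list_checkpoint_folders(root="checkpoints"):
    """Return the sorted names of sub-folders found under `root`."""
    root_path = Path(root)
    if not root_path.is_dir():
        return []
    return sorted(p.name for p in root_path.iterdir() if p.is_dir())


def missing_checkpoints(expected, root="checkpoints"):
    """Return the expected folder names that are not present under `root`."""
    present = set(list_checkpoint_folders(root))
    return [name for name in expected if name not in present]


if __name__ == "__main__":
    # Folder names taken from the examples in this card.
    wanted = ["cirr_noise0.8", "fashioniq_noise0.8"]
    absent = missing_checkpoints(wanted)
    if absent:
        print("Missing checkpoint folders:", ", ".join(absent))
    else:
        print("All expected checkpoint folders are present.")
```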

Step 3: Run Testing / Inference

To generate prediction files on the CIRR dataset for submission to the CIRR Evaluation Server using the downloaded checkpoint, run:

python src/cirr_test_submission.py checkpoints/cirr_noise0.8/

(The script automatically writes a .json file based on the best checkpoint in the specified folder.)

To train the model under specific noise ratios (e.g., 0.8), you can run:

python train_BLIP2.py \
    --dataset cirr \
    --cirr_path "/path/to/CIRR/" \
    --model_dir "./checkpoints/cirr_noise0.8" \
    --noise_ratio 0.8 \
    --batch_size 256 \
    --num_epochs 20 \
    --lr 2e-5
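To repeat the CIRR training run across the noise ratios listed earlier (0.0, 0.2, 0.5, 0.8), a small helper can assemble the command for each ratio. This is a sketch under stated assumptions: it reuses only the flags shown in the command above, the helper `build_train_command` is hypothetical, and the `cirr_noise<ratio>` folder naming for ratios other than 0.8 is a guess extrapolated from the single example given.

```python
import shlex

# Noise ratios mentioned in the Training Data Source section.
NOISE_RATIOS = [0.0, 0.2, 0.5, 0.8]


def build_train_command(noise_ratio, cirr_path="/path/to/CIRR/"):
    """Build the argument list for one CIRR training run.

    Only flags that appear in the documented command are used.
    Assumption: output folders follow the `cirr_noise<ratio>` pattern.
    """
    return [
        "python", "train_BLIP2.py",
        "--dataset", "cirr",
        "--cirr_path", cirr_path,
        "--model_dir", f"./checkpoints/cirr_noise{noise_ratio}",
        "--noise_ratio", str(noise_ratio),
        "--batch_size", "256",
        "--num_epochs", "20",
        "--lr", "2e-5",
    ]


if __name__ == "__main__":
    # Print the commands for review instead of launching them directly.
    for ratio in NOISE_RATIOS:
        print(shlex.join(build_train_command(ratio)))
```

Printing the commands first makes it easy to check paths before committing GPU time; each list can then be passed to `subprocess.run(cmd, check=True)` from the repository root.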

⚠️ Limitations & Notes

Disclaimer: This framework and its pre-trained weights are strictly intended for academic research purposes.

  • The model requires access to the original source datasets (CIRR, FashionIQ) for full evaluation. Users must comply with the original licenses of those respective datasets.
  • The noise_ratio parameter sets the level of simulated noise injected during training; performance under real-world, unstructured noise may vary.

📝⭐️ Citation

If you find our work or these model weights useful in your research, please consider leaving a Star ⭐️ on our GitHub repo and citing our paper:

@InProceedings{Air-Know,
    title={Air-Know: Arbiter-Calibrated Knowledge-Internalizing Robust Network for Composed Image Retrieval},
    author={Fu, Zhiheng and Hu, Yupeng and Yang, Qianyun and Zhang, Shiqi and Chen, Zhiwei and Li, Zixu},
    booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    year={2026}
}