(CVPR 2026) ConeSep: Cone-based Robust Noise-Unlearning Compositional Network for CIR (Model Weights)

Zixu Li¹, Yupeng Hu¹✉, Zhiwei Chen¹, Mingyu Zhang¹, Zhiheng Fu¹, Liqiang Nie²

¹School of Software, Shandong University
²School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen)
✉ Corresponding author

This repository hosts the official pre-trained checkpoints for ConeSep, a robust noise-unlearning framework that leverages geometric boundary estimation and optimal transport to solve the Noisy Triplet Correspondence (NTC) problem in Composed Image Retrieval (CIR).

📌 Model Information

1. Model Name

ConeSep (Cone-based robust noisE-unlearning comPositional network) Checkpoints.

2. Task Type & Applicable Tasks

Task Type: Composed Image Retrieval (CIR).
Applicable Tasks: Retrieving target images based on a reference image and a modification text. These weights are trained to stay robust when the training triplets contain varying ratios of noise (Noisy Triplet Correspondence).

3. Project Introduction

Existing Composed Image Retrieval methods struggle with the "Noisy Triplet Correspondence (NTC)" problem, leading to Modality Suppression, Negative Anchor Deficiency, and Unlearning Backlash. ConeSep actively perceives, structurally models, and precisely "unlearns" noise through three core modules:

📐 Geometric Fidelity Quantization (GFQ): Estimates a noise boundary using cone space geometric separability to quantify sample fidelity.
🛑 Negative Boundary Learning (NBL): Learns a "diagonal negative combination" for each query as an explicit semantic opposite-anchor.
🎯 Boundary-based Targeted Unlearning (BTU): Models noise correction as an Optimal Transport (OT) problem to execute precise unlearning without backlash on clean samples.

4. Training Data Source & Hosted Weights

The models were trained on the FashionIQ and CIRR datasets across different simulated noise ratios ($N \in \{0.2, 0.5, 0.8\}$). This Hugging Face repository provides the corresponding .pt checkpoint files organized by dataset and noise ratio:

📂 fashioniq/
- ConeSep-FIQ_N0.2.pt (Trained with 20% noise)
- ConeSep-FIQ_N0.5.pt (Trained with 50% noise)
- ConeSep-FIQ_N0.8.pt (Trained with 80% noise)
📂 cirr/
- ConeSep-CIRR_N0.2.pt (Trained with 20% noise)
- ConeSep-CIRR_N0.5.pt (Trained with 50% noise)
- ConeSep-CIRR_N0.8.pt (Trained with 80% noise)
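
The naming scheme in the listing above is regular enough to generate programmatically. The following is a minimal sketch of that mapping; the helper name `checkpoint_path` is hypothetical and is not part of the ConeSep codebase.

```python
from pathlib import PurePosixPath

# Folder name -> file-name tag, copied from the listing above.
DATASETS = {"fashioniq": "FIQ", "cirr": "CIRR"}
NOISE_RATIOS = (0.2, 0.5, 0.8)


def checkpoint_path(dataset: str, noise_ratio: float) -> str:
    """Return the repository-relative path of a hosted checkpoint (hypothetical helper)."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    if noise_ratio not in NOISE_RATIOS:
        raise ValueError(f"no checkpoint for noise ratio {noise_ratio}")
    filename = f"ConeSep-{DATASETS[dataset]}_N{noise_ratio}.pt"
    return str(PurePosixPath(dataset) / filename)
```

For example, `checkpoint_path("cirr", 0.5)` evaluates to `cirr/ConeSep-CIRR_N0.5.pt`, and an unlisted noise ratio raises a `ValueError` instead of producing a path that does not exist in the repository.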

🚀 Usage & Basic Inference

These weights are designed to be evaluated out-of-the-box using the official ConeSep GitHub repository.

Step 1: Prepare the Environment

Clone the GitHub repository and set up the environment:

git clone https://github.com/iLearn-Lab/CVPR26-ConeSep ConeSep
cd ConeSep
conda create -n conesep python=3.8
conda activate conesep
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121
pip install scikit-learn==1.3.2 transformers==4.25.0 salesforce-lavis==1.0.2 timm==0.9.16

Step 2: Download Model Weights

Download the specific .pt files you need from this Hugging Face repository and place them into a checkpoints/ directory within your cloned repo. For example, to evaluate the CIRR model trained with 50% noise:

ConeSep/
└── checkpoints/
    └── cirr_noise0.5/
        └── best_model.pt  <-- (Rename the downloaded ConeSep-CIRR_N0.5.pt to best_model.pt)
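
The copy-and-rename step can also be scripted. The sketch below uses only the Python standard library and assumes the checkpoint has already been downloaded. The function name `place_checkpoint` is hypothetical, and the folder pattern `<dataset>_noise<ratio>` is an assumption generalized from the single `cirr_noise0.5` example above; the evaluation script may accept any directory name.

```python
import shutil
from pathlib import Path


def place_checkpoint(downloaded: Path, repo_root: Path,
                     dataset: str, noise_ratio: float) -> Path:
    """Copy a downloaded .pt file to checkpoints/<dataset>_noise<ratio>/best_model.pt.

    The folder pattern is an assumption based on the cirr_noise0.5 example.
    """
    target_dir = repo_root / "checkpoints" / f"{dataset}_noise{noise_ratio}"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "best_model.pt"
    # copy2 keeps the original download untouched and preserves timestamps.
    shutil.copy2(downloaded, target)
    return target
```

Calling `place_checkpoint(Path("ConeSep-CIRR_N0.5.pt"), Path("ConeSep"), "cirr", 0.5)` would reproduce the layout shown in the tree above. Copying rather than moving leaves the original file in place, at the cost of extra disk space.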

Step 3: Run Testing / Evaluation

To generate prediction files on the CIRR dataset for the CIRR Evaluation Server, run:

# Example for testing the CIRR 50% noise model
python src/cirr_test_submission.py checkpoints/cirr_noise0.5/

(The script will automatically generate the required .json files based on the checkpoint for online evaluation.)
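
Before launching a long evaluation run, it can help to confirm that the checkpoint directory is laid out as expected. The sketch below builds the same command as above after a basic check. The helper name `cirr_test_command` is hypothetical, and the assumption that the script looks for `best_model.pt` inside the given directory is inferred from the renaming instruction in Step 2, not from the script itself.

```python
import sys
from pathlib import Path


def cirr_test_command(checkpoint_dir: str) -> list:
    """Build the argv for the CIRR submission script after a basic sanity check."""
    ckpt_dir = Path(checkpoint_dir)
    # Assumption: the script expects best_model.pt inside this directory.
    if not (ckpt_dir / "best_model.pt").is_file():
        raise FileNotFoundError(f"expected best_model.pt inside {ckpt_dir}")
    # The trailing slash mirrors the command shown above.
    return [sys.executable, "src/cirr_test_submission.py", f"{ckpt_dir.as_posix()}/"]
```

From the repository root, the resulting list can be passed to `subprocess.run(cirr_test_command("checkpoints/cirr_noise0.5/"), check=True)`, which fails early with a clear message if the checkpoint was not renamed or placed correctly.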

⚠️ Limitations & Notes

Hardware Requirements: ConeSep is built upon the BLIP-2 architecture. It is highly recommended to run inference and training on GPUs with sufficient memory (e.g., NVIDIA A40 48GB or V100 32GB).
Intended Use: These weights are intended for academic research, robustness evaluation, and reproducing the results reported in the CVPR 2026 paper.

📝⭐️ Citation

If you find our framework, code, or these weights useful in your research, please consider leaving a Star ⭐️ on our GitHub repository and citing our CVPR 2026 paper:

@InProceedings{ConeSep,
    title={ConeSep: Cone-based Robust Noise-Unlearning Compositional Network for Composed Image Retrieval},
    author={Li, Zixu and Hu, Yupeng and Chen, Zhiwei and Zhang, Mingyu and Fu, Zhiheng and Nie, Liqiang},
    booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    year = {2026}
}
