(CVPR 2026) ConeSep: Cone-based Robust Noise-Unlearning Compositional Network for CIR (Model Weights)
1 School of Software, Shandong University
2 School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen)
✉ Corresponding author
This repository hosts the official pre-trained checkpoints for ConeSep, a robust noise-unlearning framework that leverages geometric boundary estimation and optimal transport to solve the Noisy Triplet Correspondence (NTC) problem in Composed Image Retrieval (CIR).
Model Information
1. Model Name
ConeSep (Cone-based robust noisE-unlearning comPositional network) Checkpoints.
2. Task Type & Applicable Tasks
- Task Type: Composed Image Retrieval (CIR).
- Applicable Tasks: Retrieving target images based on a reference image and a modification text. These weights are trained to stay robust under varying ratios of noisy training triplets (Noisy Triplet Correspondence, NTC).
3. Project Introduction
Existing Composed Image Retrieval methods struggle with the "Noisy Triplet Correspondence (NTC)" problem, leading to Modality Suppression, Negative Anchor Deficiency, and Unlearning Backlash. ConeSep actively perceives, structurally models, and precisely "unlearns" noise through three core modules:
- Geometric Fidelity Quantization (GFQ): Estimates a noise boundary using cone space geometric separability to quantify sample fidelity.
- Negative Boundary Learning (NBL): Learns a "diagonal negative combination" for each query as an explicit semantic opposite-anchor.
- Boundary-based Targeted Unlearning (BTU): Models noise correction as an Optimal Transport (OT) problem to execute precise unlearning without backlash on clean samples.
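For readers unfamiliar with Optimal Transport, the following is a minimal, generic sketch of entropic-regularized OT solved with Sinkhorn iterations. It is not the BTU objective from the paper: the function name `sinkhorn`, the toy cost matrix, and the parameters `reg` and `n_iters` are illustrative assumptions, chosen only to show what a transport plan between two small distributions looks like.

```python
import math


def sinkhorn(cost, a, b, reg=0.1, n_iters=200):
    """Generic Sinkhorn iterations for entropic-regularized OT (illustrative only).

    cost: n x m list of transport costs, a: source marginal (length n),
    b: target marginal (length m). Returns an n x m transport plan.
    """
    n, m = len(a), len(b)
    # Gibbs kernel; a very small `reg` can underflow to 0.0 in plain floats.
    kernel = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Hypothetical toy problem: two sources, two targets, cheap diagonal matches.
toy_cost = [[0.0, 1.0], [1.0, 0.0]]
toy_plan = sinkhorn(toy_cost, a=[0.5, 0.5], b=[0.5, 0.5])
```

In this toy setup, most of the mass ends up on the low-cost diagonal entries while each row and column still sums to its prescribed marginal. How ConeSep defines its cost and marginals is described in the paper and the official code, not here.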
4. Training Data Source & Hosted Weights
The models were trained on the FashionIQ and CIRR datasets across different simulated noise ratios ($N \in \{0.2, 0.5, 0.8\}$). This Hugging Face repository provides the corresponding .pt checkpoint files organized by dataset and noise ratio:
- fashioniq/
  - ConeSep-FIQ_N0.2.pt (Trained with 20% noise)
  - ConeSep-FIQ_N0.5.pt (Trained with 50% noise)
  - ConeSep-FIQ_N0.8.pt (Trained with 80% noise)
- cirr/
  - ConeSep-CIRR_N0.2.pt (Trained with 20% noise)
  - ConeSep-CIRR_N0.5.pt (Trained with 50% noise)
  - ConeSep-CIRR_N0.8.pt (Trained with 80% noise)
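The naming scheme above is regular enough to build the in-repository path programmatically. The sketch below assumes the files sit in `fashioniq/` and `cirr/` subfolders exactly as listed. The helper names `checkpoint_path` and `download_checkpoint` are hypothetical, and the `repo_id` argument must be replaced with the actual ID of this Hugging Face repository, which is not stated here. `hf_hub_download` is the standard `huggingface_hub` download function.

```python
# Dataset folder -> tag used in the checkpoint filename, as listed above.
DATASET_TAGS = {"fashioniq": "FIQ", "cirr": "CIRR"}
NOISE_RATIOS = (0.2, 0.5, 0.8)


def checkpoint_path(dataset: str, noise: float) -> str:
    """Return the in-repo path of a checkpoint, e.g. 'cirr/ConeSep-CIRR_N0.5.pt'."""
    if dataset not in DATASET_TAGS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    if noise not in NOISE_RATIOS:
        raise ValueError(f"no checkpoint listed for noise ratio {noise}")
    return f"{dataset}/ConeSep-{DATASET_TAGS[dataset]}_N{noise}.pt"


def download_checkpoint(repo_id: str, dataset: str, noise: float,
                        local_dir: str = "downloads") -> str:
    """Download one checkpoint and return its local path.

    `repo_id` is a placeholder supplied by the caller (assumption: the ID of
    this model repository on the Hugging Face Hub).
    """
    # Imported lazily so the path helper works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id=repo_id,
        filename=checkpoint_path(dataset, noise),
        local_dir=local_dir,
    )
```

For example, `download_checkpoint("<this-repo-id>", "cirr", 0.5)` would fetch the CIRR checkpoint trained with 50% noise, provided the repository layout matches the list above.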
Usage & Basic Inference
These weights are designed to be evaluated out-of-the-box using the official ConeSep GitHub repository.
Step 1: Prepare the Environment
Clone the GitHub repository and set up the environment:
git clone https://github.com/iLearn-Lab/CVPR26-ConeSep ConeSep
cd ConeSep
conda create -n conesep python=3.8
conda activate conesep
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121
pip install scikit-learn==1.3.2 transformers==4.25.0 salesforce-lavis==1.0.2 timm==0.9.16
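After installation, it can help to confirm that the environment matches the pinned versions above. The following is a minimal sketch using only the standard library. The helper names `installed_version` and `find_mismatches` are hypothetical, and the check is a convenience, not part of the official repository.

```python
from importlib import metadata

# Versions pinned in the install commands above.
PINNED = {
    "torch": "2.1.0",
    "torchvision": "0.16.0",
    "torchaudio": "2.1.0",
    "scikit-learn": "1.3.2",
    "transformers": "4.25.0",
    "salesforce-lavis": "1.0.2",
    "timm": "0.9.16",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version differs from the pin to what was found."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        # CUDA wheels report e.g. "2.1.0+cu121"; compare only the part before "+".
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = found
    return mismatches
```

Calling `find_mismatches(PINNED)` inside the `conesep` environment returns an empty dictionary when every pinned package is present at the expected version.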
Step 2: Download Model Weights
Download the specific .pt files you need from this Hugging Face repository and place them into a checkpoints/ directory within your cloned repo. For example, to evaluate the CIRR model trained with 50% noise:
ConeSep/
└── checkpoints/
    └── cirr_noise0.5/
        └── best_model.pt   <-- (Rename the downloaded ConeSep-CIRR_N0.5.pt to best_model.pt)
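The copy-and-rename step can also be scripted. This is a small sketch under the layout shown above. The function name `install_checkpoint` and its arguments are hypothetical, and it copies rather than moves the file so the original download is kept.

```python
import shutil
from pathlib import Path


def install_checkpoint(downloaded, repo_root, run_name):
    """Copy a downloaded .pt file to <repo_root>/checkpoints/<run_name>/best_model.pt."""
    downloaded = Path(downloaded)
    if not downloaded.is_file():
        raise FileNotFoundError(f"checkpoint not found: {downloaded}")
    target_dir = Path(repo_root) / "checkpoints" / run_name
    # Create checkpoints/<run_name>/ if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "best_model.pt"
    shutil.copy2(downloaded, target)
    return target
```

For the example above, `install_checkpoint("ConeSep-CIRR_N0.5.pt", "ConeSep", "cirr_noise0.5")` produces `ConeSep/checkpoints/cirr_noise0.5/best_model.pt`.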
Step 3: Run Testing / Evaluation
To generate prediction files on the CIRR dataset for the CIRR Evaluation Server, run:
# Example for testing the CIRR 50% noise model
python src/cirr_test_submission.py checkpoints/cirr_noise0.5/
(The script will automatically generate the required .json files based on the checkpoint for online evaluation.)
Limitations & Notes
- Hardware Requirements: ConeSep is built upon the BLIP-2 architecture. It is highly recommended to run inference and training on GPUs with sufficient memory (e.g., NVIDIA A40 48GB or V100 32GB).
- Intended Use: These weights are intended for academic research, robustness evaluation, and reproducing the results reported in the CVPR 2026 paper.
Citation
If you find our framework, code, or these weights useful in your research, please consider leaving a Star ⭐ on our GitHub repository and citing our CVPR 2026 paper:
@InProceedings{ConeSep,
title={ConeSep: Cone-based Robust Noise-Unlearning Compositional Network for Composed Image Retrieval},
author={Li, Zixu and Hu, Yupeng and Chen, Zhiwei and Zhang, Mingyu and Fu, Zhiheng and Nie, Liqiang},
booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
year = {2026}
}