Image Complexity-Aware Adaptive Retrieval for Efficient Vision-Language Models
This repository provides the ICC (Image Complexity Classifier) weights used in "Image Complexity-Aware Adaptive Retrieval for Efficient Vision-Language Models".
Paper: https://arxiv.org/abs/2512.15372
Code: https://github.com/MikelWL/ICAR
The ICC is a ConvNeXt V2 classifier fine-tuned from an ImageNet-22K pretrained checkpoint. It is used to route images between early-exit and full-path inference in ICAR.
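The routing idea can be illustrated with a minimal sketch. It is not the ICAR implementation: it assumes the ICC yields a complexity score in [0, 1] and that low-complexity images take the early exit while the rest take the full path. The names `route` and `early_exit_fraction` and the 0.5 threshold are hypothetical; the actual decision rule is defined in the paper and the ICAR code.

```python
from dataclasses import dataclass

# Hypothetical labels, for illustration only. The real ICC output format
# and routing rule live in the ICAR repository.
EARLY_EXIT = "early_exit"
FULL_PATH = "full_path"


@dataclass
class RoutingDecision:
    path: str
    complexity: float


def route(complexity: float, threshold: float = 0.5) -> RoutingDecision:
    """Send low-complexity images to the early exit, the rest to the full path.

    Assumes `complexity` is a score in [0, 1]; the threshold is a placeholder.
    """
    if not 0.0 <= complexity <= 1.0:
        raise ValueError("complexity is assumed to lie in [0, 1]")
    path = FULL_PATH if complexity >= threshold else EARLY_EXIT
    return RoutingDecision(path=path, complexity=complexity)


def early_exit_fraction(scores, threshold: float = 0.5) -> float:
    """Fraction of images that would skip the full path at this threshold."""
    decisions = [route(score, threshold) for score in scores]
    return sum(d.path == EARLY_EXIT for d in decisions) / len(decisions)
```

Under these assumptions, lowering the threshold sends more images through the full path, trading compute for the deeper representation.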
This repository ships a single file at the repo root:
icc.pt
Point ICAR to the ICC weights with --icc-checkpoint:
python scripts/evaluate_mixed_preprocessed.py \
  --config icar/configs/coco.yaml \
  --checkpoint checkpoints/icar_coco/layer_12/latest_checkpoint.pt \
  --base-dataset mscoco \
  --base-data-root /path/to/coco-images \
  --laion-data-root /path/to/laion_coco_100k \
  --complexity-scores /path/to/laion_coco_100k_metadata/complexity_scores.json \
  --early-exit-layer 12 \
  --use-icc-routing \
  --icc-checkpoint /path/to/icc.pt
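If you would rather drive the evaluation from Python, the same invocation can be assembled as an argument list. This is only a sketch: `build_eval_command` is a hypothetical helper, not part of ICAR, and it uses only the flags and placeholder paths from the command above.

```python
import shlex


def build_eval_command(options):
    """Turn {flag: value} pairs into an argv list.

    A value of None marks a bare switch such as --use-icc-routing.
    Hypothetical helper for illustration; not part of the ICAR code.
    """
    argv = ["python", "scripts/evaluate_mixed_preprocessed.py"]
    for flag, value in options.items():
        argv.append(flag)
        if value is not None:
            argv.append(str(value))
    return argv


# Same flags and placeholder paths as the shell command above.
options = {
    "--config": "icar/configs/coco.yaml",
    "--checkpoint": "checkpoints/icar_coco/layer_12/latest_checkpoint.pt",
    "--base-dataset": "mscoco",
    "--base-data-root": "/path/to/coco-images",
    "--laion-data-root": "/path/to/laion_coco_100k",
    "--complexity-scores": "/path/to/laion_coco_100k_metadata/complexity_scores.json",
    "--early-exit-layer": 12,
    "--use-icc-routing": None,
    "--icc-checkpoint": "/path/to/icc.pt",
}

argv = build_eval_command(options)
print(shlex.join(argv))  # shell-quoted form, handy for logging
```

To launch it, pass `argv` to `subprocess.run(argv, check=True)` from the root of an ICAR checkout, after replacing the placeholder paths with real ones.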
The ICC weights are derived from ImageNet-pretrained ConvNeXt V2 models, which are licensed under CC-BY-NC. Please use these weights for non-commercial research purposes and provide appropriate attribution.