---
license: mit
library_name: pytorch
tags:
  - clip
  - siglip
  - zero-shot-image-classification
  - interpretability
  - concept-bottleneck
  - vision-language
  - cvpr2026
datasets:
  - oonat/ezpc-embeddings
base_model:
  - openai/clip-rn50
  - openai/clip-vit-base-patch32
  - openai/clip-vit-large-patch14
  - google/siglip-so400m-patch14-384
pipeline_tag: zero-shot-image-classification
---

# EZPC - Pre-trained Concept Projection Matrices

This repository hosts the trained projection matrices **A** for **Explaining CLIP Zero-shot Predictions Through Concepts** (CVPR 2026).

- 📄 **Paper:** [arXiv:2603.28211](https://arxiv.org/abs/2603.28211)
- 💻 **Code:** [github.com/oonat/ezpc](https://github.com/oonat/ezpc)
- 🌐 **Project page:** [oonat.github.io/ezpc](https://oonat.github.io/ezpc)
- 🤗 **Embeddings:** [oonat/ezpc-embeddings](https://huggingface.co/datasets/oonat/ezpc-embeddings)

## Repository Layout

Each checkpoint is a single PyTorch tensor file (`best_A.pth`) inside a folder whose name encodes the training configuration:

```
checkpoints/
└── <dataset>_backbone_<backbone>_weight_<λ>_epoch_<E>_lr_<lr>_bs_<bs>/
    └── best_A.pth
```

For example:

```
checkpoints/CIFAR-100_backbone_RN50_weight_1.0_epoch_10000_lr_0.01_bs_1000000/best_A.pth
```
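
Because the folder name carries the whole training configuration, it can be handy to parse it back into typed fields when iterating over many checkpoints. The snippet below is a minimal sketch, not part of the EZPC codebase: the `CheckpointConfig` class, its field names, and `parse_checkpoint_dir` are hypothetical, and the regular expression assumes that folder names always follow the template shown above.

```python
import re
from dataclasses import dataclass

# Mirrors <dataset>_backbone_<backbone>_weight_<λ>_epoch_<E>_lr_<lr>_bs_<bs>.
# Assumption: every checkpoint folder follows this template exactly.
_PATTERN = re.compile(
    r"(?P<dataset>.+?)_backbone_(?P<backbone>.+?)_weight_(?P<weight>[^_]+)"
    r"_epoch_(?P<epochs>\d+)_lr_(?P<lr>[^_]+)_bs_(?P<batch_size>\d+)"
)


@dataclass(frozen=True)
class CheckpointConfig:
    """Hypothetical container for the values encoded in a folder name."""

    dataset: str
    backbone: str
    weight: float  # the λ value from the folder name
    epochs: int
    lr: float
    batch_size: int


def parse_checkpoint_dir(name: str) -> CheckpointConfig:
    """Split a checkpoint folder name into its training configuration."""
    match = _PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"Unexpected checkpoint folder name: {name!r}")
    fields = match.groupdict()
    return CheckpointConfig(
        dataset=fields["dataset"],
        backbone=fields["backbone"],
        weight=float(fields["weight"]),
        epochs=int(fields["epochs"]),
        lr=float(fields["lr"]),
        batch_size=int(fields["batch_size"]),
    )


cfg = parse_checkpoint_dir(
    "CIFAR-100_backbone_RN50_weight_1.0_epoch_10000_lr_0.01_bs_1000000"
)
print(cfg.dataset, cfg.backbone, cfg.weight, cfg.epochs, cfg.lr, cfg.batch_size)
```

Anchoring on the literal `_backbone_`, `_weight_`, and similar separators keeps dataset names that contain hyphens (such as `CUB-200-2011`) intact.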

The tensor stored in `best_A.pth` has shape **(d, m)**, where *d* is the backbone embedding dimension and *m* is the number of concepts in the dataset's concept vocabulary.

## Quickstart

### 1. Download a checkpoint

```bash
# The `hf` CLI ships with recent huggingface_hub releases, so upgrade if needed
pip install -U huggingface_hub

# Download all checkpoints
hf download oonat/ezpc-checkpoints \
    --local-dir . \
    --include "checkpoints/*"

# Or just one
hf download oonat/ezpc-checkpoints \
    --local-dir . \
    --include "checkpoints/CIFAR-100_backbone_RN50_weight_1.0_epoch_10000_lr_0.01_bs_1000000/*"
```
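
The same download can also be scripted from Python. This is a rough equivalent of the CLI calls above, built on `snapshot_download` from `huggingface_hub`; the helper names `checkpoint_pattern` and `download_checkpoints` are made up for illustration, and the glob assumes the folder naming scheme described under Repository Layout.

```python
def checkpoint_pattern(dataset: str, backbone: str) -> str:
    """Build a glob that matches every file of one dataset/backbone pair."""
    return f"checkpoints/{dataset}_backbone_{backbone}_*/*"


def download_checkpoints(pattern: str = "checkpoints/*", local_dir: str = ".") -> str:
    """Fetch the files matching `pattern` and return the local folder path."""
    # Imported lazily so the pattern helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="oonat/ezpc-checkpoints",
        local_dir=local_dir,
        allow_patterns=[pattern],
    )


pattern = checkpoint_pattern("CIFAR-100", "RN50")
print(pattern)
# Uncomment to download (requires network access):
# download_checkpoints(pattern)
```

Calling `download_checkpoints()` with no arguments fetches every checkpoint, matching the first CLI command.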

### 2. Load and use it

The checkpoint can be loaded directly with PyTorch:

```python
import torch

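# map_location="cpu" lets the tensor load on machines without a GPU
A = torch.load(
    "checkpoints/CIFAR-100_backbone_RN50_weight_1.0_epoch_10000_lr_0.01_bs_1000000/best_A.pth",
    map_location="cpu",
    weights_only=True,
).float()
print(A.shape)  # torch.Size([d, m])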
```

To run zero-shot evaluation, qualitative concept visualizations, faithfulness
analyses, or any of the experiments from the paper, clone the
[EZPC GitHub repo](https://github.com/oonat/ezpc).

Evaluation also requires the pre-computed image **and** cached text embeddings.
Download them from the [embeddings dataset](https://huggingface.co/datasets/oonat/ezpc-embeddings)
into `./data`:

```bash
hf download oonat/ezpc-embeddings --repo-type dataset --local-dir data
```

Then point `--checkpoint_path` at the downloaded `best_A.pth`:

```bash
python test.py \
    --dataset CIFAR-100 \
    --dataset_root ./data \
    --backbone RN50 \
    --checkpoint_path ./checkpoints/CIFAR-100_backbone_RN50_weight_1.0_epoch_10000_lr_0.01_bs_1000000/best_A.pth
```
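
When several checkpoints have been downloaded, the evaluation command can be generated for each of them instead of typed by hand. The helper below is only a sketch: `build_eval_commands` is a hypothetical function, and it assumes that the `<dataset>` and `<backbone>` parts of each folder name are valid values for `--dataset` and `--backbone`, as they are in the CIFAR-100 / RN50 example above.

```python
import shlex
from pathlib import Path


def build_eval_commands(checkpoint_root="checkpoints", dataset_root="./data"):
    """Return one `python test.py ...` command per downloaded best_A.pth."""
    root = Path(checkpoint_root)
    commands = []
    if not root.is_dir():
        return commands
    for ckpt in sorted(root.glob("*/best_A.pth")):
        name = ckpt.parent.name
        # Skip folders that do not follow the documented naming template.
        if "_backbone_" not in name or "_weight_" not in name:
            continue
        dataset, rest = name.split("_backbone_", 1)
        backbone = rest.split("_weight_", 1)[0]
        args = [
            "python", "test.py",
            "--dataset", dataset,
            "--dataset_root", str(dataset_root),
            "--backbone", backbone,
            "--checkpoint_path", str(ckpt),
        ]
        commands.append(shlex.join(args))
    return commands


# Print the commands so they can be reviewed before running them.
for command in build_eval_commands():
    print(command)
```

Run it from the directory that contains `checkpoints/`; it prints nothing if no checkpoints are found.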

## Citation

```bibtex
@InProceedings{Ozdemir_2026_CVPR,
    author    = {Ozdemir, Onat and Christensen, Anders and Alaniz, Stephan and Akata, Zeynep and Akbas, Emre},
    title     = {Explaining CLIP Zero-shot Predictions Through Concepts},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2026},
    pages     = {31336-31345}
}
```

## Acknowledgements

The concept vocabularies and class label mapping files were originally curated by the [Label-free Concept Bottleneck Models](https://github.com/Trustworthy-ML-Lab/Label-free-CBM) authors. We thank them for open-sourcing these resources.

## License

Released under the MIT License.

Note that these checkpoints were trained on embeddings derived from CIFAR-100, CUB-200-2011, Places365, ImageNet, and ImageNet-100. Users are responsible for complying with the original licenses and terms of use of those datasets, some of which restrict commercial use. In particular, ImageNet and CUB-200-2011 are released for non-commercial research only.