---
license: mit
library_name: pytorch
tags:
- clip
- siglip
- zero-shot-image-classification
- interpretability
- concept-bottleneck
- vision-language
- cvpr2026
datasets:
- oonat/ezpc-embeddings
base_model:
- openai/clip-rn50
- openai/clip-vit-base-patch32
- openai/clip-vit-large-patch14
- google/siglip-so400m-patch14-384
pipeline_tag: zero-shot-image-classification
---
# EZPC - Pre-trained Concept Projection Matrices
This repository hosts the trained projection matrices **A** for **Explaining CLIP Zero-shot Predictions Through Concepts** (CVPR 2026).
- 📄 **Paper:** [arXiv:2603.28211](https://arxiv.org/abs/2603.28211)
- 💻 **Code:** [github.com/oonat/ezpc](https://github.com/oonat/ezpc)
- 🌐 **Project page:** [oonat.github.io/ezpc](https://oonat.github.io/ezpc)
- 🤗 **Embeddings:** [oonat/ezpc-embeddings](https://huggingface.co/datasets/oonat/ezpc-embeddings)
## Repository Layout
Each checkpoint is a single PyTorch tensor file (`best_A.pth`) inside a folder whose name encodes the training configuration:
```
checkpoints/
└── <dataset>_backbone_<backbone>_weight_<λ>_epoch_<E>_lr_<lr>_bs_<bs>/
    └── best_A.pth
```
For example:
```
checkpoints/CIFAR-100_backbone_RN50_weight_1.0_epoch_10000_lr_0.01_bs_1000000/best_A.pth
```
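When working with several checkpoints, it can be handy to recover the training configuration from the folder name. The snippet below is a minimal sketch that assumes every folder follows the template above exactly; `parse_checkpoint_path` and the dictionary keys it returns are illustrative names, not part of the EZPC codebase.

```python
import re
from pathlib import PurePosixPath

# Regex mirroring the folder-name template shown above (assumed to hold for all checkpoints).
_FOLDER_RE = re.compile(
    r"^(?P<dataset>.+)_backbone_(?P<backbone>.+)_weight_(?P<weight>[^_]+)"
    r"_epoch_(?P<epochs>\d+)_lr_(?P<lr>[^_]+)_bs_(?P<batch_size>\d+)$"
)


def parse_checkpoint_path(path: str) -> dict:
    """Extract the training configuration from a path ending in .../<folder>/best_A.pth."""
    folder = PurePosixPath(path).parent.name
    match = _FOLDER_RE.match(folder)
    if match is None:
        raise ValueError(f"Unrecognized checkpoint folder name: {folder!r}")
    cfg = match.groupdict()
    # Convert the numeric fields; dataset and backbone stay as strings.
    cfg["weight"] = float(cfg["weight"])
    cfg["epochs"] = int(cfg["epochs"])
    cfg["lr"] = float(cfg["lr"])
    cfg["batch_size"] = int(cfg["batch_size"])
    return cfg


cfg = parse_checkpoint_path(
    "checkpoints/CIFAR-100_backbone_RN50_weight_1.0_epoch_10000_lr_0.01_bs_1000000/best_A.pth"
)
print(cfg)
```

The function expects forward-slash paths, as used throughout this README.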
The tensor stored in `best_A.pth` has shape **(d, m)**, where *d* is the backbone embedding dimension and *m* is the number of concepts in the dataset's concept vocabulary.
## Quickstart
### 1. Download a checkpoint
```bash
pip install -U huggingface_hub  # a recent release is needed for the `hf` CLI

# Download all checkpoints
hf download oonat/ezpc-checkpoints \
  --local-dir . \
  --include "checkpoints/*"

# Or just one
hf download oonat/ezpc-checkpoints \
  --local-dir . \
  --include "checkpoints/CIFAR-100_backbone_RN50_weight_1.0_epoch_10000_lr_0.01_bs_1000000/*"
```
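For scripted workflows, a single checkpoint can also be fetched from Python with `huggingface_hub.hf_hub_download`. The following is a sketch rather than an official loader: `checkpoint_filename` and `download_checkpoint` are hypothetical helpers, and the default hyperparameter values are copied from the one example folder shown in this README, so other checkpoints may need different values.

```python
def checkpoint_filename(
    dataset: str,
    backbone: str,
    weight: str = "1.0",
    epochs: int = 10000,
    lr: str = "0.01",
    batch_size: int = 1000000,
) -> str:
    """Build the in-repo path of a checkpoint from its training configuration.

    Defaults are taken from the single example in this README (an assumption).
    """
    folder = (
        f"{dataset}_backbone_{backbone}_weight_{weight}"
        f"_epoch_{epochs}_lr_{lr}_bs_{batch_size}"
    )
    return f"checkpoints/{folder}/best_A.pth"


def download_checkpoint(filename: str, local_dir: str = ".") -> str:
    """Download one file from the model repo and return its local path."""
    # Imported lazily so the path helper above works without the package installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id="oonat/ezpc-checkpoints",
        filename=filename,
        local_dir=local_dir,
    )


# Example (requires network access):
# path = download_checkpoint(checkpoint_filename("CIFAR-100", "RN50"))
```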
### 2. Load and use it
The checkpoint can be loaded directly with PyTorch:
```python
import torch

A = torch.load(
    "checkpoints/CIFAR-100_backbone_RN50_weight_1.0_epoch_10000_lr_0.01_bs_1000000/best_A.pth",
    map_location="cpu",  # load onto CPU regardless of the device it was saved from
    weights_only=True,
).float()
print(A.shape)  # torch.Size([d, m])
```
```
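Given the (d, m) shape, one natural way to use **A** is to right-multiply a batch of d-dimensional embeddings by it to obtain m concept scores per input. The sketch below illustrates only that shape logic with random stand-in tensors; the plain matrix product, the tiny dimensions, and the `concept_scores` helper are assumptions made for illustration, and the actual inference procedure is the one implemented in the EZPC repository.

```python
import torch


def concept_scores(embeddings: torch.Tensor, A: torch.Tensor) -> torch.Tensor:
    """Map (n, d) embeddings to (n, m) concept scores with a (d, m) matrix."""
    if embeddings.shape[-1] != A.shape[0]:
        raise ValueError(
            f"Embedding dim {embeddings.shape[-1]} does not match A's first dim {A.shape[0]}"
        )
    return embeddings @ A


# Random stand-ins; real usage would load best_A.pth and backbone embeddings instead.
torch.manual_seed(0)
d, m = 8, 5                      # hypothetical sizes, chosen only to keep the demo small
A_demo = torch.randn(d, m)
image_emb = torch.randn(4, d)    # a batch of 4 fake image embeddings

scores = concept_scores(image_emb, A_demo)
top_vals, top_idx = scores.topk(k=3, dim=-1)  # indices of the 3 highest-scoring concepts
print(scores.shape)
print(top_idx.shape)
```

The dimension check is useful in practice because a checkpoint trained for one backbone cannot be applied to embeddings from another backbone with a different *d*.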
To run zero-shot evaluation, qualitative concept visualizations, faithfulness
analyses, or any of the experiments from the paper, clone the
[EZPC GitHub repo](https://github.com/oonat/ezpc).
Evaluation also requires the pre-computed image **and** cached text embeddings.
Download them from the [embeddings dataset](https://huggingface.co/datasets/oonat/ezpc-embeddings)
into `./data`:
```bash
hf download oonat/ezpc-embeddings --repo-type dataset --local-dir data
```
Then point `--checkpoint_path` at the downloaded `best_A.pth`:
```bash
python test.py \
  --dataset CIFAR-100 \
  --dataset_root ./data \
  --backbone RN50 \
  --checkpoint_path ./checkpoints/CIFAR-100_backbone_RN50_weight_1.0_epoch_10000_lr_0.01_bs_1000000/best_A.pth
```
## Citation
```bibtex
@InProceedings{Ozdemir_2026_CVPR,
author = {Ozdemir, Onat and Christensen, Anders and Alaniz, Stephan and Akata, Zeynep and Akbas, Emre},
title = {Explaining CLIP Zero-shot Predictions Through Concepts},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2026},
pages = {31336-31345}
}
```
## Acknowledgements
The concept vocabularies and class label mapping files were originally curated by the [Label-free Concept Bottleneck Models](https://github.com/Trustworthy-ML-Lab/Label-free-CBM) authors. We thank them for open-sourcing these resources.
## License
Released under the MIT License.
Note that these checkpoints were trained on embeddings derived from CIFAR-100, CUB-200-2011, Places365, ImageNet, and ImageNet-100. Users are responsible for complying with the original licenses and terms of use of those datasets, which may restrict commercial use. In particular, ImageNet and CUB-200-2011 are released for non-commercial research only.