---
license: mit
library_name: pytorch
tags:
- clip
- siglip
- zero-shot-image-classification
- interpretability
- concept-bottleneck
- vision-language
- cvpr2026
datasets:
- oonat/ezpc-embeddings
base_model:
- openai/clip-rn50
- openai/clip-vit-base-patch32
- openai/clip-vit-large-patch14
- google/siglip-so400m-patch14-384
pipeline_tag: zero-shot-image-classification
---

# EZPC - Pre-trained Concept Projection Matrices

This repository hosts the trained projection matrices **A** for **Explaining CLIP Zero-shot Predictions Through Concepts** (CVPR 2026).

- 📄 **Paper:** [arXiv:2603.28211](https://arxiv.org/abs/2603.28211)
- 💻 **Code:** [github.com/oonat/ezpc](https://github.com/oonat/ezpc)
- 🌐 **Project page:** [oonat.github.io/ezpc](https://oonat.github.io/ezpc)
- 🤗 **Embeddings:** [oonat/ezpc-embeddings](https://huggingface.co/datasets/oonat/ezpc-embeddings)

## Repository Layout

Each checkpoint is a single PyTorch tensor file (`best_A.pth`) inside a folder whose name encodes the training configuration:

```
checkpoints/
└── <dataset>_backbone_<backbone>_weight_<λ>_epoch_<E>_lr_<lr>_bs_<bs>/
    └── best_A.pth
```

For example:

```
checkpoints/CIFAR-100_backbone_RN50_weight_1.0_epoch_10000_lr_0.01_bs_1000000/best_A.pth
```
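
Because the folder name carries the whole training configuration, it can be split back into its fields when iterating over several checkpoints. The following is a minimal sketch using only the standard library; the function name `parse_checkpoint_folder` and the regular expression are illustrative assumptions and are not taken from the EZPC code. The dictionary keys simply reuse the tokens that appear in the folder name (`weight`, `epoch`, `lr`, `bs`).

```python
import re
from pathlib import PurePosixPath

# Mirrors <dataset>_backbone_<backbone>_weight_<λ>_epoch_<E>_lr_<lr>_bs_<bs>.
# Hypothetical helper: this pattern is inferred from the naming scheme above,
# it is not part of the EZPC repository.
_FOLDER_RE = re.compile(
    r"^(?P<dataset>.+)_backbone_(?P<backbone>.+)_weight_(?P<weight>[^_]+)"
    r"_epoch_(?P<epoch>\d+)_lr_(?P<lr>[^_]+)_bs_(?P<bs>\d+)$"
)


def parse_checkpoint_folder(path: str) -> dict:
    """Split a checkpoint folder name (or a path to best_A.pth) into its fields."""
    p = PurePosixPath(path)
    # Accept either the folder itself or the best_A.pth file inside it.
    folder = p.parent.name if p.suffix == ".pth" else p.name
    match = _FOLDER_RE.match(folder)
    if match is None:
        raise ValueError(f"unrecognized checkpoint folder name: {folder!r}")
    fields = match.groupdict()
    return {
        "dataset": fields["dataset"],
        "backbone": fields["backbone"],
        "weight": float(fields["weight"]),
        "epoch": int(fields["epoch"]),
        "lr": float(fields["lr"]),
        "bs": int(fields["bs"]),
    }


config = parse_checkpoint_folder(
    "checkpoints/CIFAR-100_backbone_RN50_weight_1.0_epoch_10000_lr_0.01_bs_1000000/best_A.pth"
)
print(config)
```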

The tensor stored in `best_A.pth` has shape **(d, m)**, where *d* is the backbone embedding dimension and *m* is the number of concepts in the dataset's concept vocabulary.

## Quickstart

### 1. Download a checkpoint

```bash
# -U: the `hf` command requires a recent huggingface_hub release
pip install -U huggingface_hub

# Download all checkpoints
hf download oonat/ezpc-checkpoints \
  --local-dir . \
  --include "checkpoints/*"

# Or just one
hf download oonat/ezpc-checkpoints \
  --local-dir . \
  --include "checkpoints/CIFAR-100_backbone_RN50_weight_1.0_epoch_10000_lr_0.01_bs_1000000/*"
```
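
Once the download finishes, it can be useful to confirm which configurations are available locally. The sketch below assumes the `checkpoints/` directory sits in the current working directory, as it does with `--local-dir .`; the helper name `list_checkpoints` is a hypothetical convenience, not something shipped with EZPC.

```python
from pathlib import Path


def list_checkpoints(root: str = "checkpoints") -> list[Path]:
    """Return every best_A.pth found one level below `root`, sorted by path."""
    return sorted(Path(root).glob("*/best_A.pth"))


# Print one configuration folder per line; prints nothing if `checkpoints/` is absent.
for ckpt in list_checkpoints():
    print(ckpt.parent.name)
```

Each printed name is a configuration folder, so the list doubles as an inventory of the dataset and backbone combinations that were fetched.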

### 2. Load and use it

The checkpoint can be loaded directly with PyTorch:

```python
import torch

A = torch.load(
    "checkpoints/CIFAR-100_backbone_RN50_weight_1.0_epoch_10000_lr_0.01_bs_1000000/best_A.pth",
    map_location="cpu",  # load on CPU even if the tensor was saved from a GPU
    weights_only=True,
).float()
print(A.shape)  # torch.Size([d, m])
```

To run zero-shot evaluation, qualitative concept visualizations, faithfulness
analyses, or any of the experiments from the paper, clone the
[EZPC GitHub repo](https://github.com/oonat/ezpc).

Evaluation also requires the pre-computed image **and** cached text embeddings.
Download them from the [embeddings dataset](https://huggingface.co/datasets/oonat/ezpc-embeddings)
into `./data`:

```bash
hf download oonat/ezpc-embeddings --repo-type dataset --local-dir data
```

Then point `--checkpoint_path` at the downloaded `best_A.pth`:

```bash
python test.py \
  --dataset CIFAR-100 \
  --dataset_root ./data \
  --backbone RN50 \
  --checkpoint_path ./checkpoints/CIFAR-100_backbone_RN50_weight_1.0_epoch_10000_lr_0.01_bs_1000000/best_A.pth
```
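
When evaluating several checkpoints, the command above can be assembled from the checkpoint path instead of being typed by hand. This is a rough sketch under one loudly stated assumption: that the `<dataset>` and `<backbone>` parts of the folder name are exactly the values `--dataset` and `--backbone` expect, as they are in the CIFAR-100 / RN50 example. The helper `build_test_command` is hypothetical; only the four flags shown above are used.

```python
import shlex
from pathlib import Path


def build_test_command(checkpoint_path: str, dataset_root: str = "./data") -> list[str]:
    """Assemble the test.py invocation for one checkpoint.

    Assumption: the <dataset> and <backbone> parts of the folder name match
    the values accepted by --dataset and --backbone.
    """
    folder = Path(checkpoint_path).parent.name
    dataset, sep_backbone, rest = folder.partition("_backbone_")
    backbone, sep_weight, _ = rest.partition("_weight_")
    if not sep_backbone or not sep_weight:
        raise ValueError(f"unexpected checkpoint folder name: {folder!r}")
    return [
        "python", "test.py",
        "--dataset", dataset,
        "--dataset_root", dataset_root,
        "--backbone", backbone,
        "--checkpoint_path", checkpoint_path,
    ]


cmd = build_test_command(
    "./checkpoints/CIFAR-100_backbone_RN50_weight_1.0_epoch_10000_lr_0.01_bs_1000000/best_A.pth"
)
print(shlex.join(cmd))  # shell-ready string; pass `cmd` to subprocess.run to execute it
```

The argument list is returned rather than executed so it can be inspected first; running it only makes sense from a directory that contains `test.py`, such as a clone of the EZPC repo.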

## Citation

```bibtex
@InProceedings{Ozdemir_2026_CVPR,
    author    = {Ozdemir, Onat and Christensen, Anders and Alaniz, Stephan and Akata, Zeynep and Akbas, Emre},
    title     = {Explaining CLIP Zero-shot Predictions Through Concepts},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2026},
    pages     = {31336-31345}
}
```

## Acknowledgements

The concept vocabularies and class label mapping files were originally curated by the [Label-free Concept Bottleneck Models](https://github.com/Trustworthy-ML-Lab/Label-free-CBM) authors. We thank them for open-sourcing these resources.

## License

Released under the MIT License.

Note that these checkpoints were trained on embeddings derived from CIFAR-100, CUB-200-2011, Places365, ImageNet, and ImageNet-100. Users are responsible for complying with the original licenses and terms of use of those datasets, which may restrict commercial use. In particular, ImageNet and CUB-200-2011 are released for non-commercial research only.