	---
	license: mit
	library_name: pytorch
	tags:
	- clip
	- siglip
	- zero-shot-image-classification
	- interpretability
	- concept-bottleneck
	- vision-language
	- cvpr2026
	datasets:
	- oonat/ezpc-embeddings
	base_model:
	- openai/clip-rn50
	- openai/clip-vit-base-patch32
	- openai/clip-vit-large-patch14
	- google/siglip-so400m-patch14-384
	pipeline_tag: zero-shot-image-classification
	---

	# EZPC - Pre-trained Concept Projection Matrices

	This repository hosts the trained projection matrices `A` for the paper *Explaining CLIP Zero-shot Predictions Through Concepts* (CVPR 2026).

	- 📄 Paper: [arXiv:2603.28211](https://arxiv.org/abs/2603.28211)
	- 💻 Code: [github.com/oonat/ezpc](https://github.com/oonat/ezpc)
	- 🌐 Project page: [oonat.github.io/ezpc](https://oonat.github.io/ezpc)
	- 🤗 Embeddings: [oonat/ezpc-embeddings](https://huggingface.co/datasets/oonat/ezpc-embeddings)

	## Repository Layout

	Each checkpoint is a single PyTorch tensor file (`best_A.pth`) inside a folder whose name encodes the training configuration:

	```
	checkpoints/
	└── <dataset>_backbone_<backbone>_weight_<λ>_epoch_<E>_lr_<lr>_bs_<bs>/
	    └── best_A.pth
	```

	For example:

	```
	checkpoints/CIFAR-100_backbone_RN50_weight_1.0_epoch_10000_lr_0.01_bs_1000000/best_A.pth
	```
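
The naming scheme above can be read back into its fields programmatically. The following is a minimal sketch, not part of the EZPC codebase: `RunConfig` and `parse_run_name` are hypothetical names, and the regular expression assumes every folder follows the pattern shown above exactly.

```python
import re
from typing import NamedTuple


class RunConfig(NamedTuple):
    """Training configuration encoded in a checkpoint folder name (hypothetical helper)."""
    dataset: str
    backbone: str
    weight: float
    epochs: int
    lr: float
    batch_size: int


# Mirrors <dataset>_backbone_<backbone>_weight_<λ>_epoch_<E>_lr_<lr>_bs_<bs>.
_RUN_NAME = re.compile(
    r"^(?P<dataset>.+)_backbone_(?P<backbone>.+)_weight_(?P<weight>[^_]+)"
    r"_epoch_(?P<epochs>\d+)_lr_(?P<lr>[^_]+)_bs_(?P<bs>\d+)$"
)


def parse_run_name(name: str) -> RunConfig:
    """Split a checkpoint folder name into typed fields."""
    match = _RUN_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized checkpoint folder name: {name!r}")
    return RunConfig(
        dataset=match["dataset"],
        backbone=match["backbone"],
        weight=float(match["weight"]),
        epochs=int(match["epochs"]),
        lr=float(match["lr"]),
        batch_size=int(match["bs"]),
    )


cfg = parse_run_name(
    "CIFAR-100_backbone_RN50_weight_1.0_epoch_10000_lr_0.01_bs_1000000"
)
print(cfg)
```

A helper like this can be handy for filtering the downloaded folders by dataset or backbone before loading any tensors.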

	The tensor stored in `best_A.pth` has shape `(d, m)`, where `d` is the backbone embedding dimension and `m` is the number of concepts in the dataset's concept vocabulary.

	## Quickstart

	### 1. Download a checkpoint

	```bash
	# The `hf` command ships with recent huggingface_hub releases, so upgrade if needed
	pip install -U huggingface_hub

	# Download all checkpoints
	hf download oonat/ezpc-checkpoints \
	  --local-dir . \
	  --include "checkpoints/*"

	# Or just one
	hf download oonat/ezpc-checkpoints \
	  --local-dir . \
	  --include "checkpoints/CIFAR-100_backbone_RN50_weight_1.0_epoch_10000_lr_0.01_bs_1000000/*"
	```
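
The same download can also be done from Python. This is a rough equivalent of the CLI commands above using `snapshot_download` from `huggingface_hub`; the helper names `allow_pattern` and `download_checkpoints` are made up for this sketch, and the import is deferred so the file can be loaded without the library installed.

```python
from typing import Optional


def allow_pattern(run_name: Optional[str] = None) -> str:
    """Build the glob used to select files: one run folder, or all checkpoints."""
    return f"checkpoints/{run_name}/*" if run_name else "checkpoints/*"


def download_checkpoints(run_name: Optional[str] = None, local_dir: str = ".") -> str:
    """Download checkpoints into local_dir and return the local path (needs network access)."""
    # Imported lazily so that defining this helper does not require huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="oonat/ezpc-checkpoints",
        allow_patterns=[allow_pattern(run_name)],
        local_dir=local_dir,
    )


# Example (commented out because it hits the network):
# download_checkpoints(
#     "CIFAR-100_backbone_RN50_weight_1.0_epoch_10000_lr_0.01_bs_1000000"
# )
```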

	### 2. Load and use it

	The checkpoint can be loaded directly with PyTorch:

	```python
	import torch

	A = torch.load(
	    "checkpoints/CIFAR-100_backbone_RN50_weight_1.0_epoch_10000_lr_0.01_bs_1000000/best_A.pth",
	    weights_only=True,
	).float()
	print(A.shape)  # (d, m)
	```
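
As a quick sanity check on shapes, the sketch below projects a batch of embeddings through a matrix of shape `(d, m)`. It rests on assumptions this README does not confirm: that `A` is applied by right-multiplying row-vector embeddings, and that embeddings are L2-normalized first. Random tensors stand in for real data, `concept_scores` is a hypothetical helper, and `m = 50` is an arbitrary placeholder. For the actual pipeline, refer to the EZPC code.

```python
import torch
import torch.nn.functional as F


def concept_scores(embeddings: torch.Tensor, A: torch.Tensor) -> torch.Tensor:
    """Map (n, d) embeddings to (n, m) concept scores via embeddings @ A (assumed usage)."""
    if embeddings.shape[-1] != A.shape[0]:
        raise ValueError(
            f"embedding dim {embeddings.shape[-1]} does not match A's first dim {A.shape[0]}"
        )
    return embeddings.float() @ A.float()


torch.manual_seed(0)
n, d, m = 4, 1024, 50           # d = 1024 matches CLIP RN50; m is a placeholder
A = torch.randn(d, m)           # stand-in for the tensor loaded from best_A.pth
x = F.normalize(torch.randn(n, d), dim=-1)  # stand-in for image embeddings

scores = concept_scores(x, A)
top = scores.topk(k=5, dim=-1)  # indices of the 5 highest-scoring concepts per row
print(scores.shape)             # torch.Size([4, 50])
```

The dimension check fails early if a checkpoint is paired with embeddings from a different backbone, which is an easy mistake when several runs are downloaded side by side.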

	To run zero-shot evaluation, qualitative concept visualizations, faithfulness
	analyses, or any of the experiments from the paper, clone the
	[EZPC GitHub repo](https://github.com/oonat/ezpc).

	Evaluation also requires the pre-computed image and cached text embeddings.
	Download them from the [embeddings dataset](https://huggingface.co/datasets/oonat/ezpc-embeddings)
	into `./data`:

	```bash
	hf download oonat/ezpc-embeddings --repo-type dataset --local-dir data
	```

	Then point `--checkpoint_path` at the downloaded `best_A.pth`:

	```bash
	python test.py \
	  --dataset CIFAR-100 \
	  --dataset_root ./data \
	  --backbone RN50 \
	  --checkpoint_path ./checkpoints/CIFAR-100_backbone_RN50_weight_1.0_epoch_10000_lr_0.01_bs_1000000/best_A.pth
	```

	## Citation

	```bibtex
	@InProceedings{Ozdemir_2026_CVPR,
	  author    = {Ozdemir, Onat and Christensen, Anders and Alaniz, Stephan and Akata, Zeynep and Akbas, Emre},
	  title     = {Explaining CLIP Zero-shot Predictions Through Concepts},
	  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
	  month     = {June},
	  year      = {2026},
	  pages     = {31336-31345}
	}
	```

	## Acknowledgements

	The concept vocabularies and class label mapping files were originally curated by the [Label-free Concept Bottleneck Models](https://github.com/Trustworthy-ML-Lab/Label-free-CBM) authors. We thank them for open-sourcing these resources.

	## License

	Released under the MIT License.

	Note that these checkpoints were trained on embeddings derived from CIFAR-100, CUB-200-2011, Places365, ImageNet, and ImageNet-100. Users are responsible for complying with the original licenses and terms of use of those datasets, which may restrict commercial use. In particular, ImageNet and CUB-200-2011 are released for non-commercial research only.