---
base_model:
- laion/CLIP-ViT-bigG-14-laion2B-39B-b160k
datasets:
- ILSVRC/imagenet-1k
- mlfoundations/datacomp_small
license: mit
pipeline_tag: feature-extraction
library_name: transformers
---
[[Paper]](https://www.arxiv.org/abs/2506.03355) [[Code]](https://github.com/LIONS-EPFL/LEAF)
This model is initialized from `laion/CLIP-ViT-bigG-14-laion2B-39B-b160k`. The text encoder is fine-tuned with LEAF at $k=1$, with $\rho=50$ and semantic constraints.

To load this model, use:
```python
from transformers import CLIPProcessor, CLIPModel

model_name = "LEAF-CLIP/OpenCLIP-ViT-bigG-rho50-k1-constrained"
processor_name = "laion/CLIP-ViT-bigG-14-laion2B-39B-b160k"

# The weights come from the LEAF fine-tuned checkpoint, while the processor
# (tokenizer + image transforms) is shared with the original base model.
model = CLIPModel.from_pretrained(model_name)
processor = CLIPProcessor.from_pretrained(processor_name)
```
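
As a quick sanity check, the sketch below runs the loaded model on an example image and a pair of candidate captions, then prints the image-text similarities and extracts the embeddings. The image URL and captions are illustrative placeholders; the calls follow the standard `transformers` CLIP API and reuse `model` and `processor` from the snippet above.

```python
import torch
import requests
from PIL import Image

# Placeholder example inputs; swap in your own image and captions.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
texts = ["a photo of a cat", "a photo of a dog"]

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)

with torch.no_grad():
    # Joint forward pass: logits_per_image holds image-text similarity scores.
    outputs = model(**inputs)
    # Or extract the embeddings directly for downstream feature-extraction use.
    image_features = model.get_image_features(pixel_values=inputs["pixel_values"])
    text_features = model.get_text_features(
        input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"]
    )

print(outputs.logits_per_image.softmax(dim=-1))
```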