| # OpenCLIP | |
| This is a fork of <a href="https://github.com/mlfoundations/open_clip">OpenCLIP</a> used to fine-tune CLIP models with PinPoint counterfactuals. Refer to the original repository for more details on open_clip. | |
| ### Installation | |
| ``` | |
| pip install open_clip_torch | |
| ``` | |
| ### Pretrained models | |
| For LAION-pretrained models, download the checkpoints and place them in `./pretrained_models/` (this can be done with the open_clip CLI). | |
| ### Sample single-process running code: | |
| To fine-tune CLIP models on CC3M: | |
| ```bash | |
| python -m open_clip_train.main \ | |
| --save-frequency 1 \ | |
| --zeroshot-frequency 1 \ | |
| --report-to tensorboard \ | |
| --train-data="..path_to_image_list.csv" \ | |
| --csv-img-key="Image_ID" \ | |
| --csv-caption-key="Caption" \ | |
| --val-data="/path/to/validation_data.csv" \ | |
| --imagenet-val="/path/to/imagenet/root/val/" \ | |
| --warmup 10000 \ | |
| --batch-size=128 \ | |
| --accum-freq=10 \ | |
| --lr=5e-06 \ | |
| --wd=0.1 \ | |
| --epochs=410 \ | |
| --workers=8 \ | |
| --pretrained_model="pretrained_models/vit_b16_laion2b.pth" \ | |
| --model ViT-B-16 | |
| ``` | |
| Note: `imagenet-val` is the path to the *validation* set of ImageNet for zero-shot evaluation, not the training set! | |
| You can remove this argument if you do not want to perform zero-shot evaluation on ImageNet throughout training. Note that the `val` folder should contain subfolders. If it does not, please use [this script](https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh). | |
| Note: `--train-data` should point to a `*.csv` file that lists the generated images, one per row, in the following format: | |
| `IMAGE_ID IMAGE_CAPTION`, with the two fields separated by a tab (`\t`). You can find the lists for our in-painted data under `./annotations`. | |
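The file-list format described above can be produced with a few lines of Python. This is a minimal sketch, not part of this repository: the helper names `write_file_list` and `read_file_list`, the example image paths, and the captions are all hypothetical. It assumes the file needs a header row whose column names match the values passed to `--csv-img-key` and `--csv-caption-key` in the sample command (`Image_ID` and `Caption`), which is how upstream open_clip reads CSV training data.

```python
import csv
from pathlib import Path


def write_file_list(rows, out_path, img_key="Image_ID", caption_key="Caption"):
    """Write (image_path, caption) pairs as a tab-separated file with a header row.

    Hypothetical helper: the column names are assumed to match the
    --csv-img-key / --csv-caption-key values used in the training command.
    """
    out_path = Path(out_path)
    with out_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        writer.writerow([img_key, caption_key])
        for image_path, caption in rows:
            # Collapse tabs and newlines inside a caption so that every row
            # keeps exactly two tab-separated fields.
            clean_caption = " ".join(str(caption).split())
            writer.writerow([str(image_path), clean_caption])
    return out_path


def read_file_list(path):
    """Read the file back as a list of dicts keyed by the header names."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f, delimiter="\t"))


if __name__ == "__main__":
    # Placeholder rows for illustration only.
    example_rows = [
        ("images/000001.jpg", "a woman riding a bicycle"),
        ("images/000002.jpg", "a man\tcooking in a kitchen"),
    ]
    path = write_file_list(example_rows, "example_image_list.csv")
    for record in read_file_list(path):
        print(record["Image_ID"], "|", record["Caption"])
```

Reading the file back before starting a long training run is a cheap way to confirm that every row has both fields and that no caption contains a stray tab.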