Fine-Tuned OpenCLIP ViT-B-32 Checkpoints (OpenAI)
This repository contains fully fine-tuned OpenCLIP ViT-B-32 checkpoints, each
fine-tuned on a single downstream vision dataset starting from the OpenAI pretrained
weights. All models were trained with the text encoder frozen, so only the visual backbone was updated. These checkpoints serve as inputs for model merging and rebasin experiments
in the Merge-and-Rebase project.
Contents
The repository covers 20 vision datasets, each with two checkpoints. The table lists the number of training epochs per dataset:
| Dataset | Epochs |
|---|---|
| SUN397 | 14 |
| Cars | 35 |
| RESISC45 | 15 |
| EuroSAT | 12 |
| SVHN | 4 |
| GTSRB | 11 |
| MNIST | 5 |
| DTD | 76 |
| CIFAR100 | 6 |
| STL10 | 6 |
| Flowers102 | 147 |
| OxfordIIITPet | 82 |
| PCAM | 1 |
| FER2013 | 10 |
| EMNIST | 2 |
| CIFAR10 | 6 |
| Food101 | 4 |
| FashionMNIST | 5 |
| RenderedSST2 | 39 |
| KMNIST | 5 |
Each dataset folder contains:
- full_best_ep.pt: checkpoint with the best validation accuracy
- full_last_ep.pt: checkpoint from the final training epoch
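The layout above implies a simple path pattern inside the repository: the dataset folder followed by one of the two file names. A minimal sketch for building those paths is shown below. The helper `checkpoint_filename` and the `DATASETS` list are illustrative and not part of the repository, and the folder names are assumed to match the dataset names in the table exactly (only `SUN397` is confirmed by the usage example in this card).

```python
# Illustrative helper (not part of the repository): build in-repo checkpoint paths.
# Assumption: each folder is named exactly like the dataset in the table above.
DATASETS = [
    "SUN397", "Cars", "RESISC45", "EuroSAT", "SVHN", "GTSRB", "MNIST",
    "DTD", "CIFAR100", "STL10", "Flowers102", "OxfordIIITPet", "PCAM",
    "FER2013", "EMNIST", "CIFAR10", "Food101", "FashionMNIST",
    "RenderedSST2", "KMNIST",
]


def checkpoint_filename(dataset: str, which: str = "best") -> str:
    """Return '<dataset>/full_best_ep.pt' or '<dataset>/full_last_ep.pt'."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    if which not in ("best", "last"):
        raise ValueError("which must be 'best' or 'last'")
    return f"{dataset}/full_{which}_ep.pt"


if __name__ == "__main__":
    # List both checkpoint paths for every dataset.
    for name in DATASETS:
        print(checkpoint_filename(name, "best"), checkpoint_filename(name, "last"))
```

The returned string is intended to be passed as the `filename` argument of `hf_hub_download`, as in the usage example in this card.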
Hyperparameters
All models were trained with the following shared configuration:
| Hyperparameter | Value |
|---|---|
| Training strategy | Full fine-tuning (all visual backbone parameters) |
| Fine-tuning scope | Visual backbone only (text encoder frozen) |
| Optimizer | AdamW |
| Learning rate | 1e-5 |
| Weight decay | 0.1 |
| Batch size | 128 |
| LR scheduler | Cosine (decay to 0) |
| Gradient clip norm | 1.0 |
| Early stopping | Disabled |
| Seed | 42 |
| Precision | fp32 |
| Validation split | 10% of training data |
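To make the schedule concrete, here is a minimal sketch of a cosine learning-rate decay from 1e-5 to 0, together with the number of optimizer steps implied by a 90/10 train/validation split at batch size 128. It is not taken from train_vision.py. The function names, the absence of warmup, per-step rather than per-epoch decay, keeping the last partial batch, and the example dataset size of 10,000 images are all assumptions made for illustration.

```python
import math

BASE_LR = 1e-5    # learning rate from the table
BATCH_SIZE = 128  # batch size from the table
VAL_PERCENT = 10  # validation split from the table (10% of training data)


def steps_per_epoch(num_images: int) -> int:
    """Optimizer steps per epoch after holding out the validation split.

    Assumptions: the validation size is rounded down and the last
    partial batch is kept.
    """
    num_val = num_images * VAL_PERCENT // 100
    num_train = num_images - num_val
    return math.ceil(num_train / BATCH_SIZE)


def cosine_lr(step: int, total_steps: int, base_lr: float = BASE_LR) -> float:
    """Cosine decay from base_lr at step 0 to 0 at total_steps (no warmup assumed)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Hypothetical dataset size; 14 epochs is the SUN397 entry in the epochs table.
    total = 14 * steps_per_epoch(10_000)
    for step in (0, total // 2, total):
        print(step, cosine_lr(step, total))
```

With these assumptions the learning rate starts at 1e-5, passes through roughly half that value at the midpoint, and reaches exactly 0 on the final step.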
Backbone Details
| Property | Value |
|---|---|
| Model | OpenCLIP ViT-B-32 |
| Pretrained weights | openai |
| Embedding dimension | 512 |
| Number of parameters | ~151M |
Source Code
These checkpoints were produced by merge_and_rebase/finetune/train_vision.py.
The exact training configuration is in finetune/configs/vision.yaml.
Usage
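```python
from huggingface_hub import hf_hub_download
import open_clip

# Download one checkpoint file from the Hub; the call returns its local path.
checkpoint = hf_hub_download(
    repo_id="fillo-rinaldi/ViT-B-32-openai",
    filename="SUN397/full_best_ep.pt",
    repo_type="model",
)

# Build the ViT-B-32 model and its transforms, passing the downloaded checkpoint.
# The second return value (the training transform) is unused here.
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32",
    pretrained="openai",
    checkpoint_path=checkpoint,
)
```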