Matthijs/snacks
Updated • 278 • 13
How to use matteopilotto/vit-base-patch16-224-in21k-snacks with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("image-classification", model="matteopilotto/vit-base-patch16-224-in21k-snacks")
pipe("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/parrots.png") # Load model directly
from transformers import AutoImageProcessor, AutoModelForImageClassification
processor = AutoImageProcessor.from_pretrained("matteopilotto/vit-base-patch16-224-in21k-snacks")
model = AutoModelForImageClassification.from_pretrained("matteopilotto/vit-base-patch16-224-in21k-snacks")Matthijs/snacks dataset
Vision Transformer (ViT) model pre-trained on ImageNet-21k and fine-tuned on Matthijs/snacks for 5 epochs using various data augmentation transformations from torchvision.
The model achieves a 94.97% and 94.43% accuracy on the validation and test set, respectively.
The code block below shows the various transformations applied during pre-processing to augment the original dataset.
The augmented images where generated on-the-fly with the set_transform method.
from transformers import ViTFeatureExtractor
from torchvision.transforms import (
Compose,
Normalize,
Resize,
RandomResizedCrop,
RandomHorizontalFlip,
RandomAdjustSharpness,
ToTensor
)
checkpoint = 'google/vit-base-patch16-224-in21k'
feature_extractor = ViTFeatureExtractor.from_pretrained(checkpoint)
# transformations on the training set
train_aug_transforms = Compose([
RandomResizedCrop(size=feature_extractor.size),
RandomHorizontalFlip(p=0.5),
RandomAdjustSharpness(sharpness_factor=5, p=0.5),
ToTensor(),
Normalize(mean=feature_extractor.image_mean, std=feature_extractor.image_std),
])
# transformations on the validation/test set
valid_aug_transforms = Compose([
Resize(size=(feature_extractor.size, feature_extractor.size)),
ToTensor(),
Normalize(mean=feature_extractor.image_mean, std=feature_extractor.image_std),
])