# whis-22/bankai-image
This is a fine-tuned Vision Transformer (ViT) model for food image classification.
## Model Details
- Model type: Vision Transformer (ViT)
- License: MIT
- Finetuned from: google/vit-base-patch16-224
- Dataset: Food-101 (subset)
## Intended Uses & Limitations

This model classifies food photographs into the food categories it was fine-tuned on. Because it was trained on a subset of Food-101, it may misclassify dishes outside that subset, and it is not intended for non-food images.
## How to Use
```python
from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image
import torch

processor = AutoImageProcessor.from_pretrained("whis-22/bankai-image")
model = AutoModelForImageClassification.from_pretrained("whis-22/bankai-image")

# Load and preprocess the image
image = Image.open("path_to_your_image.jpg")
inputs = processor(images=image, return_tensors="pt")

# Get predictions (no gradients needed at inference time)
with torch.no_grad():
    outputs = model(**inputs)

logits = outputs.logits
predicted_class_idx = logits.argmax(-1).item()

# Map the predicted index to its human-readable label
print(f"Predicted class: {model.config.id2label[predicted_class_idx]}")
```
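If you want confidence scores rather than just the top class, you can apply a softmax to the logits. The sketch below shows the math on a stand-in logits vector with made-up labels; in practice you would use `outputs.logits` from the snippet above and the real `model.config.id2label` mapping.

```python
import math

def softmax(logits):
    # Subtract the max logit for numerical stability before exponentiating
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Stand-in values for illustration only; the real model has many more classes
id2label = {0: "pizza", 1: "sushi", 2: "ramen"}
logits = [2.0, 0.5, -1.0]

probs = softmax(logits)
ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
for i in ranked:
    print(f"{id2label[i]}: {probs[i]:.3f}")
```

The same ranking idea gives you a top-k list: take the first k indices of `ranked` instead of all of them.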