---
license: apache-2.0
tags:
- vit
- image-classification
- beans
- transfer-learning
---

# ViT Beans Model

This model is a Vision Transformer (ViT) fine-tuned via transfer learning on the ["beans"](https://huggingface.co/datasets/beans) dataset from the Hugging Face Datasets Hub.
It classifies images of bean plant leaves into three categories:

- `LABEL_0`: angular_leaf_spot  
- `LABEL_1`: bean_rust  
- `LABEL_2`: healthy
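
Because the model reports generic `LABEL_n` names, a small lookup can turn predictions into readable class names. The following is a minimal sketch based only on the list above; the names `ID2LABEL` and `readable_label` are illustrative and not part of the model repository.

```python
# Mapping copied from the label list in this card.
ID2LABEL = {0: "angular_leaf_spot", 1: "bean_rust", 2: "healthy"}


def readable_label(raw_label: str) -> str:
    """Convert a generic label such as "LABEL_1" into its class name."""
    index = int(raw_label.rsplit("_", 1)[1])
    return ID2LABEL[index]


print(readable_label("LABEL_1"))  # bean_rust
```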

## Model architecture

The base model is `google/vit-base-patch16-224`.

## Training

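Transfer learning was used: the base checkpoint was pre-trained on ImageNet-21k and fine-tuned on ImageNet-1k at 224x224 resolution, and it was then fine-tuned on the beans dataset with a new three-class classification head.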
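
The training script is not included in this card, so the following is only a sketch of how such a setup could look. It assumes the 1000-class ImageNet head is swapped for a three-class head and that the encoder is frozen; whether the backbone was actually frozen for this model is not stated, and the helper names are hypothetical.

```python
ID2LABEL = {0: "angular_leaf_spot", 1: "bean_rust", 2: "healthy"}


def freeze_backbone(model):
    """Freeze the ViT encoder so that only the classifier head is updated.

    Assumption: the card does not say whether the backbone was frozen.
    """
    for param in model.vit.parameters():
        param.requires_grad = False
    return model


def load_for_beans(checkpoint: str = "google/vit-base-patch16-224"):
    """Load the base checkpoint with a fresh three-class head (hypothetical helper)."""
    # Imported lazily so freeze_backbone can be reused without transformers installed.
    from transformers import ViTForImageClassification

    model = ViTForImageClassification.from_pretrained(
        checkpoint,
        num_labels=len(ID2LABEL),
        id2label=ID2LABEL,
        label2id={name: idx for idx, name in ID2LABEL.items()},
        # The checkpoint ships a 1000-class head, so its weights are skipped.
        ignore_mismatched_sizes=True,
    )
    return freeze_backbone(model)
```

Calling `load_for_beans()` downloads the base checkpoint; the returned model can then be passed to a standard training loop or to `Trainer`.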

## Evaluation

This model was compared against zero-shot classification with CLIP (`openai/clip-vit-base-patch32`).
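
For readers unfamiliar with zero-shot classification, here is a rough sketch using the `zero-shot-image-classification` pipeline from `transformers`. The prompts and candidate labels used for the reported comparison are not given in this card; the bean class names and the helper names below are assumptions for illustration.

```python
CLASS_NAMES = ["angular_leaf_spot", "bean_rust", "healthy"]


def to_candidate_labels(class_names):
    """Turn snake_case class names into plain-text labels for CLIP."""
    return [name.replace("_", " ") for name in class_names]


def zero_shot_predict(image_path: str) -> str:
    """Return the class name CLIP scores highest for one image (hypothetical helper)."""
    # Imported lazily so the label helper works without transformers installed.
    from transformers import pipeline

    classifier = pipeline(
        "zero-shot-image-classification",
        model="openai/clip-vit-base-patch32",
    )
    candidates = to_candidate_labels(CLASS_NAMES)
    # The pipeline accepts a local path or URL and returns one
    # {"label": ..., "score": ...} dict per candidate label.
    results = classifier(image_path, candidate_labels=candidates)
    best = max(results, key=lambda r: r["score"])
    return CLASS_NAMES[candidates.index(best["label"])]
```

No task-specific training is involved here: CLIP only compares the image embedding with the text embeddings of the candidate labels.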

### Zero-shot results on Oxford Pets

- **Accuracy**: 0.9993189573287964
- **Precision**: 0.5794700118713081
- **Recall**: 0.10156987264053896
- **Model used**: `openai/clip-vit-base-patch32`
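
The card does not state how precision and recall were averaged across classes. As an illustration only, the sketch below computes accuracy together with macro-averaged precision and recall in plain Python; the macro averaging and the function name are assumptions, not a description of how the numbers above were produced.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision and recall."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    labels = sorted(set(y_true) | set(y_pred))
    # A class with no predictions (or no samples) contributes 0.0.
    precisions = [tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0 for c in labels]
    recalls = [tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0 for c in labels]

    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / len(labels),
        "recall": sum(recalls) / len(labels),
    }


metrics = classification_metrics([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0])
print(metrics)
```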

## Example

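```python
from transformers import ViTImageProcessor, ViTForImageClassification
from PIL import Image
import torch

# Load the preprocessing config and fine-tuned weights from the Hub.
# ViTImageProcessor replaces the deprecated ViTFeatureExtractor.
processor = ViTImageProcessor.from_pretrained("LindiSimon/vit-beans-model")
model = ViTForImageClassification.from_pretrained("LindiSimon/vit-beans-model")

# ViT expects 3-channel input; PNG files may carry an alpha channel.
image = Image.open("example_input.png").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

# Inference only, so gradient tracking is disabled.
with torch.no_grad():
    logits = model(**inputs).logits

# Index of the highest-scoring class, then its label from the model config.
predicted_class = logits.argmax(-1).item()
print(model.config.id2label[predicted_class])
```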