---
license: apache-2.0
tags:
- vit
- image-classification
- beans
- transfer-learning
---

# ViT Beans Model

This model was fine-tuned using transfer learning on the ["beans"](https://huggingface.co/datasets/beans) dataset from the Hugging Face Datasets Hub.
It classifies bean plant leaves into the following categories:

- `LABEL_0`: angular_leaf_spot
- `LABEL_1`: bean_rust
- `LABEL_2`: healthy

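
The generic `LABEL_n` names can be translated into readable class names with a small lookup. This is a minimal sketch that assumes the index order listed in this card; `CLASS_NAMES`, `GENERIC2LABEL`, and `readable_label` are illustrative names and are not part of the model repository.

```python
# Class names in index order, as listed in this card (assumed to match the model head).
CLASS_NAMES = ["angular_leaf_spot", "bean_rust", "healthy"]

# Index -> readable name, and generic "LABEL_n" string -> readable name.
ID2LABEL = dict(enumerate(CLASS_NAMES))
GENERIC2LABEL = {f"LABEL_{i}": name for i, name in ID2LABEL.items()}


def readable_label(label):
    """Accept a class index (int) or a generic "LABEL_n" string (hypothetical helper)."""
    if isinstance(label, int):
        return ID2LABEL[label]
    return GENERIC2LABEL[label]


print(readable_label(1))
print(readable_label("LABEL_2"))
```

The helper accepts either form because `argmax` on the logits yields an integer index, while the `transformers` pipeline API reports the string label.
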
## Model architecture

The base model is `google/vit-base-patch16-224`.

## Training

The model was obtained by transfer learning: a ViT checkpoint pre-trained on ImageNet-21k was fine-tuned on the beans dataset.

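
One common way to set up this kind of transfer learning is to load the pre-trained checkpoint and swap its classification head for a new three-class head. The sketch below is an assumption about the setup, not the training script used for this model: `label_maps` and `build_model` are hypothetical helpers, and the readable label names are optional (this card's model exposes generic `LABEL_n` names instead).

```python
LABELS = ["angular_leaf_spot", "bean_rust", "healthy"]


def label_maps(labels):
    """Build the id2label / label2id dictionaries stored in a model config."""
    id2label = {i: name for i, name in enumerate(labels)}
    label2id = {name: i for i, name in id2label.items()}
    return id2label, label2id


def build_model(checkpoint="google/vit-base-patch16-224"):
    """Load the pre-trained backbone with a fresh 3-class head (downloads weights)."""
    # Imported lazily so label_maps() works without transformers installed.
    from transformers import ViTForImageClassification

    id2label, label2id = label_maps(LABELS)
    return ViTForImageClassification.from_pretrained(
        checkpoint,
        num_labels=len(LABELS),
        id2label=id2label,
        label2id=label2id,
        # The checkpoint ships a 1000-class ImageNet head; allow it to be replaced.
        ignore_mismatched_sizes=True,
    )


print(label_maps(LABELS))
```

Calling `build_model()` returns a model whose backbone weights come from the checkpoint and whose classifier layer is newly initialized, so it still has to be trained on the beans data before use.
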
## Evaluation

This model was compared against zero-shot classification with CLIP (`openai/clip-vit-base-patch32`).

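
Zero-shot classification with CLIP scores an image against one text prompt per class and picks the best match. The following is a rough sketch of that idea, not the evaluation code behind this card: the prompt template and the helpers `build_prompts` and `zero_shot_predict` are assumptions made for illustration.

```python
def build_prompts(class_names, template="a photo of a bean leaf, {}"):
    """Turn class names into text prompts (the template is an assumed choice)."""
    return [template.format(name.replace("_", " ")) for name in class_names]


def zero_shot_predict(image, class_names, checkpoint="openai/clip-vit-base-patch32"):
    """Return the class name whose prompt best matches a PIL image (downloads weights)."""
    # Imported lazily so build_prompts() works without torch/transformers installed.
    import torch
    from transformers import CLIPModel, CLIPProcessor

    model = CLIPModel.from_pretrained(checkpoint)
    processor = CLIPProcessor.from_pretrained(checkpoint)

    prompts = build_prompts(class_names)
    inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        # logits_per_image has shape (num_images, num_prompts).
        logits = model(**inputs).logits_per_image
    return class_names[logits.argmax(dim=-1).item()]


print(build_prompts(["angular_leaf_spot", "bean_rust", "healthy"]))
```

Because no CLIP weights are updated, the quality of the prompts is the main lever in this setup, so a different template may change the results noticeably.
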
### Zero-Shot Results on Oxford Pets (as required)

- **Accuracy**: 0.9993189573287964
- **Precision**: 0.5794700118713081
- **Recall**: 0.10156987264053896
- **Model used**: `openai/clip-vit-base-patch32`

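
For reference, accuracy, precision, and recall can be computed from true and predicted class indices as sketched below. The card does not state how the reported values were averaged; macro averaging over classes is assumed here, and `classification_metrics` is a hypothetical helper with made-up toy inputs.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision and recall (averaging is an assumption)."""
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls = [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Define the score as 0.0 when a class is never predicted / never present.
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)

    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / len(labels),
        "recall": sum(recalls) / len(labels),
    }


# Toy labels for illustration only; these are not results from this model.
metrics = classification_metrics([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0])
print(metrics)
```

Other averaging schemes (micro or per-class reporting) can give very different numbers on the same predictions, which is worth keeping in mind when comparing against the figures above.
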
## Example

```python
from transformers import ViTImageProcessor, ViTForImageClassification
from PIL import Image
import torch

# Load the input image and force three RGB channels (PNG files may carry alpha).
image = Image.open("example_input.png").convert("RGB")

# ViTImageProcessor replaces the deprecated ViTFeatureExtractor.
processor = ViTImageProcessor.from_pretrained("LindiSimon/vit-beans-model")
inputs = processor(images=image, return_tensors="pt")

model = ViTForImageClassification.from_pretrained("LindiSimon/vit-beans-model")

# Run inference without tracking gradients.
with torch.no_grad():
    logits = model(**inputs).logits

# Index of the highest-scoring class (0, 1, or 2).
predicted_class = logits.argmax(-1).item()
print(model.config.id2label[predicted_class])
```