|
|
--- |
|
|
library_name: transformers |
|
|
tags: |
|
|
- image-classification |
|
|
- vit |
|
|
- pytorch |
|
|
license: apache-2.0 |
|
|
language: |
|
|
- en |
|
|
metrics: |
|
|
- accuracy |
|
|
- f1 |
|
|
datasets: |
|
|
- AI-Lab-Makerere/beans |
|
|
--- |
|
|
|
|
|
# Umsakwa/Uddayvit-image-classification-model |
|
|
|
|
|
This Vision Transformer (ViT) model has been fine-tuned for image classification on the [Beans Dataset](https://huggingface.co/datasets/AI-Lab-Makerere/beans), which consists of images of bean leaves categorized into three classes:
|
|
|
|
|
- **Angular Leaf Spot** |
|
|
- **Bean Rust** |
|
|
- **Healthy** |
|
|
|
|
|
## Model Details |
|
|
|
|
|
- **Architecture**: Vision Transformer (ViT) |
|
|
- **Base Model**: `google/vit-base-patch16-224-in21k` |
|
|
- **Framework**: PyTorch |
|
|
- **Task**: Image Classification |
|
|
- **Labels**: 3 (angular_leaf_spot, bean_rust, healthy) |
|
|
- **Input Shape**: 224x224 RGB images |
|
|
- **Training Dataset**: [Beans Dataset](https://huggingface.co/datasets/AI-Lab-Makerere/beans)
|
|
- **Fine-Tuning**: The model was fine-tuned on the Beans dataset to classify plant diseases in beans. |
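As a quick sanity check on the input shape above, the patch arithmetic behind the ViT architecture can be sketched in plain Python; the numbers follow directly from the 224x224 RGB input and the 16x16 patch size implied by the base model name:

```python
# ViT splits the image into non-overlapping patches, which become the transformer's tokens.
image_size, patch_size, channels = 224, 16, 3

patches_per_side = image_size // patch_size      # 224 // 16 = 14
num_patches = patches_per_side ** 2              # 14 * 14 = 196 tokens per image

# Each patch is flattened to a vector before the linear projection into the embedding space.
patch_dim = patch_size * patch_size * channels   # 16 * 16 * 3 = 768

print(num_patches, patch_dim)  # 196 768
```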
|
|
|
|
|
### Model Description |
|
|
|
|
|
The model uses the ViT architecture, which processes image patches using a transformer-based approach. It has been trained to classify bean diseases with high accuracy. This makes it particularly useful for agricultural applications, such as early disease detection and plant health monitoring. |
|
|
|
|
|
- **Developed by**: Udday (Umsakwa) |
|
|
- **Language(s)**: N/A (Image-based) |
|
|
- **License**: Apache-2.0 |
|
|
- **Finetuned from**: `google/vit-base-patch16-224-in21k` |
|
|
|
|
|
### Model Sources |
|
|
|
|
|
- **Repository**: [Umsakwa/Uddayvit-image-classification-model](https://huggingface.co/Umsakwa/Uddayvit-image-classification-model) |
|
|
|
|
|
## Uses |
|
|
|
|
|
### Direct Use |
|
|
|
|
|
This model can be directly used for classifying bean leaf images into one of three categories: angular leaf spot, bean rust, or healthy. |
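The model's raw outputs are logits over the three classes; a softmax turns them into class probabilities. A minimal sketch using hypothetical logit values (not actual model output):

```python
import math

# Hypothetical logits for one image, in the model's label order.
logits = [2.1, 0.3, -1.0]
labels = ["angular_leaf_spot", "bean_rust", "healthy"]

# Numerically stable softmax: subtract the max before exponentiating.
m = max(logits)
exps = [math.exp(x - m) for x in logits]
total = sum(exps)
probs = [e / total for e in exps]

best = labels[probs.index(max(probs))]
print(best, max(probs))
```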
|
|
|
|
|
### Downstream Use |
|
|
|
|
|
The model may also be fine-tuned further for similar agricultural image classification tasks or integrated into larger plant health monitoring systems. |
|
|
|
|
|
### Out-of-Scope Use |
|
|
|
|
|
- The model is not suitable for non-agricultural image classification tasks without further fine-tuning. |
|
|
- Not robust to extreme distortions, occlusions, or very low-resolution images. |
|
|
|
|
|
## Bias, Risks, and Limitations |
|
|
|
|
|
- **Bias**: The dataset may contain biases due to specific environmental or geographic conditions of the sampled plants. |
|
|
- **Limitations**: Performance may degrade on datasets significantly different from the training dataset. |
|
|
|
|
|
### Recommendations |
|
|
|
|
|
- Users should ensure the model is evaluated on their specific dataset before deployment. |
|
|
- Additional fine-tuning may be required for domain-specific applications. |
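When evaluating the model on a new dataset, the metrics declared for this card (accuracy and F1) can be computed directly from predicted and true class ids. A self-contained sketch with hypothetical labels, using an unweighted (macro) F1 across the three classes:

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions matching the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred, num_classes=3):
    """Unweighted mean of per-class F1 scores."""
    f1s = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / num_classes

# Hypothetical ids (0=angular_leaf_spot, 1=bean_rust, 2=healthy) for a tiny eval set.
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 1, 2, 1, 1, 0]
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```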
|
|
|
|
|
## How to Get Started with the Model |
|
|
|
|
|
To use this model for inference: |
|
|
|
|
|
```python |
|
|
from PIL import Image
from transformers import ViTForImageClassification, ViTImageProcessor

# Load model and processor
model = ViTForImageClassification.from_pretrained("Umsakwa/Uddayvit-image-classification-model")
processor = ViTImageProcessor.from_pretrained("Umsakwa/Uddayvit-image-classification-model")

# Prepare an image (the processor expects a PIL image or array, not a file path)
image = Image.open("path_to_image.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

# Run inference and map the top logit to its class label
outputs = model(**inputs)
predicted_id = outputs.logits.argmax(-1).item()
print(model.config.id2label[predicted_id])