---
license: mit
datasets:
  - Sisigoks/Planter_GARDEN_EDITION
language:
  - en
metrics:
  - accuracy
base_model:
  - google/vit-base-patch16-224
pipeline_tag: image-classification
library_name: transformers
tags:
  - biology
  - plants
  - flora
  - 10K
---

🌿 Sisigoks/FloraSense

FloraSense is a Vision Transformer (ViT) model fine-tuned for the classification of plant species and flora-related imagery. It builds on google/vit-base-patch16-224 and was fine-tuned on the Planter_GARDEN_EDITION dataset curated by Sisigoks, which includes over 10,000 diverse plant images.


🧠 Model Description

  • Architecture: Vision Transformer (ViT)
  • Base Model: google/vit-base-patch16-224
  • Task: Image Classification
  • Use Case: Automated plant and flora species recognition in digital botany, garden classification systems, plant care apps, biodiversity projects, and educational tools.

📊 Model Performance

  • Evaluation Accuracy: 35.46%
  • Evaluation Loss: 4.2894
  • Epochs Trained: 10
  • Evaluation Speed:
    • 33.9 samples/sec
    • 2.12 steps/sec

⚠️ While the accuracy may appear moderate, the model must distinguish among over 10,000 highly similar plant species, which makes this a non-trivial fine-grained classification challenge.
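
The accuracy above can be reproduced with a simple loop over the evaluation data. The sketch below is a minimal example, not the author's evaluation script; the "validation" split and the "image"/"label" column names are assumptions, so adjust them to the dataset's actual schema.

from datasets import load_dataset
from transformers import AutoImageProcessor, AutoModelForImageClassification
import torch

processor = AutoImageProcessor.from_pretrained("Sisigoks/FloraSense")
model = AutoModelForImageClassification.from_pretrained("Sisigoks/FloraSense")
model.eval()

# "validation", "image", and "label" are assumed names; check the dataset schema
dataset = load_dataset("Sisigoks/Planter_GARDEN_EDITION", split="validation")

correct = 0
for example in dataset:
    inputs = processor(images=example["image"].convert("RGB"), return_tensors="pt")
    with torch.no_grad():
        pred = model(**inputs).logits.argmax(-1).item()
    correct += int(pred == example["label"])

print(f"Accuracy: {correct / len(dataset):.4f}")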


🧪 Training Procedure

| Hyperparameter        | Value                    |
| --------------------- | ------------------------ |
| Learning Rate         | 5e-5                     |
| Train Batch Size      | 16                       |
| Eval Batch Size       | 16                       |
| Gradient Accumulation | 4                        |
| Total Effective Batch | 64                       |
| Optimizer             | Adam (β1=0.9, β2=0.999)  |
| Scheduler             | Linear w/ warmup (10%)   |
| Epochs                | 15                       |
| Seed                  | 42                       |
  • Framework: PyTorch
  • Libraries: Transformers 4.45.1, Datasets 3.0.1, Tokenizers 0.20.0
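
As a reference point, the table above maps onto Hugging Face TrainingArguments roughly as follows. This is a sketch rather than the author's exact training script; output_dir is a placeholder, and any settings not listed in the table are left at their defaults.

from transformers import TrainingArguments

# Minimal sketch mirroring the hyperparameter table above
training_args = TrainingArguments(
    output_dir="florasense-vit",      # placeholder, not the author's path
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=4,    # 16 x 4 = effective batch of 64
    num_train_epochs=15,              # as listed in the table above
    lr_scheduler_type="linear",
    warmup_ratio=0.1,                 # linear schedule with 10% warmup
    seed=42,
)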

📚 Dataset

  • Name: Sisigoks/Planter_GARDEN_EDITION
  • Type: Image Classification
  • Language: English
  • Scope: Over 10,000 unique plant and floral species
  • Format: Real-world garden and nature photography
  • Use Case: Realistic and diverse training scenarios for classification models
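
The dataset can be pulled directly from the Hub with the datasets library; the available splits and features depend on the dataset's actual configuration, so inspect them before training:

from datasets import load_dataset

# Downloads the dataset from the Hugging Face Hub
dataset = load_dataset("Sisigoks/Planter_GARDEN_EDITION")
print(dataset)  # inspect the available splits and features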

✅ Intended Use

Use Cases

  • Botanical image recognition apps
  • Educational tools for students and researchers
  • Smart gardening & plant care solutions
  • Field-use flora identification via AR and mobile apps

Target Users

  • Botanists
  • AI and ML researchers
  • Gardeners and farmers
  • Biology educators and students

⚠️ Limitations

  • May confuse visually similar species due to fine-grained class diversity.
  • Performance could degrade in poor lighting or occlusion-heavy environments.
  • Biases may exist based on the geographic scope of the dataset (e.g., underrepresentation of tropical or rare plants).

🔍 Ethical Considerations

  • Accuracy: Misclassification of medicinal/toxic plants can have real-world safety implications.
  • Bias: Regional, lighting, or season-specific training data may skew predictions in certain environments.
  • Usage: This is a research-grade model and should not be relied on for critical decisions without expert validation.

🚀 How to Use

from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image
import torch

# Load model and processor
processor = AutoImageProcessor.from_pretrained("Sisigoks/FloraSense")
model = AutoModelForImageClassification.from_pretrained("Sisigoks/FloraSense")

# Load and preprocess the image (convert to RGB to handle grayscale/RGBA inputs)
image = Image.open("your_image.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

# Inference
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits
    predicted_id = logits.argmax(-1).item()

# Map the class ID to its label name as stored in the model config
print(f"Predicted class ID: {predicted_id}")
print(f"Predicted label: {model.config.id2label[predicted_id]}")

📄 Citation

If you use this model or dataset in your work, please cite:

  @misc{sisigoks_florasense_2025,
    author = {Sisigoks},
    title = {FloraSense: ViT-based Fine-Grained Plant Classifier},
    year = {2025},
    publisher = {Hugging Face},
    howpublished = {\url{https://huggingface.co/Sisigoks/FloraSense}}
  }

🙌 Acknowledgements

  • Hugging Face 🤗 – for providing the model and dataset hosting infrastructure.
  • Google Research – for the original ViT architecture that enabled scalable vision transformers.