vit-mushroom-classifier

Fine-tuned google/vit-base-patch16-224 image classifier for 16 mushroom species.

This model was trained for an educational mushroom image classification project. It predicts the most likely species among the 16 trained classes and can be used for candidate ranking, demos, and model comparison. It is not suitable for real-world foraging, edibility, or toxicity decisions.

Model Details

Base model: google/vit-base-patch16-224
Architecture: ViTForImageClassification
Task: single-label image classification
Image size: 224 x 224
Classes: 16 mushroom species
Final model directory: models/vit_mushroom_classifier
Final evaluation date: 2026-06-03
Transformers version in config: 4.57.6
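
A minimal inference sketch follows. It assumes the checkpoint is published on the Hugging Face Hub under the repository id gaglileoo/vit-mushroom-classifier, that the repository ships a preprocessor config, and that torch, transformers, and Pillow are installed. The helper names top_k_from_logits and predict_top_k are illustrative and not part of the project.

```python
import math


def top_k_from_logits(logits, id2label, k=3):
    """Turn raw class logits into the k most probable (label, probability) pairs."""
    peak = max(logits)  # subtract the max for a numerically stable softmax
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


def predict_top_k(image_path, model_id="gaglileoo/vit-mushroom-classifier", k=3):
    """Load the checkpoint and rank candidate species for one image.

    The repository id is an assumption; point model_id at a local directory
    such as models/vit_mushroom_classifier if the weights are stored there.
    """
    # Heavy dependencies are imported lazily so the helper above works alone.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, AutoModelForImageClassification

    processor = AutoImageProcessor.from_pretrained(model_id)
    model = AutoModelForImageClassification.from_pretrained(model_id)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    # Resizing and normalization follow whatever the saved preprocessor config says.
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return top_k_from_logits(logits, model.config.id2label, k)
```

Calling predict_top_k("some_image.jpg") would return three (species, probability) pairs. The probabilities always sum to 1 across the 16 trained classes, so a high value does not mean the image actually shows one of them.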

Classes

Amanita_muscaria
Amanita_phalloides
Armillaria_mellea
Cerioporus_squamosus
Chlorophyllum_brunneum
Clitocybe_nuda
Coprinellus_micaceus
Coprinus_comatus
Flammulina_velutipes
Gliophorus_psittacinus
Hygrophoropsis_aurantiaca
Hypholoma_lateritium
Stereum_hirsutum
Suillus_luteus
Tricholomopsis_rutilans
Tylopilus_felleus

Dataset

The image dataset combines a Kaggle mushroom image dataset with additional GBIF image records for the selected species.

Final dataset composition:

Total images: 11,200
Classes: 16
Images per class: 700
Train images: 7,839
Validation images: 1,680
Test images: 1,681
Split: stratified 70% / 15% / 15% (train / validation / test)
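
As an illustration of what a stratified 70/15/15 split does, here is a small standard-library sketch that splits each class separately. The function name, the seed, and the placeholder class names are assumptions. The project's actual splitting code is not shown in this card.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=42):
    """Assign each item index to 'train', 'val', or 'test', class by class."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    assignment = {}
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        for position, index in enumerate(indices):
            if position < n_train:
                assignment[index] = "train"
            elif position < n_train + n_val:
                assignment[index] = "val"
            else:
                assignment[index] = "test"  # whatever remains goes to test
    return assignment


# 16 placeholder classes with 700 images each, matching the totals above.
labels = [f"class_{c:02d}" for c in range(16) for _ in range(700)]
split_sizes = Counter(stratified_split(labels).values())
print(split_sizes)
```

Splitting each class as 490 / 105 / 105 yields 7,840 / 1,680 / 1,680 images. The reported counts differ by one image in train and test, so the original split was presumably produced with a different tool or rounding rule.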

Training Procedure

The original ImageNet classifier head was replaced with a new 16-class head. Training used two phases:

Phase 1: Train the new classifier head.
Phase 2: Fine-tune the last 4 ViT encoder blocks.
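
The two phases can be sketched as a rule over parameter names. This assumes the parameter naming used by ViTForImageClassification in Transformers (classifier.* for the head, vit.encoder.layer.N.* for encoder blocks), that the head keeps training during fine-tuning, and that everything else, including the final layer norm, stays frozen. Those last two choices are assumptions, as are the helper names.

```python
import re

NUM_ENCODER_BLOCKS = 12  # ViT-Base has 12 encoder blocks
UNFROZEN_BLOCKS = 4      # "last 4" per the description above

_BLOCK_PATTERN = re.compile(r"^vit\.encoder\.layer\.(\d+)\.")


def is_trainable(param_name, phase):
    """Decide whether a parameter is updated in phase 'head' or 'finetune'."""
    if param_name.startswith("classifier."):
        return True  # assumed: the new 16-class head trains in both phases
    if phase == "finetune":
        match = _BLOCK_PATTERN.match(param_name)
        if match:
            # Blocks are numbered 0..11, so the last four are 8, 9, 10, 11.
            return int(match.group(1)) >= NUM_ENCODER_BLOCKS - UNFROZEN_BLOCKS
    return False


def set_trainable(model, phase):
    """Apply the rule to any module exposing named_parameters()."""
    trainable = []
    for name, param in model.named_parameters():
        param.requires_grad = is_trainable(name, phase)
        if param.requires_grad:
            trainable.append(name)
    return trainable
```

With a loaded model, set_trainable(model, "head") would be called before the first phase and set_trainable(model, "finetune") before the second, each followed by building an optimizer over the parameters that require gradients.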

Main configuration:

Batch size: 16
Head training epochs: 1
Fine-tuning epochs: 4
Head learning rate: 3e-4
Fine-tuning learning rate: 2e-5
Weight decay: 0.01
Random crop scale: 0.85-1.0
Color jitter: 0.1
Best model metric: macro F1
Early stopping: enabled, patience 2

Every validation metric improved at each fine-tuning epoch:

Fine-tune epoch	Validation loss	Accuracy	Macro F1	Top-3 accuracy
1	0.3553	0.8994	0.8986	0.9762
2	0.2701	0.9196	0.9195	0.9804
3	0.2437	0.9250	0.9251	0.9833
4	0.2325	0.9304	0.9303	0.9845
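
To show how best-model selection on macro F1 with patience 2 plays out on these numbers, here is a small sketch. The function is illustrative and is not the trainer's own early-stopping code.

```python
def run_early_stopping(metric_per_epoch, patience=2):
    """Return (best_epoch, stopped_epoch); epochs are 1-based.

    stopped_epoch is None when training ran through every epoch.
    """
    best_value = float("-inf")
    best_epoch = None
    epochs_without_gain = 0
    for epoch, value in enumerate(metric_per_epoch, start=1):
        if value > best_value:
            best_value, best_epoch = value, epoch
            epochs_without_gain = 0
        else:
            epochs_without_gain += 1
            if epochs_without_gain >= patience:
                return best_epoch, epoch
    return best_epoch, None


validation_macro_f1 = [0.8986, 0.9195, 0.9251, 0.9303]  # from the table above
best_epoch, stopped_epoch = run_early_stopping(validation_macro_f1, patience=2)
print(best_epoch, stopped_epoch)
```

Because macro F1 rose at every epoch, early stopping never triggers under this rule and the last epoch is also the best one.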

Test Results

Final test set results:

Metric	Score
Accuracy	0.9161
Balanced accuracy	0.9161
Macro F1	0.9162
Top-3 accuracy	0.9780
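
For reference, the four metrics can be computed as in the standard-library sketch below. It follows the usual definitions (macro F1 as the unweighted mean of per-class F1, balanced accuracy as the unweighted mean of per-class recall) and is not the project's evaluation script. The toy labels are made up.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_counts(y_true, y_pred):
    """Count true positives, false positives, and false negatives per class."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1
    return tp, fp, fn


def macro_f1(y_true, y_pred):
    tp, fp, fn = per_class_counts(y_true, y_pred)
    scores = []
    for c in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def balanced_accuracy(y_true, y_pred):
    tp, _, fn = per_class_counts(y_true, y_pred)
    recalls = [tp[c] / (tp[c] + fn[c]) for c in sorted(set(y_true))]
    return sum(recalls) / len(recalls)


def top_k_accuracy(y_true, ranked_predictions, k=3):
    """ranked_predictions[i] lists labels for item i, most likely first."""
    hits = sum(t in ranked[:k] for t, ranked in zip(y_true, ranked_predictions))
    return hits / len(y_true)


# Toy data only, to show the call pattern.
y_true = ["a", "a", "b", "b", "c", "c"]
y_pred = ["a", "b", "b", "b", "c", "a"]
print(accuracy(y_true, y_pred), balanced_accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```

When every class has the same number of test images, balanced accuracy equals plain accuracy, which is consistent with the two identical scores in the table for this nearly balanced test set.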

Bootstrap 95% confidence intervals:

Metric	Mean	95% CI
Accuracy	0.9165	0.9036-0.9286
Macro F1	0.9162	0.9035-0.9282
Top-3 accuracy	0.9780	0.9714-0.9845
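
A percentile bootstrap of this kind can be sketched as follows. The number of resamples, the seed, and the percentile method are assumptions, since the card does not state how the intervals were computed. The example uses 1,540 correct predictions out of 1,681, the only count that rounds to the reported accuracy of 0.9161.

```python
import random


def bootstrap_ci(correct_flags, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap for the mean of 0/1 correctness flags.

    Returns (mean of resampled accuracies, lower bound, upper bound).
    """
    rng = random.Random(seed)
    n = len(correct_flags)
    means = []
    for _ in range(n_resamples):
        sample = rng.choices(correct_flags, k=n)  # resample with replacement
        means.append(sum(sample) / n)
    means.sort()
    lower = means[round((alpha / 2) * n_resamples)]
    upper = means[round((1 - alpha / 2) * n_resamples) - 1]
    return sum(means) / n_resamples, lower, upper


# 1,540 correct out of 1,681 test images (inferred from the reported accuracy).
flags = [1] * 1540 + [0] * 141
mean_accuracy, ci_low, ci_high = bootstrap_ci(flags)
print(f"{mean_accuracy:.4f} ({ci_low:.4f}-{ci_high:.4f})")
```

For macro F1 the same loop would resample (true label, predicted label) pairs and recompute the metric on each resample, since F1 cannot be derived from correctness flags alone.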

Lowest per-class F1 scores:

Class	Precision	Recall	F1
Flammulina_velutipes	0.8302	0.8381	0.8341
Armillaria_mellea	0.8762	0.8762	0.8762
Suillus_luteus	0.9020	0.8762	0.8889
Gliophorus_psittacinus	0.8649	0.9143	0.8889
Hygrophoropsis_aurantiaca	0.8727	0.9143	0.8930
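
As a quick consistency check, each F1 value above can be recomputed from its precision and recall as their harmonic mean. The sketch uses only the numbers in the table.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above.
reported = {
    "Flammulina_velutipes": (0.8302, 0.8381, 0.8341),
    "Armillaria_mellea": (0.8762, 0.8762, 0.8762),
    "Suillus_luteus": (0.9020, 0.8762, 0.8889),
    "Gliophorus_psittacinus": (0.8649, 0.9143, 0.8889),
    "Hygrophoropsis_aurantiaca": (0.8727, 0.9143, 0.8930),
}

for name, (precision, recall, f1) in reported.items():
    recomputed = f1_score(precision, recall)
    print(f"{name}: reported {f1:.4f}, recomputed {recomputed:.4f}")
```

All five recomputed values agree with the reported ones to four decimal places.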

Common confusion patterns included:

Suillus_luteus predicted as Tylopilus_felleus
Coprinus_comatus predicted as Chlorophyllum_brunneum
Several confusions involving Flammulina_velutipes, Gliophorus_psittacinus, and Hygrophoropsis_aurantiaca
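
Confusion patterns like these are typically found by counting (true, predicted) pairs among the misclassified test images. The sketch below shows one way to do that. The function name and the toy labels are illustrative and do not reproduce the project's actual predictions.

```python
from collections import Counter


def top_confusions(y_true, y_pred, n=5):
    """Return the n most frequent (true, predicted) pairs among the errors."""
    pairs = Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)
    return pairs.most_common(n)


# Toy labels only, to show the output shape.
toy_true = ["Suillus_luteus"] * 3 + ["Coprinus_comatus"] * 2 + ["Armillaria_mellea"]
toy_pred = [
    "Tylopilus_felleus",
    "Tylopilus_felleus",
    "Suillus_luteus",
    "Chlorophyllum_brunneum",
    "Coprinus_comatus",
    "Armillaria_mellea",
]
for (true_label, predicted_label), count in top_confusions(toy_true, toy_pred):
    print(f"{true_label} predicted as {predicted_label}: {count}")
```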

Known Limitations

The model only predicts among the 16 trained species.
It does not know whether an image belongs to an unseen species.
It may make confident mistakes on visually similar species.
Exact duplicate image hashes were found in the dataset: 55 duplicate rows in 27 hash groups, of which 19 groups span more than one of the train/validation/test splits. This leakage may cause test performance to be slightly overestimated.
The dataset combines images from different sources, so source-specific visual patterns may influence predictions.
The model must not be used for mushroom consumption, toxicity, or foraging decisions.
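
The duplicate check mentioned above can be approximated with content hashes. The sketch below assumes SHA-256 over the raw file bytes and counts every row that shares a hash with another row as a duplicate row. The hash function and counting convention the project actually used are not stated, and the function names are illustrative.

```python
import hashlib
from collections import defaultdict


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file's bytes in chunks so large images need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def summarize_duplicates(rows):
    """rows: iterable of (content_hash, split_name) pairs, one per image."""
    splits_by_hash = defaultdict(list)
    for content_hash, split in rows:
        splits_by_hash[content_hash].append(split)
    # A hash group is any hash shared by two or more rows.
    groups = {h: s for h, s in splits_by_hash.items() if len(s) > 1}
    return {
        "duplicate_rows": sum(len(s) for s in groups.values()),
        "hash_groups": len(groups),
        "cross_split_groups": sum(len(set(s)) > 1 for s in groups.values()),
    }
```

Exact hashing only catches byte-identical files. Resized or re-encoded copies of the same photo would need a perceptual hash or embedding similarity, so the true overlap between splits could be larger than the reported figure.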

Intended Use

Suitable uses:

Educational image classification demos
Candidate species ranking among the 16 known classes
Comparison against structured ecological/context models
Prototype user interfaces for mushroom image classification

Unsuitable uses:

Real-world mushroom identification without expert review
Edibility or toxicity classification
Safety-critical decisions
Open-world species recognition beyond the 16 labels

Summary

The final ViT model reached about 91.6% macro F1 and 97.8% top-3 accuracy on a balanced 16-class test set, with narrow bootstrap confidence intervals. The result is promising for educational candidate prediction, but because duplicate images cross the splits, a deduplicated re-split and re-evaluation is recommended before treating the score as final.

Model size: 85.8M params
Tensor type: F32 (Safetensors)
