Update model card with complete documentation
---
library_name: transformers
pipeline_tag: image-classification
license: mit
tags:
- vision
- image-classification
- biology
- ecology
- phenology
- plants
- vit
- plant-phenology
- iNaturalist
datasets:
- iNaturalist
metrics:
- accuracy
- f1
language:
- en
model-index:
- name: PhenoVision
  results:
  - task:
      type: image-classification
      name: Plant Reproductive Structure Detection
    metrics:
    - type: accuracy
      value: 98.02
      name: Flower Accuracy (buffer-filtered)
    - type: accuracy
      value: 97.01
      name: Fruit Accuracy (buffer-filtered)
---

# PhenoVision: Automated Plant Reproductive Phenology from Field Images

PhenoVision is a Vision Transformer (ViT-Large) model fine-tuned to detect **flowers** and **fruits** in plant photographs. It was trained on 1.5 million human-annotated iNaturalist images and has been used to generate over 30 million new phenology records across 119,000+ plant species, vastly expanding global coverage of plant reproductive phenology data.

| | Flower | Fruit |
|---|---|---|
| **Accuracy** | 98.0% | 97.0% |
| **Sensitivity** | 98.5% | 84.2% |
| **Specificity** | 97.2% | 99.4% |
| **Expert validation** | 98.6% | 90.4% |

## Model Details

- **Model type:** Multi-label image classification (sigmoid outputs)
- **Architecture:** Vision Transformer Large (ViT-L/16), ~304M parameters
- **Input:** 224 × 224 RGB images
- **Output:** 2 logits (flower, fruit); apply a sigmoid to obtain probabilities
- **Pretraining:** PlantCLEF 2022 "virtual taxonomist" checkpoint ([xml94/PlantCLEF2022](https://github.com/xml94/PlantCLEF2022)), trained on 2.9M plant species images
- **Current version:** v1.1.0
- **Model DOI:** [10.57967/hf/7952](https://doi.org/10.57967/hf/7952)
- **Developer:** [Phenobase](https://phenobase.org/)
- **Repository:** [github.com/Phenobase/phenovision](https://github.com/Phenobase/phenovision)
- **License:** MIT

### Key Innovation: Virtual Taxonomist Pretraining

Instead of standard ImageNet pretraining, PhenoVision starts from a ViT-Large checkpoint pretrained on the PlantCLEF 2022 dataset (2.9 million plant images labeled for species classification). Because species identification relies heavily on recognizing reproductive structures such as flowers and fruits, this domain-specific pretraining provides a strong initialization for phenology detection. Compared to ImageNet pretraining, the PlantCLEF checkpoint achieved:

- Higher accuracy: TSS = 0.864 vs. 0.835
- Faster convergence: best epoch at 4 vs. 11

## Intended Uses

**Primary use:** detecting the presence of flowers and/or fruits in field photographs of plants.

**Suitable for:**

- Automated phenology annotation of iNaturalist and other community science images
- Large-scale phenology monitoring and climate change research
- Generating presence-only reproductive phenology datasets
- Integration with phenology databases (e.g., [Phenobase](https://phenobase.org/), USA-NPN)

**Out of scope:**

- Counting individual flowers or fruits
- Distinguishing flower developmental stages (buds vs. open vs. senescent)
- Detecting leaf phenology (use [PhenoVisionL](https://huggingface.co/phenobase/phenovisionL) instead)
- Identifying plant species (this is a phenology model, not a taxonomic classifier)

## How to Use

```python
from transformers import ViTForImageClassification, ViTImageProcessor
from PIL import Image
import torch

# Load model and processor
processor = ViTImageProcessor.from_pretrained("phenobase/phenovision")
model = ViTForImageClassification.from_pretrained("phenobase/phenovision")
model.eval()

# Run inference
image = Image.open("plant_photo.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.sigmoid(outputs.logits)[0]

flower_prob = probs[0].item()
fruit_prob = probs[1].item()

print(f"Flower: {flower_prob:.3f}")
print(f"Fruit: {fruit_prob:.3f}")
```

### Applying Thresholds

Raw probabilities should be converted to detection calls using the optimized thresholds and uncertainty buffers provided as companion files. Predictions falling within the buffer zone are classified as "Equivocal" and should be excluded for research-quality outputs.

| Class | Threshold | Buffer Lower | Buffer Upper | Equivocal Range |
|--------|-----------|--------------|--------------|-----------------|
| Flower | 0.48 | 0.325 | 0.385 | 0.155 - 0.865 |
| Fruit | 0.60 | 0.405 | 0.305 | 0.195 - 0.905 |

- Probability **above** threshold + buffer_upper → **Detected** (high certainty)
- Probability **below** threshold - buffer_lower → **Not Detected** (high certainty)
- Probability **within** the buffer zone → **Equivocal** (exclude from analysis)

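As a concrete illustration, the decision rule above can be sketched as a small helper. The threshold and buffer values are copied from the table; the function itself is ours, not part of the released code:

```python
# Sketch of the buffer-based decision rule described above. Threshold and
# buffer values come from the table (see final_buffer_params.csv); the
# function is illustrative, not part of the released package.
BUFFER_PARAMS = {
    "flower": {"threshold": 0.48, "buffer_lower": 0.325, "buffer_upper": 0.385},
    "fruit":  {"threshold": 0.60, "buffer_lower": 0.405, "buffer_upper": 0.305},
}

def classify(prob: float, trait: str) -> str:
    """Map a sigmoid probability to Detected / Not Detected / Equivocal."""
    p = BUFFER_PARAMS[trait]
    if prob > p["threshold"] + p["buffer_upper"]:
        return "Detected"
    if prob < p["threshold"] - p["buffer_lower"]:
        return "Not Detected"
    return "Equivocal"  # inside the uncertainty buffer; exclude from analysis

print(classify(0.92, "flower"))  # Detected (above 0.865)
print(classify(0.50, "fruit"))   # Equivocal (inside 0.195-0.905)
print(classify(0.10, "flower"))  # Not Detected (below 0.155)
```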
## Training Data

- **Source:** [iNaturalist](https://www.inaturalist.org/) open data (research-grade observations)
- **Size:** 1,535,930 images from 119,340 species across 10,406 genera and 408 plant families
- **Splits:** 60% train (921,720) / 20% validation (307,291) / 20% test (306,919), stratified by genus
- **Annotations:** Human phenology annotations from the iNaturalist platform (reproductiveCondition field)
- **Licensing:** Images under CC-0, CC-BY, or CC-BY-NC licenses
- **Note:** Approximately 1-5% of training annotations are marked "unknown" due to annotation difficulty

## Training Procedure

- **Optimizer:** AdamW
- **Learning rate:** 5e-4 (base), with layer-wise decay factor 0.65
- **Batch size:** 384
- **Weight decay:** 0.05
- **Data augmentation:** RandAugment
- **Epochs:** 10 (best model selected at epoch 7 by average Data Quality Index)
- **Hardware:** NVIDIA A100 GPU
- **Loss:** Binary cross-entropy (multi-label)
- **v1.1.0 training:** Fine-tuned from the v1.0.0 checkpoint on an updated data snapshot (2025-10-27)

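Layer-wise learning-rate decay means earlier transformer blocks receive smaller learning rates than later ones. A minimal sketch, assuming the common convention `lr_i = base_lr * decay ** (num_layers - i)` (the exact parameter grouping used in training is not published in this card):

```python
# Illustrative layer-wise LR decay schedule with the hyperparameters above
# (base LR 5e-4, decay factor 0.65). This assumes the common convention
# lr_i = base_lr * decay ** (num_layers - i); ViT-L has 24 transformer
# blocks, and the classifier head (index num_layers) keeps the base LR.
def layerwise_lrs(base_lr=5e-4, decay=0.65, num_layers=24):
    return [base_lr * decay ** (num_layers - i) for i in range(num_layers + 1)]

lrs = layerwise_lrs()
print(f"embedding lr:  {lrs[0]:.2e}")   # smallest: deepest-decayed
print(f"last block lr: {lrs[-2]:.2e}")  # base_lr * 0.65
print(f"head lr:       {lrs[-1]:.2e}")  # equals base_lr
```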
## Evaluation Results

### Test Set Performance (v1.1.0)

| Class | Filter | N | Accuracy | Sensitivity | Specificity | PPV | NPV | J-Index | F1 | DQI |
|-------|--------|---|----------|-------------|-------------|-----|-----|---------|-----|-----|
| Flower | All data | 713,698 | 95.77% | 96.93% | 93.72% | 96.45% | 94.54% | 0.907 | 96.69% | 0.934 |
| Flower | Buffer filtered | 663,738 | 98.02% | 98.47% | 97.19% | 98.48% | 97.19% | 0.957 | 98.48% | 0.970 |
| Fruit | All data | 713,698 | 94.33% | 77.33% | 98.04% | 89.64% | 95.18% | 0.754 | 83.03% | 0.670 |
| Fruit | Buffer filtered | 651,791 | 97.01% | 84.16% | 99.37% | 96.11% | 97.16% | 0.835 | 89.74% | 0.803 |

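The J-Index column is Youden's J statistic (sensitivity + specificity - 1), equivalent to the TSS figure quoted earlier, and can be reproduced from the table:

```python
# Youden's J (= TSS) recomputed from the buffer-filtered rows of the table above.
def youdens_j(sensitivity, specificity):
    return sensitivity + specificity - 1

print(round(youdens_j(0.9847, 0.9719), 3))  # flower: 0.957
print(round(youdens_j(0.8416, 0.9937), 3))  # fruit:  0.835
```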
### Expert Validation

Independent expert review of model predictions:

- **Flower presence:** 98.6% agreement
- **Fruit presence:** 90.4% agreement

### Taxonomic Coverage

- **Species:** 119,340 from 10,406 genera and 408 families
- **Genera with 10+ records:** 7,409 (flowers), 5,240 (fruits)
- **Median records per genus:** 184 (flowers), 85 (fruits)
- **New geographic grid cells:** 3,798 (flowers), 4,147 (fruits) with no prior phenology data

## Companion Files

The following files are uploaded alongside the model weights:

| File | Description |
|------|-------------|
| `final_buffer_params.csv` | Decision thresholds and uncertainty buffer parameters per class. Used to convert probabilities to Detected/Not Detected/Equivocal calls. |
| `family_stats.csv` | Per-family (706 families) accuracy statistics. Useful for assessing model reliability for specific taxonomic groups. |

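The buffer-parameter file can be read with the standard library. The column names below (`class`, `threshold`, `buffer_lower`, `buffer_upper`) are assumptions inferred from the thresholds table, so check the actual CSV header before relying on them:

```python
import csv
import io

# Hypothetical contents mirroring final_buffer_params.csv; the real header
# and column names may differ, so inspect the file first.
SAMPLE = """class,threshold,buffer_lower,buffer_upper
flower,0.48,0.325,0.385
fruit,0.60,0.405,0.305
"""

# In practice, replace io.StringIO(SAMPLE) with open("final_buffer_params.csv").
params = {
    row["class"]: {k: float(row[k]) for k in ("threshold", "buffer_lower", "buffer_upper")}
    for row in csv.DictReader(io.StringIO(SAMPLE))
}
print(params["flower"]["threshold"])  # 0.48
```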
## Limitations and Biases

### Design Limitations

- **Presence-only:** The model reports detections but NOT absences. A non-detection does not mean the plant lacks flowers or fruits; they may simply not be visible in the image.
- **Partial plant coverage:** Images typically show only part of a plant. Reproductive structures may exist on parts that were not photographed.
- **Buffer zone data loss:** Applying uncertainty thresholds removes ~7-9% of predictions as equivocal, trading completeness for accuracy.

### Known Failure Modes

- Inconspicuous reproductive structures (grasses, sedges) are harder to detect
- Flower buds may be confused with open flowers
- Background plants with flowers or fruits can cause false positives for the focal plant
- Some families show lower accuracy (e.g., Haloragaceae ~79%)

### Data Biases

- Reflects iNaturalist's geographic biases: overrepresentation of urban areas, developed countries, and coastal regions
- Taxonomic bias toward common, conspicuous species
- Limited coverage in biodiversity-rich tropical regions

### Annotation Quality

- Training labels come from community science annotations with inherent variability
- Some iNaturalist annotations are incomplete (e.g., a flower is present but only the fruit was annotated)
- Family-level accuracy statistics (in `family_stats.csv`) should be consulted when interpreting results for specific taxonomic groups

## Citation

If you use PhenoVision in your research, please cite:

```bibtex
@article{dinnage2025phenovision,
  title={PhenoVision: A framework for automating and delivering research-ready plant phenology data from field images},
  author={Dinnage, Russell and Grady, Erin and Neal, Nevyn and Deck, John and Denny, Ellen and Walls, Ramona and Seltzer, Carrie and Guralnick, Robert and Li, Daijiang},
  journal={Methods in Ecology and Evolution},
  volume={16},
  pages={1763--1780},
  year={2025},
  doi={10.1111/2041-210X.14346}
}
```

## Acknowledgments

- **Funding:** National Science Foundation (NSF)
- **Data:** [iNaturalist](https://www.inaturalist.org/) community and platform
- **Infrastructure:** [Phenobase](https://phenobase.org/), a global plant phenology database
- **Integration:** Plant Phenology Ontology (PPO), USA National Phenology Network (USA-NPN)