Patrick Daniel committed
Commit fd25d05 · 1 Parent(s): f86b3dc
Files changed (3)
  1. .DS_Store +0 -0
  2. README.md +200 -11
  3. app.py +3 -3
.DS_Store CHANGED
Binary files a/.DS_Store and b/.DS_Store differ
 
README.md CHANGED
@@ -1,14 +1,203 @@
1
  ---
2
- title: IfcbSingleClassifier
3
- emoji: 👀
4
- colorFrom: green
5
- colorTo: green
6
- sdk: gradio
7
- sdk_version: 5.34.2
8
- app_file: app.py
9
- pinned: false
10
- license: mit
11
- short_description: Classify Imaging FlowCytobot images
12
  ---
13
 
14
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
1
  ---
2
+ datasets:
3
+ - patcdaniel/Phytoplankton-UCSC-IFCB-20250801
4
+ pipeline_tag: image-classification
5
+ base_model: google/vit-base-patch16-224-in21k
6
+ tags:
7
+ - image-classification
8
+ - vision-transformer
9
+ - phytoplankton
10
+ - oceanography
11
+ - marine-science
12
+ license: apache-2.0
13
+ model_name: phytoViT_558k_Aug2025
14
+ finetuned_from: google/vit-base-patch16-224-in21k
15
  ---
16
 
17
+ # Model Card for phytoViT_558k_Aug2025
18
+
19
+ ## Model Details
20
+
21
+ ### Model Description
22
+
23
+ UCSCPhytoViT83 is a Vision Transformer (ViT) model fine-tuned to classify phytoplankton in labeled images collected from the Imaging FlowCytobot (IFCB) at UCSC. It is fine-tuned from the pre-trained `google/vit-base-patch16-224-in21k` base model. Training images were aggregated from three IFCB instruments: [IFCB104](https://ifcb.caloos.org/timeline?dataset=santa-cruz-municipal-wharf), [IFCB161](https://ifcb.caloos.org/timeline?dataset=mbari-power-buoy), and [IFCB116](https://ifcb.caloos.org/timeline?dataset=san-francisco-bay-cruises).
24
+
25
+ - **Developed by:** Patrick Daniel
26
+ - **Model type:** Vision Transformer for image classification
27
+ - **License:** Apache 2.0
28
+ - **Finetuned from model:** google/vit-base-patch16-224-in21k
29
+
30
+ ### Model Sources
31
+
32
+ - **Repository:** [More Information Needed]
33
+
34
+ ## Uses
35
+
36
+ ### Direct Use
37
+
38
+ This model can be used directly to classify phytoplankton images captured by Imaging FlowCytobot (IFCB) instruments. Training focused on capturing the variability of the phytoplankton community in Monterey Bay, CA, USA. The model is intended for researchers.
39
+
40
+
41
+
42
+ Images should be transformed before inference:
43
+
44
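+ ```python
+ from torchvision import transforms
+
+ preprocess = transforms.Compose([
+     transforms.Resize((224, 224)),  # match ViT input size
+     transforms.ToTensor(),  # PIL image -> float tensor in [0, 1]; Normalize expects a tensor
+     transforms.Normalize(mean=(0.485, 0.456, 0.406),
+                          std=(0.229, 0.224, 0.225)),
+ ])
+ ```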
51
+
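As a rough illustration of what the `Normalize` step computes, the sketch below applies the same per-channel formula, `(x - mean) / std`, to a single RGB pixel in pure Python. The helper name `normalize_pixel` is hypothetical, and the pixel is assumed to be already scaled to the range [0, 1].

```python
# ImageNet channel statistics, as used in the transform above.
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb, mean=MEAN, std=STD):
    """Apply (x - mean) / std to each channel of one RGB pixel in [0, 1]."""
    return tuple((x - m) / s for x, m, s in zip(rgb, mean, std))


# A pure white pixel and a pixel equal to the channel means.
white = normalize_pixel((1.0, 1.0, 1.0))
mean_pixel = normalize_pixel(MEAN)
print(white)
print(mean_pixel)
```

A pixel equal to the channel means maps to zero in every channel, which is the point of the step: inputs end up roughly centered and scaled per channel.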
52
+ ### Out-of-Scope Use
53
+
54
+ This model is not intended for classifying non-phytoplankton images or images from other microscopy systems without retraining or adaptation. The IFCB training images came from instruments that trigger on PMT-B, so particles and cells with little or no chlorophyll may be poorly represented.
55
+
56
+ ## Bias, Risks, and Limitations
57
+
58
+ The model was trained on IFCB images collected at UCSC and MBARI, mostly in Monterey Bay and San Francisco Bay, CA, USA, so it may not generalize well to images from other instruments or regions. Users should validate model predictions with domain experts when possible.
59
+
60
+ ## How to Get Started with the Model
61
+
62
+ Install the `transformers` library, load the fine-tuned weights into a ViT image-classification model, and preprocess images with the transform shown under Direct Use.
63
+
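A minimal sketch of that workflow, assuming the weights are hosted in `patcdaniel/UCSCPhytoViT83` as in this commit's `app.py`. `load_model` and `top_k` are hypothetical helper names. The heavy imports live inside `load_model` so that the post-processing helper can be tried without downloading anything.

```python
import math


def load_model(repo_id="patcdaniel/UCSCPhytoViT83", num_labels=83):
    """Download the fine-tuned weights and load them into a ViT classifier."""
    # Imported lazily: these packages are only needed when loading the real model.
    from huggingface_hub import hf_hub_download
    from safetensors.torch import load_file
    from transformers import ViTForImageClassification

    weights = hf_hub_download(repo_id=repo_id, filename="model.safetensors")
    model = ViTForImageClassification.from_pretrained(
        "google/vit-base-patch16-224-in21k",
        num_labels=num_labels,  # must match the number of training classes
    )
    model.load_state_dict(load_file(weights))
    return model.eval()


def top_k(logits, labels, k=3):
    """Turn one image's raw logits into the k most probable (label, probability) pairs."""
    peak = max(logits)  # subtract the max for a numerically stable softmax
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Toy logits for three class names, just to show the output shape.
print(top_k([2.0, 0.5, -1.0], ["Akashiwo", "Alexandrium", "Chaetoceros"], k=2))
```

For a real image, the logits would come from something like `model(pixel_values=batch).logits[0].tolist()`, where `batch` is the preprocessed image tensor with a leading batch dimension.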
64
+ ## Training Details
65
+
66
+ ### Training Data
67
+
68
+ The model was trained on approximately 558,000 labeled IFCB images representing 83 classes.
69
+ ### Training Procedure
70
+
71
+ - **Preprocessing:** Images were resized to 224 × 224 and normalized with the ImageNet channel means and standard deviations (see the transform under Direct Use).
72
+
73
+ ## Evaluation
74
+
75
+ ### Testing Data, Factors & Metrics
76
+
77
+ - The model was evaluated on a held-out test set of IFCB images.
78
+ - Metrics include accuracy, precision, recall, and F1-score across phytoplankton classes.
79
+
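For readers unfamiliar with these metrics, the sketch below computes precision, recall, and F1 for one class from paired true and predicted labels. It is an illustration only: the evaluation code behind this card is not shown, the function names are hypothetical, and the labels are toy data.

```python
def per_class_scores(y_true, y_pred, label):
    """Return (precision, recall, f1) for one class from paired label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)  # correctly predicted
    fp = sum(t != label and p == label for t, p in pairs)  # predicted but wrong
    fn = sum(t == label and p != label for t, p in pairs)  # missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def accuracy(y_true, y_pred):
    """Fraction of images whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels, not real model output.
y_true = ["Akashiwo", "Akashiwo", "Centric", "Centric", "Centric"]
y_pred = ["Akashiwo", "Centric", "Centric", "Centric", "Akashiwo"]

print(per_class_scores(y_true, y_pred, "Centric"))
print(accuracy(y_true, y_pred))
```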
80
+ ### Results
81
+
82
+ | Label name | Precision | Recall | F1-score | Support (eval images) |
83
+ |:-------------------------------|------------:|---------:|-----------:|--------------:|
84
+ | Akashiwo | 0.980405 | 0.984656 | 0.982526 | 2998 |
85
+ | Alexandrium | 0.972328 | 0.968642 | 0.970482 | 2902 |
86
+ | Amylax_Gonyaulax_Protoceratium | 0.987234 | 0.983051 | 0.985138 | 236 |
87
+ | Asterionellopsis | 0.982877 | 0.979522 | 0.981197 | 586 |
88
+ | Asteromphalus | 0.990488 | 0.988138 | 0.989311 | 843 |
89
+ | Bad_setae | 0.981581 | 0.969674 | 0.975591 | 1319 |
90
+ | Centric | 0.886133 | 0.848779 | 0.867054 | 2989 |
91
+ | Ceratium_divaricatum | 0.994825 | 0.97096 | 0.982748 | 792 |
92
+ | Ceratium_furca | 0.962202 | 0.966172 | 0.964183 | 1212 |
93
+ | Ceratium_lineatum | 0.975992 | 0.986287 | 0.981112 | 1896 |
94
+ | Chaetoceros | 0.944537 | 0.948 | 0.946265 | 3000 |
95
+ | Ciliate_large | 0.958333 | 0.974576 | 0.966387 | 118 |
96
+ | Ciliate_large_2 | 0.959091 | 0.96789 | 0.96347 | 218 |
97
+ | Ciliate_other_morpho_1 | 0.915578 | 0.918347 | 0.91696 | 992 |
98
+ | Clusterflagellate_morpho_1 | 0.994539 | 0.982468 | 0.988467 | 1483 |
99
+ | Clusterflagellate_morpho_2 | 0.992734 | 0.996354 | 0.99454 | 1097 |
100
+ | Corethron | 0.998889 | 0.99778 | 0.998334 | 901 |
101
+ | Cryptophyte | 0.951977 | 0.968391 | 0.960114 | 1740 |
102
+ | Cylindrotheca | 0.925259 | 0.969143 | 0.946693 | 1750 |
103
+ | Detonula_Cerataulina_Lauderia | 0.840866 | 0.880667 | 0.860306 | 3000 |
104
+ | Detritus | 0.971975 | 0.987915 | 0.97988 | 2317 |
105
+ | Detritus_infection | 0.996717 | 0.996308 | 0.996513 | 2438 |
106
+ | Dictyocha | 0.997705 | 0.995421 | 0.996562 | 2184 |
107
+ | Dinoflagellate_cyst | 1 | 1 | 1 | 17 |
108
+ | Dinoflagellate_morpho_1 | 0.95098 | 0.984772 | 0.967581 | 394 |
109
+ | Dinoflagellate_morpho_2 | 0.93253 | 0.940081 | 0.93629 | 2470 |
110
+ | Dinophysis | 0.986971 | 0.988581 | 0.987775 | 1226 |
111
+ | Ditylum | 0.994619 | 0.996406 | 0.995512 | 1113 |
112
+ | Entomoneis | 0.972626 | 0.978485 | 0.975547 | 1162 |
113
+ | Eucampia | 0.977153 | 0.926667 | 0.95124 | 3000 |
114
+ | Euglenoid | 0.972408 | 0.965145 | 0.968763 | 2410 |
115
+ | Flagellate_morpho_1 | 0.966153 | 0.96132 | 0.963731 | 2999 |
116
+ | Flagellate_morpho_2 | 0.942211 | 0.974026 | 0.957854 | 385 |
117
+ | Flagellate_morpho_3 | 0.951259 | 0.969333 | 0.960211 | 3000 |
118
+ | Flagellate_nano_1 | 0.956818 | 0.981352 | 0.96893 | 429 |
119
+ | Flagellate_nano_2 | 0.988124 | 0.978824 | 0.983452 | 425 |
120
+ | Fragilariopsis | 0.900064 | 0.939667 | 0.919439 | 3000 |
121
+ | Guinardia_Dactyliosolen | 0.806818 | 0.913603 | 0.856897 | 544 |
122
+ | Gymnodinium | 0.830748 | 0.867452 | 0.848703 | 679 |
123
+ | Gyrodinium | 0.988604 | 0.991429 | 0.990014 | 1050 |
124
+ | Gyrosigma | 0.946237 | 0.946237 | 0.946237 | 93 |
125
+ | Haptophyte_prymnesium | 0.622642 | 0.673469 | 0.647059 | 49 |
126
+ | Hemiaulus | 0.903226 | 0.903226 | 0.903226 | 155 |
127
+ | Hemiselmis | 0.950862 | 0.974 | 0.962292 | 3000 |
128
+ | Heterocapsa_long | 0.958763 | 0.894231 | 0.925373 | 104 |
129
+ | Heterocapsa_rotundata | 0.964509 | 0.884211 | 0.922616 | 1045 |
130
+ | Heterocapsa_triquetra | 0.803571 | 0.656934 | 0.722892 | 137 |
131
+ | Heterosigma_akashiwo | 1 | 0.998477 | 0.999238 | 1313 |
132
+ | Laboea | 0.990521 | 0.987402 | 0.988959 | 635 |
133
+ | Leptocylindrus | 0.965558 | 0.949766 | 0.957597 | 856 |
134
+ | Margalefidinium | 0.973141 | 0.975378 | 0.974258 | 3046 |
135
+ | Mesodinium | 0.9583 | 0.962933 | 0.960611 | 2482 |
136
+ | Nano_cluster | 0.982955 | 0.997118 | 0.989986 | 347 |
137
+ | Nano_p_white | 0.982298 | 0.975951 | 0.979114 | 2786 |
138
+ | Noctiluca | 1 | 0.965517 | 0.982456 | 29 |
139
+ | Odontella | 1 | 1 | 1 | 30 |
140
+ | Pennate | 0.909332 | 0.864695 | 0.886452 | 3178 |
141
+ | Pennate_Tropidoneis | 0.837209 | 0.742268 | 0.786885 | 97 |
142
+ | Pennate_Unknown | 0.84127 | 0.828125 | 0.834646 | 64 |
143
+ | Pennate_small | 0.843373 | 0.864198 | 0.853659 | 405 |
144
+ | Peridinium | 0.968435 | 0.969086 | 0.96876 | 1488 |
145
+ | Phaeocystis | 0.994502 | 0.997931 | 0.996213 | 1450 |
146
+ | Pleurosigma | 0.991379 | 0.963149 | 0.97706 | 597 |
147
+ | Polykrikos | 0.997099 | 0.995174 | 0.996135 | 1036 |
148
+ | Proboscia | 0.992593 | 0.985294 | 0.98893 | 136 |
149
+ | Prorocentrum_narrow | 0.981952 | 0.981952 | 0.981952 | 2992 |
150
+ | Prorocentrum_wide | 0.988893 | 0.991463 | 0.990176 | 2694 |
151
+ | Pseudo-nitzschia | 0.956324 | 0.977674 | 0.966881 | 1075 |
152
+ | Pyramimonas | 1 | 0.982379 | 0.991111 | 227 |
153
+ | Rhizosolenia | 0.996008 | 0.984221 | 0.990079 | 507 |
154
+ | Scrippsiella | 0.960588 | 0.931015 | 0.94557 | 1754 |
155
+ | Skeletonema | 0.98632 | 0.993113 | 0.989705 | 1452 |
156
+ | Spiky_pacman | 0.961072 | 0.958908 | 0.959989 | 3553 |
157
+ | Stombidinium_morpho_1 | 0.919847 | 0.909434 | 0.914611 | 265 |
158
+ | Strombidinum_morpho_2 | 0.966399 | 0.940633 | 0.953342 | 2813 |
159
+ | Thalassionema | 0.989882 | 0.991554 | 0.990717 | 592 |
160
+ | Thalassiosira | 0.924272 | 0.931667 | 0.927955 | 3000 |
161
+ | Tiarina | 0.997843 | 0.996767 | 0.997305 | 928 |
162
+ | Tontonia | 0.954167 | 0.938525 | 0.946281 | 244 |
163
+ | Torodinium | 0.994792 | 0.990493 | 0.992638 | 1157 |
164
+ | Tropidoneis | 1 | 0.993569 | 0.996774 | 311 |
165
+ | Vicicitus | 0.943284 | 0.954683 | 0.948949 | 331 |
166
+ | haptophyte_ucynA_host | 1 | 0.998532 | 0.999265 | 2043 |
167
+ | accuracy | | | 0.958662 | 111810 |
168
+ | macro avg | 0.953973 | 0.951658 | 0.952527 | 111810 |
169
+ | weighted avg | 0.958948 | 0.958662 | 0.958652 | 111810 |
170
+
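The two average rows at the bottom of the table differ in how they weight classes. The sketch below uses three rows copied from the table to show the difference; it does not reproduce the full-table averages, and the function names are hypothetical.

```python
# (label, f1, support) copied from three rows of the table above.
rows = [
    ("Akashiwo", 0.982526, 2998),
    ("Haptophyte_prymnesium", 0.647059, 49),
    ("Heterocapsa_triquetra", 0.722892, 137),
]


def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def macro_avg(rows):
    """Unweighted mean: every class counts equally, however rare."""
    return sum(f1 for _, f1, _ in rows) / len(rows)


def weighted_avg(rows):
    """Support-weighted mean: classes with more images count more."""
    total = sum(n for _, _, n in rows)
    return sum(f1 * n for _, f1, n in rows) / total


# Recompute the Akashiwo F1 from its precision and recall in the table.
print(f1_score(0.980405, 0.984656))
print(macro_avg(rows), weighted_avg(rows))
```

Because the two weaker classes here have few evaluation images, they pull the macro average down much more than the weighted average. The same effect explains why the macro average in the table sits slightly below the weighted average.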
171
+ ![Confusion Matrix](TrainingResults/confusion_matrix_heatmap.png)
172
+
173
+
174
+ ## Technical Specifications
175
+
176
+ ### Model Architecture and Objective
177
+
178
+ ## Citation
179
+
180
+ If you use this model in your research, please cite:
181
+
182
+ **APA:**
183
+
184
+ Daniel, P. (2025). phytoViT_558k_Aug2025: Vision Transformer model for phytoplankton image classification. Retrieved from https://huggingface.co/phytoViT_558k_Aug2025
185
+
186
+ **BibTeX:**
187
+
188
+ ```
189
+ @misc{daniel2025phytoViT,
190
+ author = {Patrick Daniel},
191
+ title = {phytoViT_558k_Aug2025: Vision Transformer model for phytoplankton image classification},
192
+ year = {2025},
193
+ howpublished = {\url{https://huggingface.co/phytoViT_558k_Aug2025}},
194
+ }
195
+ ```
196
+
197
+ ## Model Card Authors
198
+
199
+ Patrick Daniel
200
+
201
+ ## Model Card Contact
202
+
203
+ pcdaniel@ucsc.edu
app.py CHANGED
@@ -13,7 +13,7 @@ import json
13
 
14
  # Download the file from your model repo (replace with your actual token if private)
15
  model_path = hf_hub_download(
16
- repo_id="patcdaniel/phytoViT_508k_20250611",
17
  filename="model.safetensors",
18
  token=os.environ.get("HF_TOKEN") # omit this line if public
19
  )
@@ -21,12 +21,12 @@ state_dict = load_file(model_path)
21
 
22
  model = ViTForImageClassification.from_pretrained(
23
  "google/vit-base-patch16-224-in21k",
24
- num_labels=95 # this must match your training
25
  )
26
  model.load_state_dict(state_dict)
27
 
28
  model_path = hf_hub_download(
29
- repo_id="patcdaniel/phytoViT_508k_20250611",
30
  filename="label_names.json",
31
  token=os.environ.get("HF_TOKEN"),
32
  local_dir="."
 
13
 
14
  # Download the file from your model repo (replace with your actual token if private)
15
  model_path = hf_hub_download(
16
+ repo_id="patcdaniel/UCSCPhytoViT83",
17
  filename="model.safetensors",
18
  token=os.environ.get("HF_TOKEN") # omit this line if public
19
  )
 
21
 
22
  model = ViTForImageClassification.from_pretrained(
23
  "google/vit-base-patch16-224-in21k",
24
+ num_labels=83 # this must match your training
25
  )
26
  model.load_state_dict(state_dict)
27
 
28
  model_path = hf_hub_download(
29
+ repo_id="patcdaniel/UCSCPhytoViT83",
30
  filename="label_names.json",
31
  token=os.environ.get("HF_TOKEN"),
32
  local_dir="."
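The `num_labels` comment in this hunk can be turned into an explicit check. The sketch below assumes `label_names.json` holds a plain JSON list of class names ordered by class index, which is not confirmed by this diff; `build_label_maps` is a hypothetical helper.

```python
import json


def build_label_maps(label_names, num_labels):
    """Check the label count against the classifier head and build index maps."""
    if len(label_names) != num_labels:
        raise ValueError(
            f"label file has {len(label_names)} names but the model expects {num_labels}"
        )
    id2label = dict(enumerate(label_names))
    label2id = {name: idx for idx, name in id2label.items()}
    return id2label, label2id


# Stand-in for the contents of label_names.json (assumed format, three names only).
names = json.loads('["Akashiwo", "Alexandrium", "Amylax_Gonyaulax_Protoceratium"]')
id2label, label2id = build_label_maps(names, num_labels=3)
print(id2label)
```

Failing early on a count mismatch is easier to diagnose than a shape error raised later by `load_state_dict`.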