patcdaniel committed on
Commit a9ed331 · verified · 1 Parent(s): 6f07a81

Update README.md

Files changed (1)
  1. README.md +10 -46
README.md CHANGED
@@ -35,11 +35,9 @@ This model was designed and trained to work with IFCB data generated in Monterey

 Independent model validation should be used when applying the model to other sites.

- Review section 4.2 of the [model cards paper](https://arxiv.org/abs/1810.03993).
-
 ### Primary intended uses

- Generalized phytoplankton classifier for common taxa found in the Monterey Bay. This

 ### Primary intended users

@@ -57,68 +55,34 @@ Model classes were chosen based on common and resolvable phytoplankton taxa. Tax

 Model was trained on images from Imaging FlowCytobot (IFCB) instruments primarily deployed at the Santa Cruz Wharf and the Monterey Bay Aquarium Research Institute (MBARI) Power Buoy. The Santa Cruz Wharf IFCB (#104) is an early generation

- Review section 4.3 of the [model cards paper](https://arxiv.org/abs/1810.03993).
-
- ### Relevant factors
-
- ### Evaluation factors

 ## Metrics

- _The appropriate metrics to feature in a model card depend on the type of model that is being tested.
- For example, classification systems in which the primary output is a class label differ significantly
- from systems whose primary output is a score. In all cases, the reported metrics should be determined
- based on the model’s structure and intended use._
-
- Review section 4.4 of the [model cards paper](https://arxiv.org/abs/1810.03993).

 ### Model performance measures
-
- ### Decision thresholds

 ### Approaches to uncertainty and variability

- ## Evaluation data
-
- _All referenced datasets would ideally point to any set of documents that provide visibility into the
- source and composition of the dataset. Evaluation datasets should include datasets that are publicly
- available for third-party use. These could be existing datasets or new ones provided alongside the model
- card analyses to enable further benchmarking._
-
- Review section 4.5 of the [model cards paper](https://arxiv.org/abs/1810.03993).
-
- ### Datasets
-
- ### Motivation
-
- ### Preprocessing

 ## Training data

- Review section 4.6 of the [model cards paper](https://arxiv.org/abs/1810.03993).
-
- ## Quantitative analyses
-
- _Quantitative analyses should be disaggregated, that is, broken down by the chosen factors. Quantitative
- analyses should provide the results of evaluating the model according to the chosen metrics, providing
- confidence interval values when possible._

- Review section 4.7 of the [model cards paper](https://arxiv.org/abs/1810.03993).

- ### Unitary results

- ### Intersectional result

- ## Ethical considerations

- None

- ### Data

- ### Use cases

- ## Caveats and recommendations

- _This section should list additional concerns that were not covered in the previous sections._

 Review section 4.9 of the [model cards paper](https://arxiv.org/abs/1810.03993).
 

 Independent model validation should be used when applying the model to other sites.

 ### Primary intended uses

+ Generalized micro-phytoplankton classifier for common taxa found in the Monterey Bay.

 ### Primary intended users

 

 Model was trained on images from Imaging FlowCytobot (IFCB) instruments primarily deployed at the Santa Cruz Wharf and the Monterey Bay Aquarium Research Institute (MBARI) Power Buoy. The Santa Cruz Wharf IFCB (#104) is an early generation

 ## Metrics

+ _Deployed model performance will vary with the natural variability in the observed phytoplankton communities over different time scales (seasonality). As such, model performance should be evaluated throughout IFCB deployments using independently labeled images._

 ### Model performance measures
+ Training model performance was evaluated on a validation set held back from the training data. F1-scores were calculated for each class. [See results here](https://stage-habdac-streamlit.srv.axds.co/Model_Metrics)
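The per-class F1 evaluation described above can be sketched in plain Python. This is a minimal illustration only: the taxa names and label arrays below are hypothetical placeholders, not the model's actual classes or validation results.

```python
# Minimal sketch of per-class (one-vs-rest) F1 on a held-back validation set.
# The class names and labels are hypothetical, not the model's actual taxa.

def per_class_f1(y_true, y_pred, classes):
    """Return one F1 score per class, computed one-vs-rest."""
    scores = {}
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores

# Stand-ins for the held-back validation labels and model predictions.
y_true = ["diatom", "diatom", "dinoflagellate", "detritus", "diatom"]
y_pred = ["diatom", "detritus", "dinoflagellate", "detritus", "diatom"]

scores = per_class_f1(y_true, y_pred, ["diatom", "dinoflagellate", "detritus"])
# e.g. scores["diatom"] == 0.8 (precision 1.0, recall 2/3)
```

In practice the same per-class breakdown is what `sklearn.metrics.f1_score(..., average=None)` returns.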
 
 ### Approaches to uncertainty and variability

+ Uncertainty is addressed by applying a set of class-specific thresholds for each prediction. This works reasonably well for out-of-distribution images.
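Class-specific thresholding of this kind can be sketched as follows. The threshold values and class names here are illustrative assumptions, since the model card does not list the actual thresholds the classifier uses.

```python
# Sketch of class-specific confidence thresholds. The values and class names
# below are hypothetical; the actual per-class thresholds are not published here.
THRESHOLDS = {"diatom": 0.60, "dinoflagellate": 0.75, "detritus": 0.50}

def apply_threshold(probs, thresholds, fallback="unclassified"):
    """Accept the top-scoring class only if it clears its own threshold;
    otherwise return a fallback label instead of forcing a prediction."""
    label = max(probs, key=probs.get)
    return label if probs[label] >= thresholds[label] else fallback

# A confident prediction clears its class threshold and is kept.
print(apply_threshold({"diatom": 0.70, "dinoflagellate": 0.20, "detritus": 0.10}, THRESHOLDS))
# -> diatom

# A low-confidence (e.g. out-of-distribution) image falls back to "unclassified".
print(apply_threshold({"diatom": 0.40, "dinoflagellate": 0.35, "detritus": 0.25}, THRESHOLDS))
# -> unclassified
```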

 ## Training data

+ To Be Described

+ ## Ethical considerations

+ None

+ ## Caveats and recommendations

+ This model was developed as an iteration of previous classification efforts and as such is subject to a history of decision making that is not captured here. For that reason, this classifier is not a panacea for all phytoplankton image data; it was developed specifically for examining phytoplankton communities in Monterey Bay.

+ IFCB-collected data are very context specific and subject to both observation configurations and small-scale variability.

 Review section 4.9 of the [model cards paper](https://arxiv.org/abs/1810.03993).