Update README.md
This model was designed and trained to work with IFCB data generated in Monterey.

Independent model validation should be used when applying the model to other sites.

### Primary intended uses

Generalized micro-phytoplankton classifier for common taxa found in the Monterey Bay.
### Primary intended users

Model classes were chosen based on common and resolvable phytoplankton taxa.

The model was trained on images from Imaging FlowCytobot (IFCB) instruments primarily deployed at the Santa Cruz Wharf and the Monterey Bay Aquarium Research Institute (MBARI) Power Buoy. The Santa Cruz Wharf IFCB (#104) is an early generation

## Metrics

_Deployed model performance will vary with the natural variability in the observed phytoplankton communities over different time scales (seasonality). As such, model performance should be evaluated throughout IFCB deployments using independently labeled images._

### Model performance measures

Training performance was evaluated using a held-back validation set. F1-scores were calculated for each class. [See results here](https://stage-habdac-streamlit.srv.axds.co/Model_Metrics)
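The per-class scoring above can be sketched in plain Python; this is a minimal illustration of one-vs-rest F1, and the taxa names are placeholders, not the model's actual class list.

```python
# Minimal sketch (not the project's actual code) of per-class F1 scoring.
from collections import Counter

def per_class_f1(y_true, y_pred):
    """Return {class: F1} computed one-vs-rest for every class seen."""
    classes = set(y_true) | set(y_pred)
    tp = Counter()  # true positives per class
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
    true_counts = Counter(y_true)   # support per class
    pred_counts = Counter(y_pred)   # predictions per class
    scores = {}
    for c in classes:
        precision = tp[c] / pred_counts[c] if pred_counts[c] else 0.0
        recall = tp[c] / true_counts[c] if true_counts[c] else 0.0
        denom = precision + recall
        scores[c] = 2 * precision * recall / denom if denom else 0.0
    return scores

# Illustrative labels only; the deployed class list differs.
truth = ["diatom", "dinoflagellate", "diatom", "ciliate", "diatom"]
preds = ["diatom", "diatom", "diatom", "ciliate", "dinoflagellate"]
print(per_class_f1(truth, preds))  # diatom ≈ 0.67, ciliate 1.0, dinoflagellate 0.0
```

In practice a library routine (e.g. scikit-learn's `f1_score` with `average=None`) computes the same per-class values.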

### Approaches to uncertainty and variability

Uncertainty is addressed by applying a set of class-specific thresholds to each prediction. This works reasonably well for out-of-distribution images.
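A minimal sketch of such threshold gating, assuming per-class confidence scores from the classifier; the class names and threshold values below are illustrative, not the deployed configuration.

```python
# Hypothetical class-specific thresholds (illustrative values only).
THRESHOLDS = {"diatom": 0.60, "dinoflagellate": 0.75, "ciliate": 0.80}

def gate_prediction(class_scores):
    """Return the top class if its score clears that class's threshold,
    otherwise 'unclassified' (e.g. for out-of-distribution images)."""
    top_class = max(class_scores, key=class_scores.get)
    if class_scores[top_class] >= THRESHOLDS[top_class]:
        return top_class
    return "unclassified"

print(gate_prediction({"diatom": 0.70, "dinoflagellate": 0.2, "ciliate": 0.1}))  # diatom
print(gate_prediction({"diatom": 0.3, "dinoflagellate": 0.5, "ciliate": 0.2}))   # unclassified
```

Per-class thresholds let rarer or easily confused taxa demand higher confidence than well-separated ones.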

## Training data

To Be Described
## Ethical considerations
None
## Caveats and recommendations
This model was developed as an iteration of previous classification efforts and, as such, is subject to a history of decision making that is not captured here. For that reason, this classifier is not a panacea for all phytoplankton image data; it was specifically developed for examining phytoplankton communities in Monterey Bay.
IFCB-collected data are very context specific and subject to both observation configurations and small-scale variability.
Review section 4.9 of the [model cards paper](https://arxiv.org/abs/1810.03993).