# Introduction

This repository contains eight WideResNet-101-2 models trained by the Dal (Dalhousie University) team for the FathomNet 2025 competition; predictions from these models achieved 3rd place.
The models were trained with distinct random seeds and are intended to be used as an ensemble.
Each model's folder contains the checkpoint file (weights), the predictions on the competition test dataset, and recorded training information.
The overall process is an iterative self-training pipeline, of which these models are the 21st iteration.

# Intended Use

The purpose of these models is to classify underwater imagery spanning the 79 leaf nodes of the FathomNet 2025 hierarchy.
Each independent model in the ensemble has 100 classification heads, all of which make predictions on the data.
Confidence is then calculated from the predicted probability distribution across these 100 heads, in an effort to capture epistemic uncertainty.
The ensemble prediction set is generated by taking the mode of predictions across the eight component models, with ties broken by average confidence.
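
The voting scheme above can be sketched as follows. This is a minimal illustration, not the team's exact implementation; the function name, labels, and confidence values are made up for the example.

```python
from collections import Counter

def ensemble_vote(predictions, confidences):
    """Combine per-model predictions by majority vote (mode).

    predictions: list of class labels, one per ensemble member.
    confidences: matching list of per-model confidence scores.
    Ties are broken by the average confidence of each tied class.
    """
    counts = Counter(predictions)
    top = max(counts.values())
    tied = [label for label, c in counts.items() if c == top]
    if len(tied) == 1:
        return tied[0]

    # Tie-break: pick the tied class with the highest mean confidence.
    def mean_conf(label):
        scores = [s for p, s in zip(predictions, confidences) if p == label]
        return sum(scores) / len(scores)

    return max(tied, key=mean_conf)

# Example: "fish" and "coral" each get two votes; "coral" wins on confidence.
print(ensemble_vote(["fish", "coral", "fish", "coral"],
                    [0.60, 0.90, 0.55, 0.80]))  # -> coral
```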

Further details on these models will be provided, along with our GitHub code link, when our report is finalized.

# Factors

Two main strategies proved effective in our experimentation.
We used a hierarchical distance-weighted variant of cross-entropy loss, combined with a self-training process in which later training iterations learned from confident pseudo-labels on the test data produced by earlier generations of models.
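
One plausible form of such a distance-weighted loss is sketched below. This is an illustrative assumption about how hierarchy distances could enter the objective, not the team's exact formulation; the penalty term and distance matrix are made up for the example.

```python
import torch
import torch.nn.functional as F

def hierarchical_ce(logits, targets, dist_matrix):
    """Cross-entropy plus a hierarchical-distance penalty (illustrative sketch).

    logits: (batch, num_classes) raw model scores.
    targets: (batch,) ground-truth class indices.
    dist_matrix: (num_classes, num_classes) tree-hop distances between classes.
    Probability mass placed on classes far from the target is penalized more.
    """
    log_probs = F.log_softmax(logits, dim=1)
    probs = log_probs.exp()
    # Standard cross-entropy term.
    ce = F.nll_loss(log_probs, targets)
    # Expected hierarchical distance under the predicted distribution.
    dists = dist_matrix[targets]                     # (batch, num_classes)
    expected_dist = (probs * dists).sum(dim=1).mean()
    return ce + expected_dist
```

With a 2-class toy distance matrix, a confident correct prediction yields a lower loss than a confident wrong one, as expected.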

# Metrics

While we employed accuracy internally, the evaluated metric is hierarchical distance (based on the number of hops from the ground-truth annotation in a hierarchical tree).
We implemented and used both in our experimentation.
Ensemble iteration 21 attained a public distance score (competition public leaderboard) of 2.27 and a private distance score (competition evaluation leaderboard) of 1.83.
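
Hop distance between two nodes of a class tree can be computed as sketched below, assuming the hierarchy is given as a child-to-parent map; the exact competition scoring implementation may differ, and the toy taxonomy is made up.

```python
def path_to_root(node, parent):
    """Return the chain of nodes from `node` up to the root, inclusive."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path

def tree_distance(a, b, parent):
    """Number of edges between nodes a and b, given a child -> parent map."""
    ancestors_a = {n: i for i, n in enumerate(path_to_root(a, parent))}
    # Walk up from b until we reach a common ancestor of a and b.
    steps_b = 0
    node = b
    while node not in ancestors_a:
        node = parent[node]
        steps_b += 1
    return ancestors_a[node] + steps_b

# Toy hierarchy: root -> {fishes, inverts}; fishes -> {shark, ray}
parent = {"fishes": "root", "inverts": "root", "shark": "fishes", "ray": "fishes"}
print(tree_distance("shark", "ray", parent))      # 2 hops, via "fishes"
print(tree_distance("shark", "inverts", parent))  # 3 hops, via "root"
```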

# Training and Evaluation Data

We employed both of the above metrics to tune hyperparameters, using a validation set randomly split from the training data (typically about 20% of it).
Once we determined the optimal hyperparameters, we trained on the full training dataset to produce test predictions for submission.
Over the course of the self-training process, increasingly confident pseudo-labelled test samples were incrementally added to the training dataset for later generations of models.
Self-training performed in this fashion does not require ground-truth test annotations and may be used for any downstream dataset of interest.
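
The pseudo-label selection step can be sketched as a simple confidence threshold; the threshold value, function name, and sample data below are illustrative assumptions, not the team's actual settings.

```python
def select_pseudo_labels(samples, predictions, confidences, threshold=0.9):
    """Keep only (sample, predicted_label) pairs above a confidence threshold.

    The surviving pseudo-labelled samples can be appended to the training set
    for the next self-training iteration. Threshold 0.9 is illustrative only.
    """
    return [
        (sample, label)
        for sample, label, conf in zip(samples, predictions, confidences)
        if conf >= threshold
    ]

# Example: only the first and last test samples clear the bar.
kept = select_pseudo_labels(["img1", "img2", "img3"],
                            ["ray", "shark", "coral"],
                            [0.95, 0.40, 0.92])
print(kept)  # [('img1', 'ray'), ('img3', 'coral')]
```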

# Deployment

These models can be loaded and examined through PyTorch, and were implemented in a fairly standard way using the library.
The recommended resize setting is 112 px, as this is what we used to train the models.
For normalization, we recommend the standard ImageNet values, as these models build on Torchvision's ImageNet pre-training.

Additional information and code will be released in the near future, along with updates to this model card.