# Introduction
This repository contains eight WideResNet-101-2 models trained by the Dal (Dalhousie University) team for the FathomNet 2025 competition; predictions from these models achieved 3rd place.
These models were trained with distinct random seeds and are intended to be used as an ensemble.
Each model's folder contains the checkpoint file (model weights), the predictions on the competition test dataset, and recorded training information.
The overall process is an iterative self-training pipeline, of which these models are the 21st iteration.
# Intended Use
These models classify underwater imagery spanning the 79 leaf nodes of the FathomNet 2025 competition taxonomy.
Each model in the ensemble has 100 classification heads, all of which make predictions on the data.
Confidence is then calculated from the predicted probability distribution across these 100 heads, in an effort to capture epistemic uncertainty.
The ensemble prediction set is then generated by taking the mode of the predictions across the eight component models, with ties broken by average confidence.

Further details on these models will be provided, along with a link to our GitHub code, when our report is finalized.
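As a minimal sketch of the voting scheme described above (the function names, and the agreement-based confidence formula in particular, are our own illustrative choices, not taken from the released code):

```python
import numpy as np

def head_confidence(head_preds):
    """One plausible confidence measure: the fraction of a model's heads
    that agree with the modal (most common) predicted class."""
    head_preds = np.asarray(head_preds)
    _, counts = np.unique(head_preds, return_counts=True)
    return counts.max() / len(head_preds)

def ensemble_vote(predictions, confidences):
    """Majority vote across models, with ties broken by average confidence.

    predictions: per-model predicted class ids for one sample.
    confidences: per-model confidence scores for that sample.
    """
    predictions = np.asarray(predictions)
    confidences = np.asarray(confidences, dtype=float)
    classes, counts = np.unique(predictions, return_counts=True)
    tied = classes[counts == counts.max()]
    if len(tied) == 1:
        return int(tied[0])
    # Tie-break: mean confidence among the models that voted for each tied class.
    mean_conf = [confidences[predictions == c].mean() for c in tied]
    return int(tied[int(np.argmax(mean_conf))])
```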
# Factors
Two main strategies proved effective in our experiments.
We used a hierarchical distance-weighted variant of cross-entropy loss, combined with a self-training process in which later training iterations learned from confident pseudo-labels on the test data produced by earlier generations of models.
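The exact weighting scheme is not specified here, but one common way to make cross-entropy hierarchy-aware is to add an expected hop-distance penalty, given a precomputed class-to-class distance matrix. A NumPy sketch under that assumption:

```python
import numpy as np

def hierarchical_ce(logits, targets, dist_matrix, alpha=0.1):
    """Cross-entropy plus an expected hierarchical-distance penalty.

    dist_matrix[i, j] holds the hop distance between classes i and j in the
    taxonomy tree; alpha controls how strongly distant mistakes are punished.
    (Illustrative formulation, not the team's exact loss.)
    """
    z = logits - logits.max(axis=1, keepdims=True)            # stable softmax
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    n = len(targets)
    ce = -np.log(probs[np.arange(n), targets]).mean()
    # Expected hop distance of the predicted distribution from the target class.
    expected = (probs * dist_matrix[targets]).sum(axis=1).mean()
    return ce + alpha * expected
```

Under this formulation, placing probability mass on a taxonomically distant class costs more than placing it on a sibling, even when the standard cross-entropy term is identical.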
# Metrics
While we used accuracy internally, the evaluated metric is hierarchical distance (based on the number of hops from the ground-truth annotation in a hierarchical tree).
We implemented and used both metrics in our experiments.
Ensemble iteration 21 attained a public distance (competition public leaderboard) score of 2.27 and a private distance (competition evaluation leaderboard) score of 1.83.
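The hop-based distance can be computed from a child-to-parent map of the taxonomy tree; a minimal sketch (the `parent` map and node names below are illustrative):

```python
def tree_distance(a, b, parent):
    """Number of edges between nodes a and b in a tree.

    parent: dict mapping each node to its parent (root maps to None).
    """
    def ancestors(n):
        # Path from n up to the root, inclusive.
        path = [n]
        while parent[n] is not None:
            n = parent[n]
            path.append(n)
        return path

    depth = {n: i for i, n in enumerate(ancestors(a))}
    for hops_b, n in enumerate(ancestors(b)):
        if n in depth:                       # lowest common ancestor
            return depth[n] + hops_b
    raise ValueError("nodes are in different trees")
```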
# Training and Evaluation Data
We used both of these metrics when tuning hyperparameters, with a randomly split validation set taken from the training data (typically about 20% of it).
Once the optimal hyperparameters were determined, we trained on the full training dataset to produce test predictions for submission.
Over successive self-training generations, increasingly confident pseudo-labelled test samples were incrementally added to the training dataset for later models.
Self-training performed in this fashion does not require ground-truth test annotations and can be applied to any downstream dataset of interest.
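The pseudo-label selection step can be sketched as follows; the confidence threshold and schedule actually used are not specified here, so `threshold=0.95` is purely an illustrative choice:

```python
def select_pseudo_labels(preds, confs, threshold=0.95):
    """Pick test samples whose ensemble confidence clears a threshold.

    preds: predicted class id per test sample.
    confs: ensemble confidence per test sample.
    Returns (sample_index, pseudo_label) pairs to append to the training set
    for the next self-training generation.
    """
    return [(i, p) for i, (p, c) in enumerate(zip(preds, confs)) if c >= threshold]
```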
# Deployment
The models can be loaded and examined with PyTorch, and were implemented in a fairly standard way using the library.
The recommended resize setting is 112 px, as this is what we used to train the models.
We recommend the standard ImageNet normalization values, as these models start from Torchvision's ImageNet pre-training.
Additional information and code will be released in the near future along with updates to this model card.