---
license: gpl-3.0
language:
- en
base_model:
- timm/wide_resnet101_2.tv_in1k
pipeline_tag: image-classification
---

# Introduction

This repository contains eight WideResNet-101-2 models trained by the Dal (Dalhousie University) team for the FathomNet 2025 competition; predictions from these models achieved 3rd place.

These models were trained with distinct random seeds and are intended to be used as an ensemble.

Each model's folder contains the model checkpoint file (weights), its predictions on the competition test dataset, and recorded training information.

The overall process includes an iterative self-training pipeline, of which these models are the 21st iteration.

# Intended Use

The purpose of these models is to classify underwater imagery spanning the 79 leaf nodes of the FathomNet 2025 competition taxonomy.

Each model in the ensemble has 100 classification heads, all of which are capable of making predictions on the data.

Confidence is then calculated from the predicted probability distribution across these 100 heads, in an effort to capture epistemic uncertainty.

The ensemble prediction set is then generated by taking the mode of predictions across the eight component models, with ties broken by average confidence.
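
As a rough illustration (not the competition code itself), the sketch below shows one way the per-model confidence and the mode-with-tie-break vote could be computed; the tensor shapes and helper names are assumptions.

```python
import torch
from collections import defaultdict

def model_prediction(head_logits):
    """head_logits: (100, num_classes) logits from one model's 100 heads.

    Returns the predicted class and a confidence score derived from the
    heads' averaged distribution (a proxy for epistemic uncertainty).
    """
    probs = head_logits.softmax(dim=-1)   # (100, C)
    mean_probs = probs.mean(dim=0)        # average over the heads
    pred = int(mean_probs.argmax())
    confidence = float(mean_probs[pred])  # head agreement on the winner
    return pred, confidence

def ensemble_vote(per_model):
    """per_model: list of (pred, confidence) pairs, one per ensemble member.

    Takes the mode of the predictions, breaking ties by average confidence.
    """
    votes, confs = defaultdict(int), defaultdict(list)
    for pred, conf in per_model:
        votes[pred] += 1
        confs[pred].append(conf)
    return max(votes, key=lambda c: (votes[c], sum(confs[c]) / len(confs[c])))
```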

Further details on these models will be provided along with our GitHub code link when our report is finalized.

# Factors

Two main strategies appeared effective in our experimentation.

We used a hierarchical distance-weighted variant of cross-entropy loss, and combined it with a self-training process in which later training iterations learned from confident pseudo-labels on the test data produced by earlier generations of models.
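
As a minimal sketch of what such a loss could look like (the combination weight `alpha` and the precomputed hop-distance matrix are assumptions, not the exact formulation used for these models):

```python
import torch
import torch.nn.functional as F

def hierarchical_weighted_loss(logits, targets, dist_matrix, alpha=1.0):
    """Distance-weighted cross-entropy sketch.

    dist_matrix[i, j] is a float tensor holding the hop distance between
    classes i and j in the label hierarchy (zero on the diagonal). The
    second term penalizes probability mass placed on classes far from
    the ground truth.
    """
    ce = F.cross_entropy(logits, targets)
    probs = F.softmax(logits, dim=1)                       # (B, C)
    expected_dist = (probs * dist_matrix[targets]).sum(1)  # (B,)
    return ce + alpha * expected_dist.mean()
```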

# Metrics

While we employed accuracy internally, the competition's evaluation metric is hierarchical distance (the number of hops from the ground-truth annotation in the taxonomy tree).

We implemented and used both in our experimentation.
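
For illustration, the hop distance between a predicted node and the ground-truth node can be computed from parent pointers; the `parents` dict representation here is an assumption about how the taxonomy is stored.

```python
def hop_distance(a, b, parents):
    """Number of edges between nodes a and b in a tree.

    `parents` maps each node to its parent; the root maps to None.
    """
    def chain(n):
        path = []
        while n is not None:
            path.append(n)
            n = parents.get(n)
        return path

    chain_a = chain(a)
    ancestors_a = set(chain_a)
    # Walk up from b to the lowest common ancestor, then count both legs.
    for hops_b, node in enumerate(chain(b)):
        if node in ancestors_a:
            return chain_a.index(node) + hops_b
    raise ValueError("nodes are not in the same tree")
```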

Ensemble iteration 21 attained a public hierarchical-distance score of 2.27 (competition public leaderboard) and a private score of 1.83 (competition evaluation leaderboard).

# Training and Evaluation Data

We used both of the metrics above when tuning hyperparameters, evaluating on a randomly split validation set held out from the training data (typically about 20% of it).

Once the optimal hyperparameters were determined, we trained on the full training dataset to produce test predictions for submission.
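
A hold-out of this kind could be created as follows; the dataset here is a placeholder standing in for the FathomNet training data.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Placeholder dataset: random images with labels over 79 classes.
train_dataset = TensorDataset(torch.randn(1000, 3, 112, 112),
                              torch.randint(0, 79, (1000,)))

# Hold out ~20% for validation, as described above.
val_size = int(0.2 * len(train_dataset))
train_subset, val_subset = random_split(
    train_dataset,
    [len(train_dataset) - val_size, val_size],
    generator=torch.Generator().manual_seed(0),  # the seed varies per model
)
```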

As self-training progressed, increasingly confident pseudo-labelled test samples were incrementally added to the training dataset for further generations of models.

Self-training performed in this fashion does not require ground truth for the test annotations, and may be applied to any downstream dataset of interest.
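
One self-training round might look like the sketch below, which pseudo-labels the test set with the current ensemble and keeps only high-confidence samples. The 0.9 threshold and the single-output simplification (ignoring the 100 heads) are assumptions for brevity, and the loader is assumed to yield batches of unlabeled images.

```python
import torch

@torch.no_grad()
def select_pseudo_labels(models, test_loader, threshold=0.9, device="cpu"):
    """Return the test images and predicted labels the ensemble is sure of."""
    kept_images, kept_labels = [], []
    for images in test_loader:
        images = images.to(device)
        # Average the softmax outputs of the ensemble members.
        probs = torch.stack(
            [m(images).softmax(dim=1) for m in models]
        ).mean(dim=0)
        conf, labels = probs.max(dim=1)
        mask = conf >= threshold
        kept_images.append(images[mask].cpu())
        kept_labels.append(labels[mask].cpu())
    return torch.cat(kept_images), torch.cat(kept_labels)
```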

# Deployment

These models can be loaded and examined with PyTorch, and were implemented in a fairly standard manner using the library.

We recommend resizing inputs to 112 px, as this is the resolution the models were trained at.

We recommend the standard ImageNet normalization values, as these models start from Torchvision's ImageNet pre-trained weights.
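
A minimal preprocessing and loading sketch follows; the checkpoint path is illustrative, and we assume the checkpoint file can be restored directly with `torch.load` (adjust accordingly if it stores a plain `state_dict`).

```python
import torch
from torchvision import transforms

# Preprocessing matching the recommendations above.
preprocess = transforms.Compose([
    transforms.Resize((112, 112)),  # training resolution
    transforms.ToTensor(),
    transforms.Normalize(           # standard ImageNet statistics
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225],
    ),
])

# Illustrative path; each model lives in its own folder.
model = torch.load("model_seed0/checkpoint.pth", map_location="cpu",
                   weights_only=False)  # full-object checkpoints need this on newer PyTorch
model.eval()
```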

Additional information and code will be released in the near future, along with updates to this model card.