metadata
library_name: pytorch
tags:
  - computer-vision
  - object-detection
  - image-classification
  - yolov7
  - marine-biology
pipeline_tag: object-detection
model-index:
  - name: YOLOv7 Baselines on N-MARINE
    results:
      - task:
          type: object-detection
          name: Object Detection
        dataset:
          type: other
          name: N-MARINE
          url: >-
            https://open.canada.ca/data/en/dataset/2ae46860-f82a-4127-bb1f-b02e36ef6a70
          split: test
          citation: >
            Morris, C. J., Ayyagari, K. D., Porter, D., Nguyen, Q. K., Hanlon,
            J., & Whidden, C. (2025).

            *Newfoundland Marine Refuge Fish Classification Dataset (N-Marine)*.

            Government of Canada Open Data Portal.

            https://open.canada.ca/data/en/dataset/2ae46860-f82a-4127-bb1f-b02e36ef6a70
        metrics:
          - name: mAP@0.5
            type: mAP
            value: 0.808
          - name: mAP@0.5:0.95
            type: mAP
            value: 0.494
          - name: precision
            type: precision
            value: 0.807
          - name: recall
            type: recall
            value: 0.764

YOLOv7 Baselines for N-MARINE

This repository hosts baseline YOLOv7 object-detection models trained on the N-MARINE dataset (North Atlantic underwater images covering 9 species classes plus background).

  • Best baseline (no class weights)
    mAP@0.5 0.808 ± 0.007 · mAP@[0.5:0.95] 0.494 ± 0.008 · P 0.807 ± 0.036 · R 0.764 ± 0.014
  • Paper: TODO – add link when available

Dataset: N-MARINE (https://open.canada.ca/data/en/dataset/2ae46860-f82a-4127-bb1f-b02e36ef6a70)
Supplementary + scripts: https://github.com/Pentaerythrittetranitrat/N-MARINE_dataset_supplementary

Model list

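The released models follow the directory structure below, where each datasetX (X = 1–5) corresponds to one of the five cross-validation splits. The root folder name is taken from the example path under Quick inference:

n-marine_weights/
  dataset1/
    classweights/best.pt
    no_classweights/best.pt
  dataset2/ to dataset4/ (same layout)
  dataset5/
    classweights/best.pt
    no_classweights/best.pt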

Each datasetX directory contains two trained models:

  • classweights/best.pt — trained with inverse-frequency class weights.
  • no_classweights/best.pt — trained without class weights (baseline, recommended).
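A small helper can make it harder to mix up splits and variants when scripting evaluations. The sketch below assumes the root folder is named `n-marine_weights`, as in the example path under Quick inference; the function name `weights_path` is hypothetical and not part of this repository.

```python
from pathlib import Path

# Assumption: root folder name as it appears in the example inference path.
WEIGHTS_ROOT = Path("n-marine_weights")
VARIANTS = ("classweights", "no_classweights")


def weights_path(split: int, variant: str = "no_classweights",
                 root: Path = WEIGHTS_ROOT) -> Path:
    """Return the checkpoint path for one cross-validation split and variant."""
    if split not in range(1, 6):
        raise ValueError(f"split must be 1-5, got {split}")
    if variant not in VARIANTS:
        raise ValueError(f"variant must be one of {VARIANTS}, got {variant!r}")
    return root / f"dataset{split}" / variant / "best.pt"


if __name__ == "__main__":
    # Recommended baseline on the most balanced split.
    print(weights_path(5).as_posix())
```

Passing a different `root` (for example, the absolute path to a local clone) keeps the rest of the layout unchanged.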

The dataset splits used for training can be found in the supplementary repository:
https://github.com/Pentaerythrittetranitrat/N-MARINE_dataset_supplementary

Note: The models under dataset5 correspond to the most balanced split across classes.

Each model outputs 9 classes. The class order (index → class) for all models is:

Index  Class name
0      Atlantic Cod
1      Roughhead Grenadier
2      Atlantic Halibut
3      Redfish Mentella
4      Thorny Skate
5      Striped Wolffish
6      Spinytail Skate
7      Whelk
8      Northern Wolffish
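For post-processing scripts, the same index-to-class mapping can be kept as a list so that predicted class indices resolve to names. This is a minimal sketch; the names `CLASS_NAMES` and `class_name` are illustrative and not defined anywhere in this repository.

```python
# Class order copied from the table above (index -> class name).
CLASS_NAMES = [
    "Atlantic Cod",
    "Roughhead Grenadier",
    "Atlantic Halibut",
    "Redfish Mentella",
    "Thorny Skate",
    "Striped Wolffish",
    "Spinytail Skate",
    "Whelk",
    "Northern Wolffish",
]


def class_name(index: int) -> str:
    """Map a predicted class index to its name, rejecting out-of-range values."""
    if not 0 <= index < len(CLASS_NAMES):
        raise IndexError(f"class index must be 0-{len(CLASS_NAMES) - 1}, got {index}")
    return CLASS_NAMES[index]


if __name__ == "__main__":
    for i, name in enumerate(CLASS_NAMES):
        print(i, name)
```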

Intended use

  • Benchmarking object detection on North Atlantic underwater imagery
  • Studying class imbalance, visibility limits (turbidity/occlusion), and domain shifts
  • Generating crops for downstream species classification tasks

Training data and splits

  • Data: N-MARINE (23,936 images, 9 species + background)
  • Split protocol: a fixed 15% test set, split at the video level; 5-fold cross-validation within the remaining training videos
  • Pretraining: initialized from YOLOv7 weights pretrained on COCO
  • Image size: 640×640 letterboxed
  • Epochs: 50
  • Batch size: 32
  • Other: default YOLOv7 augmentations & hyperparams unless noted
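To illustrate what a video-level protocol means in practice, the sketch below holds out whole videos for testing and then builds five folds from the remaining videos, so frames from one video never appear on both sides of a split. It is only an illustration under assumptions: it takes 15% of the videos (the card does not say whether the 15% is counted in videos or images), and the function name, seed, and shuffling scheme are hypothetical. The actual splits used for the released models are the ones in the supplementary repository.

```python
import random


def video_level_splits(video_ids, test_fraction=0.15, n_folds=5, seed=0):
    """Hold out a fixed test set of videos, then build k folds over the rest."""
    videos = sorted(set(video_ids))
    rng = random.Random(seed)  # fixed seed so the test set stays fixed
    rng.shuffle(videos)

    n_test = max(1, round(test_fraction * len(videos)))
    test = videos[:n_test]
    remaining = videos[n_test:]

    folds = []
    for k in range(n_folds):
        val = remaining[k::n_folds]  # every n_folds-th video, offset by k
        val_set = set(val)
        train = [v for v in remaining if v not in val_set]
        folds.append((train, val))
    return test, folds


if __name__ == "__main__":
    # Hypothetical video identifiers, for demonstration only.
    demo_videos = [f"video_{i:02d}" for i in range(40)]
    test, folds = video_level_splits(demo_videos)
    print(len(test), [len(val) for _, val in folds])
```

Splitting by video rather than by frame avoids near-duplicate frames leaking between training and evaluation data.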

Class weights variant

Training with inverse-frequency class weights slightly improved results on Spinytail Skate but reduced aggregate mAP, which is why the unweighted models are the recommended baseline.
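As a rough illustration of inverse-frequency weighting, the sketch below uses one common formulation, w_c = N / (K · n_c), where N is the total number of labelled instances, K the number of classes, and n_c the count for class c. This is an assumption: the exact formula and normalization used to train the `classweights` models are not stated here, and the counts in the example are made up.

```python
def inverse_frequency_weights(counts):
    """Compute w_c = N / (K * n_c) for each class from instance counts."""
    if any(n <= 0 for n in counts.values()):
        raise ValueError("every class needs a positive instance count")
    total = sum(counts.values())
    num_classes = len(counts)
    return {name: total / (num_classes * n) for name, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical counts, not taken from N-MARINE.
    demo_counts = {"common species": 900, "rare species": 100}
    for name, weight in inverse_frequency_weights(demo_counts).items():
        print(f"{name}: {weight:.3f}")
```

Under this formulation a perfectly balanced dataset gives every class a weight of 1, and rarer classes receive proportionally larger weights.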

Quick inference

These weights are YOLOv7-format PyTorch checkpoints. Use the YOLOv7 repository or a compatible runner.

CLI (YOLOv7)

# 1) Clone YOLOv7 (example URL; use the YOLOv7 repository version that matches these checkpoints)
git clone https://github.com/WongKinYiu/yolov7.git
cd yolov7
pip install -r requirements.txt

# 2) Run inference
python detect.py \
  --weights /path/to/N-MARINE_baseline_classifiers/n-marine_weights/dataset5/no_classweights/best.pt \
  --source /path/to/images_or_video \
  --img-size 640 \
  --conf-thres 0.25 \
  --iou-thres 0.65 \
  --save-txt --save-conf
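With `--save-txt --save-conf`, YOLOv7's `detect.py` typically writes one text file per image, with one detection per line in the form `class x_center y_center width height confidence`, where the box values are normalized to the image size. The sketch below assumes that format and converts a line into a pixel box, for example to cut crops for downstream species classification. The names `Detection` and `parse_label_line` are hypothetical helpers, not part of YOLOv7.

```python
from typing import NamedTuple, Optional, Tuple


class Detection(NamedTuple):
    class_index: int
    confidence: Optional[float]      # None if the file was saved without --save-conf
    box: Tuple[int, int, int, int]   # (left, top, right, bottom) in pixels


def parse_label_line(line: str, image_width: int, image_height: int) -> Detection:
    """Parse one 'cls xc yc w h [conf]' line with normalized coordinates."""
    parts = line.split()
    if len(parts) not in (5, 6):
        raise ValueError(f"expected 5 or 6 fields, got {len(parts)}: {line!r}")
    class_index = int(parts[0])
    xc, yc, w, h = (float(p) for p in parts[1:5])
    confidence = float(parts[5]) if len(parts) == 6 else None

    # Convert the normalized centre/size box to pixel corner coordinates.
    left = round((xc - w / 2) * image_width)
    top = round((yc - h / 2) * image_height)
    right = round((xc + w / 2) * image_width)
    bottom = round((yc + h / 2) * image_height)
    return Detection(class_index, confidence, (left, top, right, bottom))


if __name__ == "__main__":
    # Made-up example line for a 640x480 image.
    print(parse_label_line("6 0.5 0.5 0.25 0.5 0.9", 640, 480))
```

The resulting `(left, top, right, bottom)` tuple matches the box convention used by Pillow's `Image.crop`, and `class_index` can be resolved with the class order listed earlier in this card.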

Citation

If you use the dataset, please cite:

Dataset citation (plain text): Morris, C. J., Ayyagari, K. D., Porter, D., Nguyen, Q. K., Hanlon, J., & Whidden, C. (2025). Newfoundland Marine Refuge Fish Classification Dataset (N-Marine). Government of Canada Open Data Portal. https://open.canada.ca/data/en/dataset/2ae46860-f82a-4127-bb1f-b02e36ef6a70

If you use the models, please cite:

Model citation (plain text): TODO: Add