---
library_name: transformers
license: creativeml-openrail-m
base_model:
- facebook/detr-resnet-50-panoptic
datasets:
- FriedParrot/a-large-scale-fish-dataset
language:
- en
---

# Model Card for Fish Segmentation (Fine-Tuned DETR)

This is a **fine-tuned DETR model (`facebook/detr-resnet-50-panoptic`)** adapted for **fish detection and segmentation**.
The model performs **multi-task prediction** including:

* **Classification** (fish species recognition)
* **Bounding Box prediction**
* **Segmentation masks**
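
To make the bounding-box output concrete: DETR-style models emit raw boxes as normalized `(center_x, center_y, width, height)` values in the range 0 to 1. The sketch below converts one such box to pixel corner coordinates. The helper name `cxcywh_to_xyxy` is hypothetical and not part of this repository; in practice the processor's post-processing step normally performs this conversion.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]


def cxcywh_to_xyxy(box: Box, image_width: int, image_height: int) -> Box:
    """Convert a normalized (cx, cy, w, h) box to pixel (x_min, y_min, x_max, y_max).

    Hypothetical helper for illustration; it mirrors the box format DETR uses
    for its raw predictions.
    """
    cx, cy, w, h = box
    x_min = (cx - w / 2) * image_width
    y_min = (cy - h / 2) * image_height
    x_max = (cx + w / 2) * image_width
    y_max = (cy + h / 2) * image_height
    return (x_min, y_min, x_max, y_max)
```

For a box centered in a 640x480 image and covering half of each dimension, the helper returns the corners `(160.0, 120.0, 480.0, 360.0)`.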

It has **42.9M parameters** and was trained on **[A Large Scale Fish Dataset](https://www.kaggle.com/datasets/crowww/a-large-scale-fish-dataset)** from Kaggle.

A copy of this dataset is available on Hugging Face [here](https://huggingface.co/datasets/FriedParrot/a-large-scale-fish-dataset).

## Model Sources

* **Base model**: [facebook/detr-resnet-50-panoptic](https://huggingface.co/facebook/detr-resnet-50-panoptic)
* **Fine-tuned model**: [FriedParrot/fish-segmentation-simple](https://huggingface.co/FriedParrot/fish-segmentation-simple)
* **Training dataset**: [A Large Scale Fish Dataset](https://www.kaggle.com/datasets/crowww/a-large-scale-fish-dataset)
* **Source code & tutorials**: [GitHub Repository](https://github.com/FRIEDparrot/fish-segmentation)


> [!note]
> This model is fully compatible with `AutoModelForObjectDetection`, `AutoProcessor`, and the Hugging Face `Trainer`.
> Unlike the first model (`fish-segmentation-model`), this one does **not** require custom config classes.
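
As a minimal sketch of what that compatibility could look like in practice, the function below loads the checkpoint with the two Auto classes named in the note and runs detection on one image. The function name `detect_fish`, the 0.5 score threshold, and the image path passed by the caller are assumptions made for illustration; the sketch returns labels, scores, and boxes only, and does not cover segmentation masks.

```python
MODEL_ID = "FriedParrot/fish-segmentation-simple"


def detect_fish(image_path: str, threshold: float = 0.5):
    """Run detection on one image and return a list of label/score/box dicts.

    Hypothetical helper; the threshold default is an assumption, not a value
    taken from the model card.
    """
    # Heavy dependencies are imported lazily so the module imports cheaply.
    import torch
    from PIL import Image
    from transformers import AutoModelForObjectDetection, AutoProcessor

    processor = AutoProcessor.from_pretrained(MODEL_ID)
    model = AutoModelForObjectDetection.from_pretrained(MODEL_ID)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # PIL reports (width, height); post-processing expects (height, width).
    target_sizes = torch.tensor([image.size[::-1]])
    result = processor.post_process_object_detection(
        outputs, threshold=threshold, target_sizes=target_sizes
    )[0]

    return [
        {
            "label": model.config.id2label[label.item()],
            "score": score.item(),
            "box": box.tolist(),  # pixel (x_min, y_min, x_max, y_max)
        }
        for score, label, box in zip(
            result["scores"], result["labels"], result["boxes"]
        )
    ]
```

Calling `detect_fish("fish.jpg")` with a local image (the file name is a placeholder) downloads the checkpoint on first use and returns one dictionary per detection above the threshold.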

## Training Details

* **Hardware**: NVIDIA RTX 4090 (48GB VRAM)
* **CUDA**: 12.8
* **Framework**: PyTorch + Hugging Face Transformers
* **Batch size**: 8 (training)
* **Training strategy**: Direct fine-tuning of DETR with minimal modifications
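
The batch size above could be expressed for the Hugging Face `Trainer` roughly as follows. This is a sketch under stated assumptions, not the author's training script: only the batch size of 8 comes from this card, while the function name `build_training_args`, the output directory, and `remove_unused_columns=False` (commonly needed so image and annotation columns reach a custom collate function) are illustrative choices. All other hyperparameters are left at library defaults because the card does not state them.

```python
TRAIN_BATCH_SIZE = 8  # the only value here taken from the model card


def build_training_args(output_dir: str = "fish-detr-finetune"):
    """Build TrainingArguments for a DETR fine-tuning run.

    Hypothetical helper; output_dir is a placeholder and every setting other
    than the batch size is an assumption or a library default.
    """
    # Imported lazily so that defining the helper does not require transformers.
    from transformers import TrainingArguments

    return TrainingArguments(
        output_dir=output_dir,
        per_device_train_batch_size=TRAIN_BATCH_SIZE,
        # Assumption: keep dataset columns the model's forward() does not name,
        # so a custom data collator can still access images and annotations.
        remove_unused_columns=False,
    )
```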

## Results & Example Predictions

Because the model is fine-tuned from a pretrained DETR checkpoint, accuracy is high: classification accuracy reaches close to 100%.

The predicted bounding boxes and masks are also accurate:

![img](https://cdn-uploads.huggingface.co/production/uploads/67f350ddc96df22f6bf879ac/DN8Uyzn-LJeAVl6433zgO.png)