Create README.md
README.md
ADDED
@@ -0,0 +1,79 @@
| 1 |
+
---
|
| 2 |
+
license: apache-2.0
|
| 3 |
+
tags:
|
| 4 |
+
- computer_vision
|
| 5 |
+
- pose_estimation
|
| 6 |
+
---
|
| 7 |
+
|
| 8 |
+
Copyright 2021-2023 by Mackenzie Mathis, Alexander Mathis, Shaokai Ye and contributors. All rights reserved.
|
| 9 |
+
|
| 10 |
+
|
| 11 |
+
- Please cite **Ye et al. 2023** if you use this model in your work: https://arxiv.org/abs/2203.07436v1
|
| 12 |
+
- If this license is not suitable for your business or project,
|
| 13 |
+
please contact EPFL-TTO (https://tto.epfl.ch/) for a full commercial license.
|
| 14 |
+
|
| 15 |
+
This software may not be used to harm any animal deliberately!
|
| 16 |
+
|
| 17 |
+
|
| 18 |
+
**MODEL CARD:**
|
| 19 |
+
|
| 20 |
+
This model was trained on a dataset called "Quadruped-40K". It was trained in PyTorch within a modified [mmpose framework](https://github.com/open-mmlab/mmpose), available within the [DeepLabCut framework](https://www.deeplabcut.org).
|
| 21 |
+
Full training details can be found in Ye et al. 2023, but in brief, this model was trained with **HRNet**. Another version is available directly within the TensorFlow version of DeepLabCut: https://huggingface.co/mwmathis/DeepLabCutModelZoo-SuperAnimal-Quadruped.
|
| 22 |
+
|
| 23 |
+
|
| 24 |
+
**Training Data:**
|
| 25 |
+
|
| 26 |
+
The model was trained jointly on the following datasets:
|
| 27 |
+
|
| 28 |
+
- **AwA-Pose** Quadruped dataset, see full details at (1).
|
| 29 |
+
- **AnimalPose** See full details at (2).
|
| 30 |
+
- **AcinoSet** See full details at (3).
|
| 31 |
+
- **Horse-30** The benchmark task for this dataset is called Horse-10. See full details at (4).
|
| 32 |
+
- **StanfordDogs** See full details at (5, 6).
|
| 33 |
+
- **AP-10K** See full details at (7).
|
| 34 |
+
- **iRodent** (https://zenodo.org/record/8250392) We used the iNaturalist API functions to scrape observations
|
| 35 |
+
with the taxon ID of Suborder Myomorpha (8). These functions allowed us to filter the large number of observations down to the
|
| 36 |
+
ones with photos under the CC BY-NC Creative Commons license. The most common types of rodents among the collected observations are
|
| 37 |
+
Muskrat (Ondatra zibethicus), Brown Rat (Rattus norvegicus), House Mouse (Mus musculus), Black Rat (Rattus rattus), Hispid
|
| 38 |
+
Cotton Rat (Sigmodon hispidus), Meadow Vole (Microtus pennsylvanicus), Bank Vole (Clethrionomys glareolus), Deer Mouse
|
| 39 |
+
(Peromyscus maniculatus), White-footed Mouse (Peromyscus leucopus), Striped Field Mouse (Apodemus agrarius). We then
|
| 40 |
+
generated segmentation masks over target animals in the data by processing the media through an algorithm we designed that
|
| 41 |
+
uses a Mask Region-based Convolutional Neural Network (Mask R-CNN) (9) model with a ResNet-50-FPN backbone (10),
|
| 42 |
+
pretrained on the COCO dataset (11). The 443 processed images were then manually labeled with both pose annotations and
|
| 43 |
+
segmentation masks.
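
As a rough illustration of the filtering step described for iRodent, the sketch below builds a query URL for the public iNaturalist API that restricts observations to one taxon and to photos under a CC BY-NC license. This is an assumption-laden example, not the authors' scraping code: the helper name `build_observation_query` and the taxon ID value are hypothetical, and the parameter names (`taxon_id`, `photos`, `photo_license`, `per_page`, `page`) reflect our reading of the iNaturalist API v1 and should be checked against its documentation.

```python
from urllib.parse import urlencode

# Public iNaturalist API v1 observations endpoint.
INAT_OBSERVATIONS_URL = "https://api.inaturalist.org/v1/observations"


def build_observation_query(taxon_id, photo_license="cc-by-nc", per_page=200, page=1):
    """Return a URL requesting photo-bearing, licensed observations for one taxon."""
    params = {
        "taxon_id": taxon_id,            # restrict to one taxon and its descendants
        "photos": "true",                # only observations that have photos
        "photo_license": photo_license,  # keep photos under the given license
        "per_page": per_page,
        "page": page,
    }
    return f"{INAT_OBSERVATIONS_URL}?{urlencode(params)}"


# HYPOTHETICAL placeholder: look up the real taxon ID for Suborder Myomorpha on iNaturalist.
EXAMPLE_TAXON_ID = 12345
url = build_observation_query(EXAMPLE_TAXON_ID)
```

The function only constructs the URL, so it can be inspected before any request is sent. Fetching and paging through the results is left to the HTTP client of your choice.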
|
| 44 |
+
|
| 45 |
+
Here is an image with the keypoint guide, the distribution of images per dataset, and examples from the datasets with predictions from a model trained on less data for benchmarking, as in Ye et al. 2023.
|
| 46 |
+
Note that the model we are releasing has comparable or higher performance.
|
| 47 |
+
|
| 48 |
+
Please note that each dataset was labeled by separate labs and separate individuals; therefore, while we map names
|
| 49 |
+
to a unified pose vocabulary, there will be annotator bias in keypoint placement (see Ye et al. 2023 for our Supplementary Note on annotator bias).
|
| 50 |
+
You will also note the dataset is highly diverse across species, but collectively has more representation of domesticated animals like dogs, cats, horses, and cattle.
|
| 51 |
+
If performance is not as good as you need it to be, we recommend that you first try video adaptation (see Ye et al. 2023),
|
| 52 |
+
or fine-tune these weights with your own labeling.
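
To make the idea of mapping dataset-specific keypoint names to a unified pose vocabulary more concrete, here is a minimal sketch. All names in it are hypothetical: the four-keypoint vocabulary, the two dataset name maps, and the helper `to_unified` are invented for illustration and do not reproduce the actual vocabulary or conversion code from Ye et al. 2023. Keypoints that a source dataset does not annotate are filled with NaN, which is one common way to mark missing labels.

```python
import math

# HYPOTHETICAL unified vocabulary and per-dataset name maps, for illustration only.
UNIFIED_KEYPOINTS = ["nose", "left_eye", "right_eye", "tail_base"]

NAME_MAPS = {
    "dataset_a": {"Nose": "nose", "L_Eye": "left_eye", "R_Eye": "right_eye"},
    "dataset_b": {"snout": "nose", "tailbase": "tail_base"},
}


def to_unified(dataset, annotation):
    """Reorder one {name: (x, y)} annotation into the unified keypoint order.

    Keypoints the source dataset does not label are filled with (NaN, NaN).
    """
    name_map = NAME_MAPS[dataset]
    unified = {name: (math.nan, math.nan) for name in UNIFIED_KEYPOINTS}
    for source_name, xy in annotation.items():
        target = name_map.get(source_name)
        if target is not None:  # ignore names with no counterpart in the vocabulary
            unified[target] = xy
    return [unified[name] for name in UNIFIED_KEYPOINTS]


row = to_unified("dataset_b", {"snout": (10.0, 20.0), "tailbase": (50.0, 60.0)})
```

Note that a shared name does not remove annotator bias: two labs may both label a "nose" yet place it at slightly different anatomical positions, which is why the caveat above still applies after mapping.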
|
| 53 |
+
|
| 54 |
+
<p align="center">
|
| 55 |
+
<img src="https://images.squarespace-cdn.com/content/v1/57f6d51c9f74566f55ecf271/1690988780004-AG00N6OU1R21MZ0AU9RE/modelcard-SAQ.png?format=1500w" width="95%">
|
| 56 |
+
</p>
|
| 57 |
+
|
| 58 |
+
|
| 59 |
+
1. Prianka Banik, Lin Li, and Xishuang Dong. A novel dataset for keypoint detection of quadruped animals from images. ArXiv, abs/2108.13958, 2021.
|
| 60 |
+
2. Jinkun Cao, Hongyang Tang, Haoshu Fang, Xiaoyong Shen, Cewu Lu, and Yu-Wing Tai. Cross-domain adaptation for animal pose estimation.
|
| 61 |
+
2019 IEEE/CVF International Conference on Computer Vision (ICCV), pages 9497–9506, 2019.
|
| 62 |
+
3. Daniel Joska, Liam Clark, Naoya Muramatsu, Ricardo Jericevich, Fred Nicolls, Alexander Mathis, Mackenzie W. Mathis, and Amir Patel. Acinoset:
|
| 63 |
+
A 3d pose estimation dataset and baseline models for cheetahs in the wild. 2021 IEEE International Conference on Robotics and Automation
|
| 64 |
+
(ICRA), pages 13901–13908, 2021.
|
| 65 |
+
4. Alexander Mathis, Thomas Biasi, Steffen Schneider, Mert Yuksekgonul, Byron Rogers, Matthias Bethge, and Mackenzie W Mathis. Pretraining
|
| 66 |
+
boosts out-of-domain robustness for pose estimation. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision,
|
| 67 |
+
pages 1859–1868, 2021.
|
| 68 |
+
5. Aditya Khosla, Nityananda Jayadevaprakash, Bangpeng Yao, and Li Fei-Fei. Novel dataset for fine-grained image categorization. In First Workshop
|
| 69 |
+
on Fine-Grained Visual Categorization, IEEE Conference on Computer Vision and Pattern Recognition, Colorado Springs, CO, June 2011.
|
| 70 |
+
6. Benjamin Biggs, Thomas Roddick, Andrew Fitzgibbon, and Roberto Cipolla. Creatures great and SMAL: Recovering the shape and motion of
|
| 71 |
+
animals from video. In Asian Conference on Computer Vision, pages 3–19. Springer, 2018.
|
| 72 |
+
7. Hang Yu, Yufei Xu, Jing Zhang, Wei Zhao, Ziyu Guan, and Dacheng Tao. Ap-10k: A benchmark for animal pose estimation in the wild. In Thirty-fifth
|
| 73 |
+
Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2), 2021.
|
| 74 |
+
8. iNaturalist. GBIF Occurrence Download. https://doi.org/10.15468/dl.p7nbxt. iNaturalist, July 2020.
|
| 75 |
+
9. Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer
|
| 76 |
+
Vision, pages 2961–2969, 2017.
|
| 77 |
+
10. Tsung-Yi Lin, Piotr Dollár, Ross Girshick, Kaiming He, Bharath Hariharan, and Serge Belongie. Feature pyramid networks for object detection, 2016.
|
| 78 |
+
11. Tsung-Yi Lin, Michael Maire, Serge J. Belongie, Lubomir D. Bourdev, Ross B. Girshick, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár,
|
| 79 |
+
and C. Lawrence Zitnick. Microsoft COCO: common objects in context. CoRR, abs/1405.0312, 2014.
|