dirtmaxim committed on
Commit
e33b847
·
verified ·
1 Parent(s): 2fda725

Update README.md

  1. README.md +136 -3
README.md CHANGED
@@ -1,3 +1,136 @@
1
- ---
2
- license: mit
3
- ---
1
+ ---
2
+ license: mit
3
+ datasets:
4
+ - imageomics/BaboonLand
5
+ language:
6
+ - en
7
+ tags:
8
+ - biology
9
+ - CV
10
+ - images
11
+ - animals
12
+ - zebra
13
+ - giraffe
14
+ - behavior
15
+ - behavior recognition
16
+ - annotation
17
+ - UAV
18
+ - drone
19
+ - video
20
+ model_description: "Behavior recognition model for in situ drone videos of baboons, built on the X3D model. It is trained on the BaboonLand mini-scene dataset, which comprises 20 hours of aerial video footage of baboons captured using a DJI Mavic 2S drone."
21
+ ---
22
+
23
+
24
+ # Model Card for x3d-BaboonLand
25
+ x3d-BaboonLand is a behavior recognition model for in situ drone videos of baboons,
26
+ built on the X3D model.
27
+ It is trained on the [BaboonLand](https://huggingface.co/datasets/imageomics/BaboonLand) dataset.
28
+ The dataset includes both spatiotemporal (i.e., mini-scenes) and behavior annotations provided by an expert
29
+ behavioral ecologist.
30
+
31
+ ## Model Details
32
+
33
+ ### Model Description
34
+
35
+ - **Developed by:** Isla Duporge, Maksim Kholiavchenko, Roi Harel, Scott Wolf, Daniel Rubenstein, Meg Crofoot, Tanya Berger-Wolf, Stephen Lee, Julie Barreau, Jenna Kline, Michelle Ramirez, Charles Stewart
36
+
37
+ - **Model type:** X3D-L
38
+ - **License:** MIT
39
+ - **Fine-tuned from model:** [X3D-L](https://github.com/facebookresearch/SlowFast/blob/main/configs/Kinetics/X3D_L.yaml)
40
+
41
+ This model was developed for the benefit of the community as an open-source product, so we request that any derivative products also be open source.
42
+
43
+ ### Model Sources
44
+
45
+ - **Repository:** [Project Repo](https://github.com/Imageomics/kabr-tools)
46
+ - **Paper:** [Paper Link](https://link.springer.com/article/10.1007/s11263-025-02493-5)
47
+ - **Project Page:** [BaboonLand Project Page](https://baboonland.xyz)
48
+
49
+ ## Uses
50
+
51
+ Baboon behavior recognition from in situ drone videos.
52
+
53
+ ### Out-of-Scope Use
54
+
55
+ This model was trained to detect and classify behavior from drone videos of baboons in Kenya. It may not perform well on other species or settings.
56
+
57
+
58
+ ## How to Get Started with the Model
59
+
60
+ Please see the illustrative examples in the [kabr-tools docs](https://imageomics.github.io/kabr-tools/)
61
+ for more information on how this model can be used.
62
+
63
+ ## Training Details
64
+
65
+ We include the configuration file ([config.yml](https://huggingface.co/imageomics/x3d-BaboonLand/blob/main/config.yml)) utilized by SlowFast for X3D model training.
66
+
67
+ ### Training Data
68
+
69
+ This model was trained on the [BaboonLand](https://huggingface.co/datasets/imageomics/BaboonLand) dataset.
70
+
71
+ #### Training Hyperparameters
72
+
73
+ The model was trained for 120 epochs, using a batch size of 5.
74
+ We used the equalization loss (EQL) to address the long-tailed class distribution, and the SGD optimizer with a learning rate of 1e5.
75
+ We used a sample rate of 16x5 and random weight initialization.
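
As a rough illustration of the sampling setting above, the sketch below reads "16x5" as 16 frames per clip taken with a temporal stride of 5. That reading is an assumption, and `clip_frame_indices` is a hypothetical helper, not part of SlowFast or kabr-tools; the linked config.yml remains the authoritative source for the actual settings.

```python
def clip_frame_indices(start, num_frames=16, stride=5, total_frames=None):
    """Return the frame indices of one clip: `num_frames` frames, `stride` apart.

    Hypothetical helper for illustration only; SlowFast uses its own decoder
    and sampling logic.
    """
    indices = [start + i * stride for i in range(num_frames)]
    if total_frames is not None:
        # Clamp to the last valid frame so short videos still yield a full clip.
        indices = [min(i, total_frames - 1) for i in indices]
    return indices


if __name__ == "__main__":
    indices = clip_frame_indices(start=0)
    print(len(indices), indices[:4], indices[-1])  # 16 [0, 5, 10, 15] 75
```

Under this reading, one clip spans 76 consecutive video frames, from index 0 to index 75.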
76
+
77
+ ## Evaluation
78
+
79
+ The model was evaluated on the BaboonLand test split using the [SlowFast](https://github.com/facebookresearch/SlowFast) framework, specifically its [test_net script](https://github.com/facebookresearch/SlowFast/blob/main/tools/test_net.py).
80
+
81
+ ### Testing Data
82
+
83
+ We provide a train-test split of the mini-scenes from the [BaboonLand](https://huggingface.co/datasets/imageomics/BaboonLand) dataset for evaluation purposes, with 75% for training and 25% for testing. No mini-scene was divided by the split.
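
The property that no mini-scene is divided by the split can be illustrated with a group-wise split. This is a minimal sketch, not the procedure used to build the published split: `split_by_mini_scene`, the seed, and the choice to apply the 75% fraction to the number of mini-scenes (rather than frames or clips) are all assumptions. For real evaluation, use the split shipped with the dataset.

```python
import random


def split_by_mini_scene(scene_ids, train_fraction=0.75, seed=0):
    """Assign whole mini-scenes to train or test so none is divided.

    `scene_ids` holds one (hypothetical) mini-scene identifier per sample.
    Returns two disjoint sets of mini-scene identifiers.
    """
    unique = sorted(set(scene_ids))
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(unique)
    n_train = round(len(unique) * train_fraction)
    return set(unique[:n_train]), set(unique[n_train:])


if __name__ == "__main__":
    samples = ["a", "a", "b", "c", "d", "d"]  # mini-scene id of each sample
    train_ids, test_ids = split_by_mini_scene(samples)
    train = [s for s in samples if s in train_ids]
    test = [s for s in samples if s in test_ids]
    print(len(train_ids), len(test_ids), len(train) + len(test))
```

Because the assignment is made per mini-scene identifier, every sample from a given mini-scene lands on the same side of the split.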
84
+
85
+ #### Metrics
86
+
87
+ We report Top-1, Top-3, and Top-5 macro-scores. For full details, please refer to the [paper](https://link.springer.com/article/10.1007/s11263-025-02493-5).
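
One common reading of a macro-averaged Top-k score is: compute Top-k accuracy separately for each class, then average over classes so that rare behaviors count as much as frequent ones. The sketch below assumes that reading; the paper's exact definition takes precedence, and `macro_top_k_accuracy` is a hypothetical helper, not a SlowFast function.

```python
from collections import defaultdict


def macro_top_k_accuracy(scores, labels, k):
    """Average per-class Top-k accuracy over the classes present in `labels`.

    `scores` is a list of per-class score lists (one per sample) and
    `labels` the matching list of integer class indices.
    """
    hits = defaultdict(int)
    counts = defaultdict(int)
    for row, label in zip(scores, labels):
        # Indices of the k highest-scoring classes for this sample.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        counts[label] += 1
        hits[label] += int(label in top_k)
    per_class = [hits[c] / counts[c] for c in counts]
    return sum(per_class) / len(per_class)


if __name__ == "__main__":
    scores = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.1, 0.2, 0.7], [0.5, 0.4, 0.1]]
    labels = [0, 0, 2, 1]
    print(macro_top_k_accuracy(scores, labels, k=1))  # about 0.667
    print(macro_top_k_accuracy(scores, labels, k=2))  # 1.0
```

In this toy example the plain (micro) Top-1 accuracy would be 3/4, while the macro score is 2/3 because the single misclassified sample belongs to a class with only one example.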
88
+
89
+ **Results**
90
+
91
+ | Weight initialization (WI) | Batch size (BS) | Top-1 | Top-3 | Top-5 |
92
+ |----------|----|----------|----------|----------|
93
+ | Random | 5 | **30.04** | **60.58** | **72.13** |
94
+
95
+
96
+ ### Model Architecture and Objective
97
+
98
+ Please see the [Base Model Description](https://arxiv.org/pdf/2004.04730).
99
+
100
+ #### Hardware
101
+
102
+ Running the X3D model requires a modern NVIDIA GPU with CUDA support. X3D-L is designed to be computationally efficient, and requires 10–16 GB of GPU memory during training.
103
+
104
+ ## Citation
105
+
106
+ **BibTeX:**
107
+
108
+ If you use our model in your work, please cite our paper.
109
+
110
+ **Paper**
111
+ ```
112
+ @article{duporge2025baboonland,
113
+ title={BaboonLand Dataset: Tracking Primates in the Wild and Automating Behaviour Recognition from Drone Videos},
114
+ author={Duporge, Isla and Kholiavchenko, Maksim and Harel, Roi and Wolf, Scott and Rubenstein, Daniel I and Crofoot, Margaret C and Berger-Wolf, Tanya and Lee, Stephen J and Barreau, Julie and Kline, Jenna and Ramirez, Michelle and Stewart, Charles},
115
+ journal={International Journal of Computer Vision},
116
+ pages={1--12},
117
+ year={2025},
118
+ publisher={Springer}
119
+ }
120
+ ```
121
+
122
+
123
+ ## Acknowledgements
124
+
125
+ This work was supported by the [Imageomics Institute](https://imageomics.org), which is funded by the US National Science Foundation's Harnessing the Data Revolution (HDR) program under [Award #2118240](https://www.nsf.gov/awardsearch/showAward?AWD_ID=2118240) (Imageomics: A New Frontier of Biological Information Powered by Knowledge-Guided Machine Learning). Additional support was provided by the [AI Institute for Intelligent Cyberinfrastructure with Computational Learning in the Environment (ICICLE)](https://icicle.osu.edu/), which is funded by the US National Science Foundation under [Award #2112606](https://www.nsf.gov/awardsearch/showAward?AWD_ID=2112606). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
126
+
127
+ The data was gathered at the [Mpala Research Centre](https://mpala.org/) in Kenya, in accordance with Research License No. NACOSTI/P/22/18214. The data collection protocol adhered strictly to the guidelines set forth by the Institutional Animal Care and Use Committee under permission No. IACUC 1835F.
128
+
129
+
130
+ ## Model Card Authors
131
+
132
+ Maksim Kholiavchenko
133
+
134
+ ## Model Card Contact
135
+
136
+ For questions on this model, please open a [discussion](https://huggingface.co/imageomics/x3d-BaboonLand/discussions) on this repo.