imageomics/x3d-BaboonLand

+---
+license: mit
+datasets:
+- imageomics/BaboonLand
+language:
+- en
+tags:
+- biology
+- CV
+- images
+- animals
+- baboon
+- behavior
+- behavior recognition
+- annotation
+- UAV
+- drone
+- video
+model_description: "Behavior recognition model for in situ drone videos of baboons, built using the X3D model. It is trained on the BaboonLand mini-scene dataset, which comprises 20 hours of aerial video footage of baboons captured using a DJI Mavic 2S drone."
+---
+# Model Card for x3d-BaboonLand
+x3d-BaboonLand is a behavior recognition model for in situ drone videos of baboons,
+built using the X3D model.
+It is trained on the [BaboonLand](https://huggingface.co/datasets/imageomics/BaboonLand) dataset,
+which includes both spatiotemporal (i.e., mini-scene) and behavior annotations provided by an expert
+behavioral ecologist.
+## Model Details
+### Model Description
+- **Developed by:** Isla Duporge, Maksim Kholiavchenko, Roi Harel, Scott Wolf, Daniel Rubenstein, Meg Crofoot, Tanya Berger-Wolf, Stephen Lee, Julie Barreau, Jenna Kline, Michelle Ramirez, Charles Stewart
+- **Model type:** X3D-L
+- **License:** MIT
+- **Fine-tuned from model:** [X3D-L](https://github.com/facebookresearch/SlowFast/blob/main/configs/Kinetics/X3D_L.yaml)
+This model was developed for the benefit of the community as an open-source product, so we request that any derivative products also be open-source.
+### Model Sources
+- **Repository:** [Project Repo](https://github.com/Imageomics/kabr-tools)
+- **Paper:** [Paper Link](https://link.springer.com/article/10.1007/s11263-025-02493-5)
+- **Project Page:** [BaboonLand Project Page](https://baboonland.xyz)
+## Uses
+Baboon behavior recognition from in situ drone videos.
+### Out-of-Scope Use
+This model was trained to detect and classify behavior from drone videos of baboons in Kenya. It may not perform well on other species or settings.
+## How to Get Started with the Model
+Please see the illustrative examples in the [kabr-tools docs](https://imageomics.github.io/kabr-tools/)
+for more information on how this model can be used.
+## Training Details
+We include the configuration file ([config.yml](https://huggingface.co/imageomics/x3d-BaboonLand/blob/main/config.yml)) utilized by SlowFast for X3D model training.
+### Training Data
+This model was trained on the [BaboonLand](https://huggingface.co/datasets/imageomics/BaboonLand) dataset.
+#### Training Hyperparameters
+The model was trained for 120 epochs, using a batch size of 5.
+We used the EQL loss function to address the long-tailed class distribution and the SGD optimizer with a learning rate of 1e-5.
+We used a sample rate of 16x5 (16 frames per clip at a temporal stride of 5) and random weight initialization.
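To make the 16x5 sample rate concrete: in X3D notation it means each clip holds 16 frames taken 5 source frames apart, so one clip spans about 80 frames of video. The sketch below only illustrates that indexing. The function name `clip_frame_indices` and the clamping at the end of a video are assumptions made for this example, not the SlowFast implementation.

```python
def clip_frame_indices(start, num_frames=16, sampling_rate=5, video_length=None):
    """Return the source-frame indices for one clip.

    Hypothetical helper for illustration only: picks `num_frames` frames,
    `sampling_rate` frames apart, beginning at frame `start`.
    """
    indices = [start + i * sampling_rate for i in range(num_frames)]
    if video_length is not None:
        # Assumption: clamp to the last valid frame when the clip runs past the end.
        indices = [min(i, video_length - 1) for i in indices]
    return indices


if __name__ == "__main__":
    # Indices of the 16 frames sampled for a clip starting at frame 0.
    print(clip_frame_indices(0))
```

With the defaults, a clip starting at frame 0 ends at frame 75, which is why short mini-scenes may need padding or clamping of some kind.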
+## Evaluation
+The model was evaluated on the BaboonLand test split using the [SlowFast](https://github.com/facebookresearch/SlowFast) framework, specifically its [test_net script](https://github.com/facebookresearch/SlowFast/blob/main/tools/test_net.py).
+### Testing Data
+We provide a train-test split of the mini-scenes from the [BaboonLand](https://huggingface.co/datasets/imageomics/BaboonLand) dataset for evaluation purposes, with 75% for training and 25% for testing. No mini-scene was divided by the split.
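A split that never divides a mini-scene can be sketched as a group-wise split: whole mini-scene IDs are assigned to one side, and every clip inherits the side of its mini-scene. The code below is a minimal illustration under that assumption. The function `split_by_mini_scene`, the seed, and the ID format are hypothetical and do not reproduce the split shipped with the dataset.

```python
import random


def split_by_mini_scene(scene_ids, test_fraction=0.25, seed=0):
    """Assign whole mini-scenes to train or test.

    Illustrative sketch, not the authors' procedure: shuffles the unique
    mini-scene IDs and holds out about `test_fraction` of them.
    """
    unique_ids = sorted(set(scene_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_ids)
    n_test = round(len(unique_ids) * test_fraction)
    test_ids = set(unique_ids[:n_test])
    train_ids = set(unique_ids[n_test:])
    return train_ids, test_ids


if __name__ == "__main__":
    # Hypothetical per-clip mini-scene IDs: 40 clips drawn from 8 mini-scenes.
    clip_scene_ids = [f"scene_{i % 8}" for i in range(40)]
    train_ids, test_ids = split_by_mini_scene(clip_scene_ids)
    print(len(train_ids), len(test_ids))
```

Because membership is decided per mini-scene, frames from the same tracked animal never appear on both sides, which avoids leakage between training and testing.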
+#### Metrics
+We report Top-1, Top-3, and Top-5 macro-scores. For full details, please refer to the [paper](https://link.springer.com/article/10.1007/s11263-025-02493-5).
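As a rough illustration of the metric, the sketch below computes a macro-averaged Top-k score, assuming "macro" means that Top-k accuracy is computed per behavior class and then averaged over classes, so rare behaviors count as much as common ones. The function name and input layout are assumptions for this example; the paper remains the reference for the exact definition.

```python
from collections import defaultdict


def macro_top_k_accuracy(scores, labels, k):
    """Mean over classes of per-class top-k accuracy.

    scores: list of per-sample score lists (one score per class).
    labels: list of true class indices, one per sample.
    """
    hits = defaultdict(int)
    counts = defaultdict(int)
    for sample_scores, label in zip(scores, labels):
        # Rank class indices from highest to lowest score.
        ranked = sorted(
            range(len(sample_scores)),
            key=lambda c: sample_scores[c],
            reverse=True,
        )
        counts[label] += 1
        if label in ranked[:k]:
            hits[label] += 1
    # Average the per-class accuracies over the classes that occur in `labels`.
    per_class = [hits[c] / counts[c] for c in counts]
    return sum(per_class) / len(per_class)


if __name__ == "__main__":
    # Toy example: three classes, four samples.
    toy_scores = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.0], [0.6, 0.3, 0.1], [0.2, 0.3, 0.5]]
    toy_labels = [0, 0, 1, 2]
    print(macro_top_k_accuracy(toy_scores, toy_labels, k=1))
```

On a long-tailed label distribution this value can differ noticeably from plain (micro) accuracy, which is dominated by the most frequent classes.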
+**Results**
+| Weight Init. | Batch Size | Top-1 | Top-3 | Top-5 |
+|--------------|------------|-------|-------|-------|
+|  Random  | 5 | **30.04**   | **60.58**| **72.13**|
+### Model Architecture and Objective
+Please see the [Base Model Description](https://arxiv.org/pdf/2004.04730).
+#### Hardware
+Running the X3D model requires a modern NVIDIA GPU with CUDA support. X3D-L is designed to be computationally efficient, and requires 10–16 GB of GPU memory during training.
+## Citation
+**BibTeX:**
+If you use our model in your work, please cite our paper.
+**Paper**
+```
+@article{duporge2025baboonland,
+  title={BaboonLand Dataset: Tracking Primates in the Wild and Automating Behaviour Recognition from Drone Videos},
+  author={Duporge, Isla and Kholiavchenko, Maksim and Harel, Roi and Wolf, Scott and Rubenstein, Daniel I and Crofoot, Margaret C and Berger-Wolf, Tanya and Lee, Stephen J and Barreau, Julie and Kline, Jenna and Ramirez, Michelle and Stewart, Charles},
+  journal={International Journal of Computer Vision},
+  pages={1--12},
+  year={2025},
+  publisher={Springer}
+}
+```
+## Acknowledgements
+This work was supported by the [Imageomics Institute](https://imageomics.org), which is funded by the US National Science Foundation's Harnessing the Data Revolution (HDR) program under [Award #2118240](https://www.nsf.gov/awardsearch/showAward?AWD_ID=2118240) (Imageomics: A New Frontier of Biological Information Powered by Knowledge-Guided Machine Learning). Additional support was provided by the [AI Institute for Intelligent Cyberinfrastructure with Computational Learning in the Environment (ICICLE)](https://icicle.osu.edu/), which is funded by the US National Science Foundation under [Award #2112606](https://www.nsf.gov/awardsearch/showAward?AWD_ID=2112606). Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
+The data was gathered at the [Mpala Research Centre](https://mpala.org/) in Kenya, in accordance with Research License No. NACOSTI/P/22/18214. The data collection protocol adhered strictly to the guidelines set forth by the Institutional Animal Care and Use Committee under permission No. IACUC 1835F.
+## Model Card Authors
+Maksim Kholiavchenko
+## Model Card Contact
+For questions on this model, please open a [discussion](https://huggingface.co/imageomics/x3d-BaboonLand/discussions) on this repo.