BiologgingSolutions/OceanBEATs
+---
+license: cc-by-nc-4.0
+tags:
+- audio
+- sound-event-detection
+- underwater-acoustics
+- self-supervised-learning
+- nature-ai
+- beats
+datasets:
+- world-dapt
+language:
+- en
+metrics:
+- f1
+library_name: torch
+pipeline_tag: audio-classification
+---
+# OceanBEATs
+**OceanBEATs** is a foundation model for underwater acoustic monitoring. It is adapted from [BEATs](https://github.com/microsoft/unilm/tree/master/beats) via Domain-Adaptive Pretraining (DAPT) on approximately 4,400 hours of global ocean soundscapes (the World-DAPT corpus).
+The model serves as the "ears" of the embodied acoustic agents described in our paper, **"Embodied acoustic agents with self-supervised audio for unknown-aware underwater soundscapes under label and false-positive constraints"** (under review at *npj Artificial Intelligence*).
+## Model Details
+- **Model Type:** Audio Transformer (BEATs architecture)
+- **Pretraining:** Masked Audio Modeling + DAPT (SimCLR/InfoNCE) on underwater data
+- **Input:** 16 kHz mono audio waveform
+- **Backbone:** BEATs AS-2M (iter3+)
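Because the model expects 16 kHz mono input, it can help to check recordings before passing them to the encoder. The following is a minimal sketch using only the Python standard library; the helper names `describe_wav` and `input_problems` are hypothetical and not part of the official repository, and the check only covers uncompressed PCM WAV files.

```python
import wave

EXPECTED_RATE_HZ = 16_000  # from the model card: 16 kHz input
EXPECTED_CHANNELS = 1      # from the model card: mono


def describe_wav(path):
    """Return (sample_rate_hz, n_channels, duration_s) read from a PCM WAV header."""
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate
    return rate, channels, duration


def input_problems(path):
    """List the ways a WAV file deviates from the 16 kHz mono input format."""
    rate, channels, _ = describe_wav(path)
    problems = []
    if rate != EXPECTED_RATE_HZ:
        problems.append(f"sample rate is {rate} Hz, expected {EXPECTED_RATE_HZ} Hz")
    if channels != EXPECTED_CHANNELS:
        problems.append(f"{channels} channels, expected mono")
    return problems
```

An empty list from `input_problems` means the file already matches the expected format. Hydrophone recordings are often captured at higher sample rates, so a non-empty result usually means the audio needs to be resampled and downmixed first.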
+## Available Files
+This repository hosts the pretrained weights required to reproduce the results in our paper.
+1. **`beats_dapt_topup_encoder.pt`**
+    * The core encoder (backbone) adapted to underwater acoustics.
+    * Use this for feature extraction, unknown detection (CCED2), or fine-tuning on new marine datasets.
+2. **`sed_head_56_topup_ep8.pt`**
+    * A 56-class Sound Event Detection (SED) head trained on coastal/lagoon data (Okinawa, Japan).
+    * Detects fish, mammals, vessels, and environmental sounds.
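The internal layout of the two `.pt` files is not documented here, so it may be useful to inspect a checkpoint before wiring it into your own code. The sketch below assumes the file unpickles to a (possibly nested) dictionary of tensors; that layout is an assumption, and the helper names are hypothetical. With recent PyTorch releases, `torch.load` may also need `weights_only=False` if the file stores objects other than tensors and plain containers; only do that for files you trust.

```python
from collections import Counter


def load_checkpoint(path):
    """Load a .pt file onto the CPU. Requires PyTorch; the import is deferred
    so the helpers below can be used without it."""
    import torch
    return torch.load(path, map_location="cpu")


def tensor_entries(obj, prefix=""):
    """Yield (dotted_name, shape) for every tensor-like value in nested dicts."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            yield from tensor_entries(value, f"{prefix}{key}.")
    elif hasattr(obj, "shape"):
        yield prefix.rstrip("."), tuple(obj.shape)


def summarize(checkpoint):
    """Count tensors per leading name component to reveal the file's layout."""
    counts = Counter(name.split(".")[0] for name, _ in tensor_entries(checkpoint))
    return dict(counts)
```

For example, `summarize(load_checkpoint("weights/beats_dapt_topup_encoder.pt"))` returns a dictionary mapping each top-level group of parameters to the number of tensors it contains, and `tensor_entries` lists the individual names and shapes.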
+## Usage
+These weights are designed to be used with the official code repository:
+**GitHub Repository:** [BiologgingSolutions/embodied-ocean-cced2-dgpu](https://github.com/BiologgingSolutions/embodied-ocean-cced2-dgpu)
+Please download the `.pt` files and place them in the `weights/` directory of the cloned GitHub repository.
+```text
+# Example directory structure after download
+embodied-ocean-cced2-dgpu/
+└── weights/
+    ├── beats_dapt_topup_encoder.pt
+    ├── sed_head_56_topup_ep8.pt
+    └── cced2/ ...
+```
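The layout above can be checked, and the two weight files fetched if absent, with a short script. This is a sketch under stated assumptions: the Hub repository ID `BiologgingSolutions/OceanBEATs` and the location of both files at the repository root are assumptions, and `missing_weights` and `fetch_missing` are hypothetical helpers rather than part of the official code. The contents of `cced2/` are not covered.

```python
from pathlib import Path

# File names taken from the "Available Files" list in this model card.
WEIGHT_FILES = ("beats_dapt_topup_encoder.pt", "sed_head_56_topup_ep8.pt")


def missing_weights(repo_root):
    """Return the expected weight files that are absent from <repo_root>/weights."""
    weights_dir = Path(repo_root) / "weights"
    return [name for name in WEIGHT_FILES if not (weights_dir / name).is_file()]


def fetch_missing(repo_root, hub_repo_id="BiologgingSolutions/OceanBEATs"):
    """Download missing files from the Hugging Face Hub (needs network access).

    hub_repo_id is an assumed value; replace it with the ID of the repository
    that hosts the weights.
    """
    from huggingface_hub import hf_hub_download  # deferred: optional dependency

    weights_dir = Path(repo_root) / "weights"
    weights_dir.mkdir(parents=True, exist_ok=True)
    for name in missing_weights(repo_root):
        hf_hub_download(repo_id=hub_repo_id, filename=name, local_dir=weights_dir)
```

Calling `missing_weights("embodied-ocean-cced2-dgpu")` before running the official scripts gives a quick way to confirm that both files are in place.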
+## License & Data Availability
+**License:** CC BY-NC 4.0 (Creative Commons Attribution-NonCommercial 4.0 International)
+These weights are released for **non-commercial research purposes only**.
+Commercial use is strictly prohibited without prior permission from the authors.
+> **Note:** The source code for using these models is released under the **MIT License** at the GitHub repository linked above.
+## Citation
+If you use this model in your research, please cite our paper:
+```bibtex
+@article{noda2025embodied,
+  title={Embodied acoustic agents with self-supervised audio for unknown-aware underwater soundscapes under label and false-positive constraints},
+  author={Noda, Takuji and Koizumi, Takuya},
+  journal={npj Artificial Intelligence (Special Collection: Embodied AI)},
+  note={Under Review},
+  year={2025}
+}
+```
+## Acknowledgements
+The model architecture is based on BEATs (Microsoft). We thank the creators of BEATs and of the open-source ocean acoustic datasets (SanctSound, ONC, PALAOA, etc.) used for DAPT.