---
license: cc-by-nc-4.0
tags:
- audio
- sound-event-detection
- underwater-acoustics
- self-supervised-learning
- beats
datasets:
- world-dapt
language:
- en
metrics:
- f1
library_name: torch
pipeline_tag: audio-classification
---

# OceanBEATs

**OceanBEATs** is a foundation model for underwater acoustic monitoring, adapted from [BEATs](https://github.com/microsoft/unilm/tree/master/beats) via Domain-Adaptive Pretraining (DAPT) on approximately 4,400 hours of global ocean soundscapes (the World-DAPT corpus).

The model serves as the "ears" of the underwater soundscape monitoring approach described in our paper: **"A stethoscope for the ocean: Unknownness-aware monitoring under false-positives-per-hour constraints in underwater soundscapes"** (Under Review, *Scientific Reports*).

## Model Details

- **Model Type:** Audio Transformer (BEATs architecture)
- **Pretraining:** Masked Audio Modeling + DAPT (SimCLR/InfoNCE) on underwater data
- **Input:** 16 kHz mono audio waveform
- **Backbone:** BEATs AS-2M (iter3+)
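
Because the model expects 16 kHz mono input, it can help to check recordings before passing them on. The following is a minimal sketch using only Python's standard `wave` module; the helper name `check_wav` is hypothetical and not part of the official repository, and it only handles PCM WAV files. Recordings at other sample rates or with several channels would need to be resampled and downmixed first.

```python
import wave

EXPECTED_RATE = 16_000   # Hz, as stated in the model details
EXPECTED_CHANNELS = 1    # mono


def check_wav(path):
    """Return (ok, sample_rate, channels) for a PCM WAV file.

    `ok` is True only when the file is 16 kHz mono.
    """
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
    ok = rate == EXPECTED_RATE and channels == EXPECTED_CHANNELS
    return ok, rate, channels


# Example (hypothetical file name):
# ok, rate, channels = check_wav("recording.wav")
# if not ok:
#     print(f"Needs conversion: {rate} Hz, {channels} channel(s)")
```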

## Available Files

This repository hosts the pretrained weights required to reproduce the results in our paper.

1. **`beats_dapt_topup_encoder.pt`**
    * The core encoder (backbone) adapted to underwater acoustics.
    * Use this for feature extraction, unknown detection (CCED2), or fine-tuning on new marine datasets.

2. **`sed_head_56_topup_ep8.pt`**
    * A 56-class Sound Event Detection (SED) head trained on coastal/lagoon data (Okinawa, Japan).
    * Detects fish, mammals, vessels, and environmental sounds.
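
To see what a checkpoint contains before wiring it into a model, one option is to load it on the CPU and list the stored entries. The sketch below assumes the `.pt` files are ordinary `torch.save` outputs (possibly nested dicts of tensors); their internal layout is not documented here, and the helper names `load_checkpoint` and `summarize` are hypothetical, not part of the official repository.

```python
from collections.abc import Mapping


def load_checkpoint(path):
    """Load a .pt file onto the CPU. Requires PyTorch (imported lazily)."""
    import torch

    # Recent PyTorch versions restrict unpickling by default; only relax
    # that (weights_only=False) for files you trust.
    return torch.load(path, map_location="cpu")


def summarize(obj, prefix=""):
    """Flatten a (possibly nested) checkpoint dict into {name: shape or type}."""
    summary = {}
    if not isinstance(obj, Mapping):
        return summary
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, Mapping):
            # Recurse into nested dicts such as a wrapped state dict.
            summary.update(summarize(value, prefix=f"{name}."))
        elif hasattr(value, "shape"):
            summary[name] = tuple(value.shape)
        else:
            summary[name] = type(value).__name__
    return summary


# Example, assuming the file has been downloaded into weights/:
# ckpt = load_checkpoint("weights/beats_dapt_topup_encoder.pt")
# for name, info in summarize(ckpt).items():
#     print(name, info)
```

The actual model construction and weight loading are handled by the official code repository; this only inspects the file.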

## Usage

These weights are designed to be used with the official code repository:

**GitHub Repository:** [alohajazz/openworld-soundscape-cced2-dgpu](https://github.com/alohajazz/openworld-soundscape-cced2-dgpu)

Please download the `.pt` files and place them in the `weights/` directory of the cloned GitHub repository.

```bash
# Example directory structure after download
openworld-soundscape-cced2-dgpu/
├── weights/
│   ├── beats_dapt_topup_encoder.pt
│   └── sed_head_56_topup_ep8.pt
└── cced2/ ...
```
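
As a quick sanity check on the layout above, the following sketch reports which of the two weight files are still missing from `weights/`. The helper name `missing_weights` is illustrative and not part of the repository.

```python
from pathlib import Path

# File names as listed in this model card.
WEIGHT_FILES = (
    "beats_dapt_topup_encoder.pt",
    "sed_head_56_topup_ep8.pt",
)


def missing_weights(repo_root):
    """Return the expected weight files that are absent from <repo_root>/weights/."""
    weights_dir = Path(repo_root) / "weights"
    return [name for name in WEIGHT_FILES if not (weights_dir / name).is_file()]


# Example, run from the directory that contains the cloned repository:
# missing = missing_weights("openworld-soundscape-cced2-dgpu")
# print("All weights in place" if not missing else f"Missing: {missing}")
```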

## License & Data Availability

**License:** CC BY-NC 4.0 (Creative Commons Attribution-NonCommercial 4.0 International)

These weights are released for **non-commercial research purposes only**.
Commercial use is strictly prohibited without prior permission from the authors.

> **Note:** The source code for using these models is released under the **MIT License** at the GitHub repository linked above.

## Citation

If you use this model in your research, please cite our paper:

```bibtex
@article{noda2026stethoscope,
  title={A stethoscope for the ocean: Unknownness-aware monitoring under false-positives-per-hour constraints in underwater soundscapes},
  author={Noda, Takuji and Koizumi, Takuya},
  journal={Scientific Reports},
  note={Under Review},
  year={2026}
}
```

## Acknowledgements

The model architecture is based on BEATs (Microsoft). We thank the creators of the BEATs model and of the open-source ocean acoustic datasets (SanctSound, ONC, PALAOA, etc.) used for DAPT.