---
license: cc-by-nc-4.0
tags:
- audio
- sound-event-detection
- underwater-acoustics
- self-supervised-learning
- beats
datasets:
- world-dapt
language:
- en
metrics:
- f1
library_name: torch
pipeline_tag: audio-classification
---

# OceanBEATs

**OceanBEATs** is a foundation model for underwater acoustic monitoring, adapted from [BEATs](https://github.com/microsoft/unilm/tree/master/beats) via Domain-Adaptive Pretraining (DAPT) on approximately 4,400 hours of global ocean soundscapes (World-DAPT corpus).

This model serves as the "ears" of the underwater soundscape monitoring approach described in our paper: **"A stethoscope for the ocean: Unknownness-aware monitoring under false-positives-per-hour constraints in underwater soundscapes"** (Under Review, *Scientific Reports*).

## Model Details

- **Model Type:** Audio Transformer (BEATs architecture)
- **Pretraining:** Masked Audio Modeling + DAPT (SimCLR/InfoNCE) on underwater data
- **Input:** 16 kHz mono audio waveform
- **Backbone:** BEATs AS-2M (iter3+)
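Recordings from hydrophones are often multi-channel or sampled at rates other than 16 kHz, so they need conversion before they match the input format listed above. The following is a minimal sketch of that step, assuming NumPy and SciPy are installed; the helper name `to_mono_16k` is hypothetical and is not part of the official repository, which may handle preprocessing differently.

```python
from math import gcd

import numpy as np
from scipy.signal import resample_poly

TARGET_SR = 16_000  # sample rate stated under "Model Details"


def to_mono_16k(audio: np.ndarray, sample_rate: int) -> np.ndarray:
    """Downmix audio to mono and resample it to 16 kHz (hypothetical helper).

    `audio` is either shaped (samples,) or (channels, samples).
    """
    audio = np.asarray(audio, dtype=np.float32)
    if audio.ndim == 2:
        # Average the channels to obtain a single mono track.
        audio = audio.mean(axis=0)
    elif audio.ndim != 1:
        raise ValueError("expected audio shaped (samples,) or (channels, samples)")

    if sample_rate != TARGET_SR:
        # Polyphase resampling applies an anti-aliasing filter internally.
        g = gcd(TARGET_SR, sample_rate)
        audio = resample_poly(audio, TARGET_SR // g, sample_rate // g)

    return audio.astype(np.float32)
```

For example, one second of stereo audio at 48 kHz, shaped `(2, 48000)`, comes back as a mono array of 16,000 samples.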

## Available Files

This repository hosts the pretrained weights required to reproduce the results in our paper.

1. **`beats_dapt_topup_encoder.pt`**
    * The core encoder (backbone) adapted to underwater acoustics.
    * Use this for feature extraction, unknown detection (CCED2), or fine-tuning on new marine datasets.

2. **`sed_head_56_topup_ep8.pt`**
    * A 56-class Sound Event Detection (SED) head trained on coastal/lagoon data (Okinawa, Japan).
    * Detects fish, mammals, vessels, and environmental sounds.
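Before wiring either file into a pipeline, it can help to inspect what a checkpoint contains. The sketch below lists tensor names and shapes using `torch.load`. The internal layout of these files is not documented here, so the code assumes (without confirmation) that a checkpoint is either a plain state dict or a dict that wraps one under a `"model"` key; `summarize_checkpoint` is a hypothetical helper, not part of the official code.

```python
import torch


def summarize_checkpoint(path):
    """Return {parameter_name: shape} for the tensors in a checkpoint.

    Assumption: the file holds a state dict, possibly nested under "model".
    """
    # weights_only=True restricts unpickling to tensors and plain containers.
    ckpt = torch.load(path, map_location="cpu", weights_only=True)
    state = ckpt
    if isinstance(ckpt, dict) and isinstance(ckpt.get("model"), dict):
        state = ckpt["model"]
    return {
        name: tuple(value.shape)
        for name, value in state.items()
        if torch.is_tensor(value)
    }


if __name__ == "__main__":
    summary = summarize_checkpoint("weights/beats_dapt_topup_encoder.pt")
    for name, shape in summary.items():
        print(name, shape)
```

If loading fails because the file stores objects other than tensors and plain containers, the loading code in the official repository is the authoritative reference.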

## Usage

These weights are designed to be used with the official code repository:

**GitHub Repository:** [alohajazz/openworld-soundscape-cced2-dgpu](https://github.com/alohajazz/openworld-soundscape-cced2-dgpu)

Please download the `.pt` files and place them in the `weights/` directory of the cloned GitHub repository.

```bash
# Example directory structure after download
openworld-soundscape-cced2-dgpu/
└── weights/
    ├── beats_dapt_topup_encoder.pt
    ├── sed_head_56_topup_ep8.pt
    └── cced2/ ...
```
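A quick way to confirm the files landed in the right place is to check the layout above programmatically. This is a small sketch using only the Python standard library; `missing_weights` is a hypothetical helper, and the repository path in the usage line assumes the clone sits in the current directory.

```python
from pathlib import Path

# File names taken from the "Available Files" section.
EXPECTED_WEIGHTS = (
    "beats_dapt_topup_encoder.pt",
    "sed_head_56_topup_ep8.pt",
)


def missing_weights(repo_root):
    """Return the expected weight files that are absent from <repo_root>/weights."""
    weights_dir = Path(repo_root) / "weights"
    return [name for name in EXPECTED_WEIGHTS if not (weights_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_weights("openworld-soundscape-cced2-dgpu")
    if missing:
        print("Missing weight files:", ", ".join(missing))
    else:
        print("All expected weight files are in place.")
```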

## License & Data Availability

**License:** CC BY-NC 4.0 (Creative Commons Attribution-NonCommercial 4.0 International)

These weights are released for **non-commercial research purposes only**.
Commercial use is strictly prohibited without prior permission from the authors.

> **Note:** The source code for using these models is released under the **MIT License** at the GitHub repository linked above.

## Citation

If you use this model in your research, please cite our paper:

```bibtex
@article{noda2026stethoscope,
  title={A stethoscope for the ocean: Unknownness-aware monitoring under false-positives-per-hour constraints in underwater soundscapes},
  author={Noda, Takuji and Koizumi, Takuya},
  journal={Scientific Reports},
  note={Under Review},
  year={2026}
}
```

## Acknowledgements

The model architecture is based on BEATs (Microsoft). We thank the creators of BEATs and of the open-source ocean acoustic datasets (SanctSound, ONC, PALAOA, etc.) used for DAPT.