Commit 71aefad (verified), committed by alohajazz · 1 parent: 6b4d438

Update paper title to current Sci Rep revision; remove internal version numbers

Title now matches manuscript: "Discovery and promotion of unknown sounds into operational detection targets for underwater passive acoustic monitoring under false alarm constraints". Revision label simplified to "revision in review" since this is the first author-side revision at Scientific Reports.

Files changed (1)
  1. README.md +22 -22
README.md CHANGED
@@ -25,21 +25,22 @@ via Domain-Adaptive Pretraining (DAPT) on a 5,673-h global ocean
25   soundscape corpus (World-DAPT).
26
27   This model serves as the "ears" for underwater soundscapes described in our
28 - paper: **"A stethoscope for the ocean: Open-world discovery in underwater
29 - soundscapes"** (*Scientific Reports*, **revision 2.2** in review).

30
31 - > **About revision 2.2** (May 2026). The original December 2026 release used
32   > SimCLR/InfoNCE-based DAPT under AMP fp16, which suffered a numerical
33   > instability that prevented BEATs encoder weight updates (the
34   > `beats_dapt_topup_encoder.pt` weights were therefore byte-identical to
35 - > Microsoft's BEATs AS-2M PRETRAIN). Revision 2.2 corrects this by re-running
36 - > DAPT with **Masked Audio Modeling (MAM)** and a **k-means k=1024 tokeniser**
37 - > under bfloat16 precision on a larger 5,673-h World-DAPT corpus. The
38 - > superseded buggy weights have been **removed** from this repository (see
39 - > *Reproducibility of the original buggy state* below for how to recreate
40 - > them if needed).
41
42 - ## Model Details (revision 2.2 canonical)
43
44   - **Architecture:** BEATs (Audio Transformer; Microsoft)
45   - **Self-supervised pretraining:** Masked Audio Modeling (MAM) with k=1024
@@ -52,7 +53,7 @@ soundscapes"** (*Scientific Reports*, **revision 2.2** in review).
52   - **Input:** 16 kHz mono waveform
53   - **Backbone init:** BEATs AS-2M (iter3+)
54
55 - ## Available files (revision 2.2 canonical)
56
57   | File | SHA-256 | Size |
58   |---|---|---|
@@ -66,7 +67,7 @@ above encoder. Single-seed Event F1 = 0.483; n=10 mean ± std = 0.475 ± 0.017
66   ## Reproducibility of the original (buggy) state
67
68   The original December 2026 release contained two files that have been
69 - removed in revision 2.2:
70
71   | Removed file | Replacement / how to recreate |
72   |---|---|
@@ -83,12 +84,11 @@ prevented any weight updates.
83   These weights are designed to be used with the official code repository:
84
85   **GitHub Repository:** [alohajazz/openworld-soundscape-cced2-dgpu](https://github.com/alohajazz/openworld-soundscape-cced2-dgpu)
86 - (see branch `revision-2.2-restructure` until merged into `main`)
87
88   ```python
89   from huggingface_hub import hf_hub_download
90
91 - # Download canonical revision-2.2 weights
92   encoder_path = hf_hub_download(
93   repo_id="BiologgingSolutions/OceanBEATs",
94   filename="beats_dapt_mam_step120000.pt",
@@ -129,11 +129,11 @@ the released weights does not grant any rights under those patents.
129   If you use this model in your research, please cite our paper:
130
131   ```bibtex
132 - @article{noda2026stethoscope,
133 - title={A stethoscope for the ocean: Open-world discovery in underwater soundscapes},
134 - author={Noda, Takuji and Koizumi, Takuya and others},
135   journal={Scientific Reports},
136 - note={Revision 2.2, in review},
137   year={2026}
138   }
139   ```
@@ -147,8 +147,8 @@ ignored `center_sec`, returning per-file constant embeddings) was discovered
147   and fixed on 2026-05-08. The fix affects only the **extraction code** in the
148   GitHub repository — **encoder weights in this repository are byte-identical
149   before and after the fix** (the bug occurred downstream of the encoder
150 - forward pass). All revision-2.2 result tables (Tables 2/3/4 and Fig 3) were
151 - re-computed with the corrected window-aware extractor; updated paper
152   artifacts are tracked under
153   [`paper_artifacts/winaware_2026-05-09/`](https://github.com/alohajazz/openworld-soundscape-cced2-dgpu/tree/main/paper_artifacts/winaware_2026-05-09)
154   and
@@ -158,9 +158,9 @@ strict 0–8 kHz in-band consistency (Nyquist of the 16-kHz BEATs input);
158   species whose dominant call energy lies above 8 kHz are listed in the
159   GitHub `REVISION2.md`. SHA-256 fingerprints of
160   `beats_dapt_mam_step120000.pt` and `sed_head_56_fulldata_ep8.pt` are
161 - unchanged from the revision-2.2 release listed in the table above.
162
163 - ### Revision 2.2 (May 2026)
164   - DAPT method changed from SimCLR/InfoNCE to Masked Audio Modeling (MAM)
165   with k-means k=1024 tokeniser; precision changed from AMP fp16 to bfloat16
166   (corrects the original numerical instability that prevented weight updates)
 
25   soundscape corpus (World-DAPT).
26
27   This model serves as the "ears" for underwater soundscapes described in our
28 + paper: **"Discovery and promotion of unknown sounds into operational
29 + detection targets for underwater passive acoustic monitoring under false
30 + alarm constraints"** (*Scientific Reports*, **revision** in review).
31
32 + > **About this revision** (May 2026). The original December 2026 release used
33   > SimCLR/InfoNCE-based DAPT under AMP fp16, which suffered a numerical
34   > instability that prevented BEATs encoder weight updates (the
35   > `beats_dapt_topup_encoder.pt` weights were therefore byte-identical to
36 + > Microsoft's BEATs AS-2M PRETRAIN). The current revision corrects this by
37 + > re-running DAPT with **Masked Audio Modeling (MAM)** and a **k-means
38 + > k=1024 tokeniser** under bfloat16 precision on a larger 5,673-h
39 + > World-DAPT corpus. The superseded buggy weights have been **removed**
40 + > from this repository (see *Reproducibility of the original buggy state*
41 + > below for how to recreate them if needed).
42
43 + ## Model Details (current canonical revision)
44
45   - **Architecture:** BEATs (Audio Transformer; Microsoft)
46   - **Self-supervised pretraining:** Masked Audio Modeling (MAM) with k=1024
 
53   - **Input:** 16 kHz mono waveform
54   - **Backbone init:** BEATs AS-2M (iter3+)
55
56 + ## Available files (current canonical revision)
57
58   | File | SHA-256 | Size |
59   |---|---|---|
 
67   ## Reproducibility of the original (buggy) state
68
69   The original December 2026 release contained two files that have been
70 + removed in the current revision:
71
72   | Removed file | Replacement / how to recreate |
73   |---|---|
 
84   These weights are designed to be used with the official code repository:
85
86   **GitHub Repository:** [alohajazz/openworld-soundscape-cced2-dgpu](https://github.com/alohajazz/openworld-soundscape-cced2-dgpu)

87
88   ```python
89   from huggingface_hub import hf_hub_download
90
91 + # Download canonical revision weights
92   encoder_path = hf_hub_download(
93   repo_id="BiologgingSolutions/OceanBEATs",
94   filename="beats_dapt_mam_step120000.pt",
 
129   If you use this model in your research, please cite our paper:
130
131   ```bibtex
132 + @article{noda2026discovery,
133 + title={Discovery and promotion of unknown sounds into operational detection targets for underwater passive acoustic monitoring under false alarm constraints},
134 + author={Noda, Takuji and Koizumi, Takuya},
135   journal={Scientific Reports},
136 + note={Revision, in review},
137   year={2026}
138   }
139   ```
 
147   and fixed on 2026-05-08. The fix affects only the **extraction code** in the
148   GitHub repository — **encoder weights in this repository are byte-identical
149   before and after the fix** (the bug occurred downstream of the encoder
150 + forward pass). All current-revision result tables (Tables 2/3/4 and Fig 3)
151 + were re-computed with the corrected window-aware extractor; updated paper
152   artifacts are tracked under
153   [`paper_artifacts/winaware_2026-05-09/`](https://github.com/alohajazz/openworld-soundscape-cced2-dgpu/tree/main/paper_artifacts/winaware_2026-05-09)
154   and
 
158   species whose dominant call energy lies above 8 kHz are listed in the
159   GitHub `REVISION2.md`. SHA-256 fingerprints of
160   `beats_dapt_mam_step120000.pt` and `sed_head_56_fulldata_ep8.pt` are
161 + unchanged from the current revision listed in the table above.
162
163 + ### Current revision (May 2026)
164   - DAPT method changed from SimCLR/InfoNCE to Masked Audio Modeling (MAM)
165   with k-means k=1024 tokeniser; precision changed from AMP fp16 to bfloat16
166   (corrects the original numerical instability that prevented weight updates)
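
The README text in this diff relies on two kinds of file-level checks: SHA-256 fingerprints for the released weights, and byte-identity claims (the removed encoder being identical to the upstream BEATs checkpoint, and the encoder weights being unchanged by the extractor fix). A minimal sketch of how a reader might run such checks locally is shown here. It is not part of the repository: the helper names `sha256_of_file`, `files_byte_identical`, and `matches_published_digest` are hypothetical, and the expected digest has to be copied from the README's file table, which this diff does not reproduce.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks.

    Chunked reading keeps memory use flat for large checkpoint files.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_byte_identical(path_a, path_b):
    """Hypothetical helper: treat two files as identical if their digests match."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)


def matches_published_digest(path, expected_hex):
    """Hypothetical helper: compare a file against a digest copied from a README.

    Whitespace and letter case in the pasted digest are normalised first.
    """
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Assuming `encoder_path` is the local path returned by `hf_hub_download` in the README snippet, `matches_published_digest(encoder_path, "<digest from the README table>")` should return `True` for an intact download.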