Add paper link and metadata for ESPnet (#1)
- Add paper link and metadata for ESPnet (944e30c33e21b8d95bf514a8ecf45dd3d066e05b)
Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>
README.md
CHANGED
@@ -1,13 +1,25 @@
 ---
+datasets:
+- as2m
+license: cc-by-4.0
+library_name: espnet
+pipeline_tag: audio-classification
 tags:
 - espnet
 - audio
 - self-supervised-learning
-datasets:
-- as2m
-license: cc-by-4.0
 ---
 
+# OpenBEATs-ICME
+
+This repository contains the audio encoder model presented in the paper [The CMU-AIST submission for the ICME 2025 Audio Encoder Challenge](https://huggingface.co/papers/2601.16273).
+
+## Model Description
+The system is built on BEATs, a masked speech token prediction-based audio encoder. This version scales the architecture up to 300 million parameters and was pre-trained using 74,000 hours of audio data derived from various speech, music, and sound corpora.
+
+- **Code:** [ESPnet GitHub](https://github.com/espnet/espnet/)
+- **Paper:** [The CMU-AIST submission for the ICME 2025 Audio Encoder Challenge](https://huggingface.co/papers/2601.16273)
+
 ## ESPnet2 SSL model
 
 ### `shikhar7ssu/OpenBEATs-ICME`
@@ -1241,12 +1253,6 @@ distributed: true
 doi={10.21437/Interspeech.2018-1456},
 url={http://dx.doi.org/10.21437/Interspeech.2018-1456}
 }
-
-
-
-
-
-
 ```
 
 or arXiv:
@@ -1260,4 +1266,4 @@ or arXiv:
 archivePrefix={arXiv},
 primaryClass={cs.CL}
 }
-```
+```
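The metadata added in the first hunk is YAML front matter at the top of README.md. As a rough illustration of how such a block can be read back, here is a minimal sketch that assumes only flat `key: value` pairs and flat lists, as in this diff. `parse_front_matter` and `README_HEAD` are hypothetical names used for illustration; they are not part of ESPnet or the Hub tooling, and a real YAML parser handles far more than this.

```python
def parse_front_matter(text):
    """Parse a small subset of YAML front matter: scalars and flat lists.

    Assumption: the block starts on the first line with '---' and ends at
    the next '---'. Nested mappings and quoting are not handled.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    current_key = None  # key whose list items are currently being collected
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            break  # closing delimiter of the front matter
        if not stripped:
            continue
        if stripped.startswith("- ") and current_key is not None:
            meta[current_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            key, value = key.strip(), value.strip()
            if value:
                meta[key] = value
                current_key = None
            else:
                meta[key] = []  # a list is expected on the following lines
                current_key = key
    return meta


# Head of README.md as it looks after this commit (copied from the diff).
README_HEAD = """---
datasets:
- as2m
license: cc-by-4.0
library_name: espnet
pipeline_tag: audio-classification
tags:
- espnet
- audio
- self-supervised-learning
---

# OpenBEATs-ICME
"""

metadata = parse_front_matter(README_HEAD)
print(metadata["pipeline_tag"])
print(metadata["tags"])
```

The sketch avoids third-party dependencies on purpose; for anything beyond flat keys and lists, a full YAML library would be the safer choice.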