shikhar7ssu and nielsr (HF Staff) committed
Commit 6fc9465 · verified · 1 Parent(s): 4809adc

Add paper link and metadata for ESPnet (#1)


- Add paper link and metadata for ESPnet (944e30c33e21b8d95bf514a8ecf45dd3d066e05b)


Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +16 -10
README.md CHANGED
@@ -1,13 +1,25 @@
  ---
+ datasets:
+ - as2m
+ license: cc-by-4.0
+ library_name: espnet
+ pipeline_tag: audio-classification
  tags:
  - espnet
  - audio
  - self-supervised-learning
- datasets:
- - as2m
- license: cc-by-4.0
  ---

+ # OpenBEATs-ICME
+
+ This repository contains the audio encoder model presented in the paper [The CMU-AIST submission for the ICME 2025 Audio Encoder Challenge](https://huggingface.co/papers/2601.16273).
+
+ ## Model Description
+ The system is built on BEATs, a masked speech token prediction-based audio encoder. This version scales the architecture up to 300 million parameters and was pre-trained using 74,000 hours of audio data derived from various speech, music, and sound corpora.
+
+ - **Code:** [ESPnet GitHub](https://github.com/espnet/espnet/)
+ - **Paper:** [The CMU-AIST submission for the ICME 2025 Audio Encoder Challenge](https://huggingface.co/papers/2601.16273)
+
  ## ESPnet2 SSL model

  ### `shikhar7ssu/OpenBEATs-ICME`
@@ -1241,12 +1253,6 @@ distributed: true
  doi={10.21437/Interspeech.2018-1456},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1456}
  }
-
-
-
-
-
-
  ```

  or arXiv:
@@ -1260,4 +1266,4 @@ or arXiv:
  archivePrefix={arXiv},
  primaryClass={cs.CL}
  }
- ```
+ ```
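
For illustration only, and not part of the commit: a minimal Python sketch of how the metadata block this change puts at the top of README.md could be read back. The names `CARD_HEADER` and `parse_front_matter` are hypothetical, and the parser assumes only the flat keys and simple lists visible in the diff; it is not a general YAML parser.

```python
# Hypothetical copy of the front matter as it appears on the "+" side of the diff.
CARD_HEADER = """---
datasets:
- as2m
license: cc-by-4.0
library_name: espnet
pipeline_tag: audio-classification
tags:
- espnet
- audio
- self-supervised-learning
---
"""


def parse_front_matter(text):
    """Parse flat `key: value` pairs and `- item` lists between the --- markers."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter present
    meta = {}
    current_key = None  # key whose list items are currently being collected
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            break  # closing marker ends the metadata block
        if not stripped:
            continue
        if stripped.startswith("- ") and current_key is not None:
            meta[current_key].append(stripped[2:].strip())
            continue
        key, _, value = stripped.partition(":")
        key, value = key.strip(), value.strip()
        if value:
            meta[key] = value  # scalar entry such as `license: cc-by-4.0`
            current_key = None
        else:
            meta[key] = []  # a key with no inline value starts a list
            current_key = key
    return meta


meta = parse_front_matter(CARD_HEADER)
print(meta["library_name"], meta["pipeline_tag"])
```

In practice a full YAML parser would be the more robust choice; the hand-written version here only keeps the sketch free of third-party dependencies.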