Update README.md
README.md
CHANGED
|
@@ -17,11 +17,15 @@ tags:
|
|
| 17 |
pipeline_tag: automatic-speech-recognition
|
| 18 |
---
|
| 19 |
|
| 20 |
-
[
|
| 21 |
-
It
|
|
|
|
| 22 |
|
| 23 |
-
[OWSM-CTC
|
| 24 |
-
|
|
|
|
|
|
|
|
|
|
| 25 |
|
| 26 |
To use the pre-trained model, please install `espnet` and `espnet_model_zoo`. The requirements are:
|
| 27 |
```
|
|
@@ -31,7 +35,7 @@ espnet
|
|
| 31 |
espnet_model_zoo
|
| 32 |
```
|
| 33 |
|
| 34 |
-
**The recipe can be found in ESPnet:** https://github.com/espnet/espnet/tree/master/egs2/
|
| 35 |
|
| 36 |
### Example script for batched inference
|
| 37 |
|
|
|
|
| 17 |
pipeline_tag: automatic-speech-recognition
|
| 18 |
---
|
| 19 |
|
| 20 |
+
[Open Whisper-style Speech Model (OWSM)](https://www.wavlab.org/activities/2024/owsm/) is the first **fully open** Whisper-style speech foundation model.
|
| 21 |
+
It reproduces and advances OpenAI's Whisper-style training using publicly available data and open-source toolkits.
|
| 22 |
+
The code, pre-trained model weights, and training logs are publicly released to promote open science in speech foundation models.
|
| 23 |
|
| 24 |
+
[OWSM-CTC](https://aclanthology.org/2024.acl-long.549/) (Peng et al., ACL 2024) is a novel encoder-only speech foundation model based on hierarchical multi-task self-conditioned CTC.
|
| 25 |
+
It supports multilingual speech recognition, speech translation, and language identification within a single non-autoregressive model.
|
| 26 |
+
|
| 27 |
+
[OWSM-CTC v4](https://www.isca-archive.org/interspeech_2025/peng25c_interspeech.html) is trained for three epochs on 320k hours of public audio data covering multilingual speech recognition, any-to-any speech translation, and language identification.
|
| 28 |
+
The newly curated data are publicly released: https://huggingface.co/datasets/espnet/yodas_owsmv4
|
| 29 |
|
| 30 |
To use the pre-trained model, please install `espnet` and `espnet_model_zoo`. The requirements are:
|
| 31 |
```
|
|
|
|
| 35 |
espnet_model_zoo
|
| 36 |
```
|
| 37 |
|
| 38 |
+
**The recipe can be found in ESPnet:** https://github.com/espnet/espnet/tree/master/egs2/owsm_ctc_v4/s2t1
|
| 39 |
|
| 40 |
### Example script for batched inference
|
| 41 |
|