Automatic Speech Recognition
ESPnet
multilingual
audio
speech-translation
language-identification
Eval Results
Instructions to use espnet/owsm_ctc_v4_1B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- ESPnet
How to use espnet/owsm_ctc_v4_1B with ESPnet:
    import soundfile
    from espnet2.bin.asr_inference import Speech2Text

    model = Speech2Text.from_pretrained("espnet/owsm_ctc_v4_1B")

    # Read the waveform and its sample rate, then take the best hypothesis.
    speech, rate = soundfile.read("speech.wav")
    text, *_ = model(speech)[0]
- Notebooks
- Google Colab
- Kaggle
Update README.md
README.md
CHANGED
@@ -192,7 +192,7 @@ print(segments)
 @inproceedings{owsm-v4,
 title={{OWSM} v4: Improving Open Whisper-Style Speech Models via Data Scaling and Cleaning},
 author={Yifan Peng and Shakeel Muhammad and Yui Sudo and William Chen and Jinchuan Tian and Chyi-Jiunn Lin and Shinji Watanabe},
-booktitle={Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH)
+booktitle={Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH)},
 year={2025},
 }
 ```
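The change in this diff adds the missing closing `},` to the `booktitle` field, which the old line left unbalanced. As a rough illustration, the sketch below counts unmatched braces in each version of the line. The helper name `brace_balance` is hypothetical and not part of the repository, and the check ignores escaped braces such as `\{`, so it is a quick sanity check rather than a BibTeX parser.

```python
def brace_balance(text: str) -> int:
    """Return the count of unclosed '{' (negative if there are extra '}')."""
    depth = 0
    for ch in text:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
    return depth


# The booktitle line before and after the commit, copied from the diff.
old_line = (
    "booktitle={Proceedings of the Annual Conference of the "
    "International Speech Communication Association (INTERSPEECH)"
)
new_line = old_line + "},"

print(brace_balance(old_line))  # 1: one brace is left open
print(brace_balance(new_line))  # 0: braces are balanced
```

Running the same check over a whole entry should return 0 when every field is properly closed.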