Automatic Speech Recognition
ESPnet
multilingual
audio
phone-recognition
grapheme-to-phoneme
phoneme-to-grapheme
- Libraries
- ESPnet
How to use espnet/powsm with ESPnet:
    import soundfile  # required by soundfile.read below
    from espnet2.bin.asr_inference import Speech2Text

    model = Speech2Text.from_pretrained("espnet/powsm")

    speech, rate = soundfile.read("speech.wav")
    text, *_ = model(speech)[0]
- Notebooks
- Google Colab
- Kaggle
Update README.md
README.md
CHANGED
@@ -38,6 +38,7 @@ espnet_model_zoo
 > [!NOTE]
 > Jan 2026: We release a retrained version with improved ASR text normalization.
 > It is located in the subfolder `textnorm_retrained` and has the same structure as the main model.
+> Additional details are provided in the updated arXiv appendix.
 
 ### Example script for PR/ASR/G2P/P2G
 
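The note says the retrained model lives in the `textnorm_retrained` subfolder of the same repository. As a minimal sketch, assuming the `huggingface_hub` package is installed, the snippet below fetches only that subfolder with `snapshot_download`. The helper names `retrained_patterns` and `fetch_retrained` are hypothetical, and the files inside the subfolder are not listed here, so the sketch stops at returning the local directory; how to point ESPnet at it is left to the README's example script.

```python
from pathlib import Path

REPO_ID = "espnet/powsm"
SUBFOLDER = "textnorm_retrained"  # subfolder named in the README note


def retrained_patterns(subfolder: str = SUBFOLDER) -> list:
    """Glob patterns restricting a download to one subfolder (hypothetical helper)."""
    return [f"{subfolder}/*"]


def fetch_retrained(repo_id: str = REPO_ID, subfolder: str = SUBFOLDER) -> Path:
    """Download only the retrained subfolder and return its local path.

    Assumption: the subfolder mirrors the main model layout, as the note states.
    """
    # Imported lazily so the helpers above work without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    local_root = snapshot_download(
        repo_id=repo_id,
        allow_patterns=retrained_patterns(subfolder),
    )
    return Path(local_root) / subfolder
```

Calling `fetch_retrained()` needs network access and returns a directory inside the local Hugging Face cache; restricting `allow_patterns` avoids downloading the main model files alongside the retrained ones.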