Automatic Speech Recognition
ESPnet
multilingual
audio
phone-recognition
grapheme-to-phoneme
phoneme-to-grapheme
Instructions to use espnet/powsm with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- ESPnet
How to use espnet/powsm with ESPnet:
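    import soundfile
    from espnet2.bin.asr_inference import Speech2Text

    # Load the pretrained model from the Hugging Face Hub.
    model = Speech2Text.from_pretrained("espnet/powsm")

    # Read the waveform and take the top hypothesis.
    speech, rate = soundfile.read("speech.wav")
    text, *_ = model(speech)[0]
- Notebooks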
- Google Colab
- Kaggle
update README about textnorm_retrained
--- a/README.md
+++ b/README.md
@@ -35,6 +35,9 @@ espnet_model_zoo
 
 **The recipe can be found in ESPnet:** https://github.com/espnet/espnet/tree/master/egs2/powsm/s2t1
 
+> [!NOTE]
+> Jan 2026: We release a retrained version with improved ASR text normalization.
+> It is located in the subfolder `textnorm_retrained` and has the same structure as the main model.
 
 ### Example script for PR/ASR/G2P/P2G
 
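The note added in this commit says the retrained checkpoint lives in the `textnorm_retrained` subfolder of the same repository. One way to fetch only that subfolder is sketched below, assuming the `huggingface_hub` package is installed. The helper names `subfolder_patterns` and `fetch_retrained` are hypothetical and not part of ESPnet or the model card. The file names inside the subfolder are not listed on this page, so the sketch stops at returning the local directory.

```python
from pathlib import Path

REPO_ID = "espnet/powsm"           # repository named on this page
SUBFOLDER = "textnorm_retrained"   # subfolder named in the README note


def subfolder_patterns(subfolder: str) -> list[str]:
    """Return glob patterns that select only files under `subfolder` (hypothetical helper)."""
    return [f"{subfolder.strip('/')}/*"]


def fetch_retrained(repo_id: str = REPO_ID, subfolder: str = SUBFOLDER) -> Path:
    """Download only the given subfolder and return its local path (hypothetical helper)."""
    # Imported lazily so subfolder_patterns() works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    root = snapshot_download(
        repo_id=repo_id,
        allow_patterns=subfolder_patterns(subfolder),
    )
    return Path(root) / subfolder


if __name__ == "__main__":
    # Requires network access; prints where the retrained files were cached.
    print(fetch_retrained())
```

Since the note states that the subfolder has the same structure as the main model, the config and checkpoint files under the returned directory should presumably be usable wherever the main model's files are. This page does not confirm the exact loading call.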