Automatic Speech Recognition
ESPnet
multilingual
audio
phone-recognition
grapheme-to-phoneme
phoneme-to-grapheme
How to use espnet/powsm_ctc with ESPnet:
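```python
import soundfile
from espnet2.bin.asr_inference import Speech2Text

# Load the pre-trained model from the Hugging Face Hub.
model = Speech2Text.from_pretrained("espnet/powsm_ctc")

# Read the waveform and its sample rate.
speech, rate = soundfile.read("speech.wav")

# Take the best hypothesis; its first element is the recognized text.
text, *_ = model(speech)[0]
```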
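The README describes POWSM-CTC as an encoder-CTC model whose decoding is faster than that of encoder-decoder models. As a rough illustration of why CTC output is cheap to decode, here is a minimal sketch of greedy CTC collapsing in plain Python. It is not ESPnet's implementation: the function name `ctc_greedy_collapse`, the blank index `0`, and the sample frame ids are all assumptions made for the example.

```python
from itertools import groupby
from typing import List, Sequence

# Assumption: index 0 is the CTC blank symbol; the real vocabulary may differ.
BLANK_ID = 0


def ctc_greedy_collapse(frame_ids: Sequence[int], blank_id: int = BLANK_ID) -> List[int]:
    """Turn per-frame argmax ids into a label sequence.

    Consecutive repeats are merged first, then blank symbols are dropped.
    """
    merged = [token for token, _ in groupby(frame_ids)]
    return [token for token in merged if token != blank_id]


# Hypothetical per-frame argmax ids for a short utterance.
frames = [0, 5, 5, 0, 0, 7, 7, 7, 0, 5, 0]
print(ctc_greedy_collapse(frames))  # [5, 7, 5]
```

Every frame gets its label from a single forward pass of the encoder, so there is no token-by-token generation loop of the kind an autoregressive decoder needs.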
Update README.md
README.md
CHANGED
@@ -32,9 +32,12 @@ pipeline_tag: automatic-speech-recognition
 POWSM-CTC is a variant of [POWSM](https://huggingface.co/espnet/powsm), the first phonetic foundation model that can perform four phone-related tasks.
 Its multi-task encoder-CTC structure is based on [OWSM-CTC](https://aclanthology.org/2024.acl-long.549/), and trained on [IPAPack++](https://huggingface.co/anyspeech), the same dataset as POWSM.
 
-
+This model is proposed together with our paper [PRiSM](https://arxiv.org/abs/2601.14046), the first open-source benchmark for phone recognition systems.
 Its decoding is much faster than encoder-decoder models, with similar or enhanced PR performance on unseen domain.
 
+> [!TIP]
+> Check out POWSM-CTC's predecessor: [🐁POWSM](https://huggingface.co/espnet/powsm)
+
 To use the pre-trained model, please install `espnet` and `espnet_model_zoo`. The requirements are:
 ```
 torch