| title: "CTC-DRO MMS-based ASR model - set 2" | |
| language: multilingual | |
| tags: | |
| - asr | |
| - ctc-dro | |
| - MMS | |
| license: cc-by-nc-4.0 | |
# CTC-Baseline MMS-based ASR model - set 2

This repository contains a CTC-baseline MMS-based automatic speech recognition (ASR) model trained with ESPnet on balanced training data from set 2.

## Intended Use

This model is intended for ASR. Users can run inference with the provided checkpoint (`valid.loss.best.pth`) and configuration file (`config.yaml`):
```python
import soundfile as sf
from espnet2.bin.asr_inference import Speech2Text

asr_train_config = "ctc-baseline_mms_set_2/config.yaml"
asr_model_file = "ctc-baseline_mms_set_2/valid.loss.best.pth"

# Build the recognizer from the local config and checkpoint.
model = Speech2Text.from_pretrained(
    asr_train_config=asr_train_config,
    asr_model_file=asr_model_file
)

# Decode a waveform; model(speech) returns an n-best list whose
# entries start with the recognized text.
speech, _ = sf.read("input.wav")
text, *_ = model(speech)[0]
print("Recognized text:", text)
```
## How to Use

1. Clone this repository.
2. Use ESPnet’s inference scripts with the provided `config.yaml` and checkpoint file.
3. Ensure any external resources referenced in `config.yaml` are available at the indicated relative paths.
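Before running the inference snippet above, input audio generally needs to match the sample rate the model was trained on. Assuming this model expects 16 kHz mono speech (as the MMS backbone does; check `config.yaml` to confirm), a minimal preprocessing sketch with `scipy` could look like this. The `to_16k_mono` helper is hypothetical, not part of this repository:

```python
# Sketch: downmix and resample audio to 16 kHz mono before inference.
# Assumption: the model expects 16 kHz input, as MMS models typically do.
from math import gcd

import numpy as np
from scipy.signal import resample_poly

def to_16k_mono(speech: np.ndarray, rate: int) -> np.ndarray:
    """Downmix a (samples,) or (samples, channels) array to 16 kHz mono."""
    if speech.ndim == 2:  # soundfile returns (samples, channels) for stereo
        speech = speech.mean(axis=1)
    if rate != 16000:
        g = gcd(rate, 16000)  # reduce the resampling ratio
        speech = resample_poly(speech, 16000 // g, rate // g)
    return speech.astype(np.float32)

# Example: one second of 44.1 kHz audio becomes 16000 samples.
audio = np.zeros(44100, dtype=np.float32)
print(to_16k_mono(audio, 44100).shape)  # (16000,)
```

The resampled array can then be passed directly to `model(speech)` in place of the raw `sf.read` output.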