# Common Accent ASR Model

This is an ASR model fine-tuned from [espnet/owsm_v3.1_ebf_base](https://huggingface.co/espnet/owsm_v3.1_ebf_base) on the [DTU54DL/common-accent](https://huggingface.co/datasets/DTU54DL/common-accent) dataset.

## Model details

- Base model: espnet/owsm_v3.1_ebf_base
- Language: English
- Task: Automatic Speech Recognition

## Usage

```python
import torch
import numpy as np
from espnet2.bin.s2t_inference import Speech2Text

# Load the model
model = Speech2Text.from_pretrained(
    "reecursion/accent-adaptive-owsm_v3.1_ebf_base",
    lang_sym="<eng>",
    beam_size=1,
    device="cuda" if torch.cuda.is_available() else "cpu",
)

# Example inference
waveform = ...  # Replace with your audio as a 1-D NumPy array
transcription = model(waveform)
print(transcription[0][0])  # Print the transcription
```
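The usage snippet leaves audio loading open. As a minimal sketch, assuming the input is a 16-bit PCM WAV file and that the model expects 16 kHz mono audio (as OWSM models generally do), the waveform could be prepared as below. The helper name `load_wav_mono_16k` and the constant `TARGET_SR` are hypothetical, and the linear-interpolation resampling is a simple stand-in for a proper resampler, not part of the original recipe.

```python
import wave

import numpy as np

TARGET_SR = 16_000  # assumed input sample rate for the model


def load_wav_mono_16k(path, target_sr=TARGET_SR):
    """Read a 16-bit PCM WAV file into a mono float32 array at target_sr."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        sr = wf.getframerate()
        channels = wf.getnchannels()
        pcm = np.frombuffer(wf.readframes(wf.getnframes()), dtype="<i2")

    # Scale int16 samples to [-1.0, 1.0) and average the channels to mono.
    audio = pcm.astype(np.float32) / 32768.0
    audio = audio.reshape(-1, channels).mean(axis=1)

    if sr != target_sr:
        # Naive linear-interpolation resampling; no anti-aliasing filter.
        n_out = int(round(len(audio) * target_sr / sr))
        src_t = np.arange(len(audio)) / sr
        dst_t = np.arange(n_out) / target_sr
        audio = np.interp(dst_t, src_t, audio)

    return audio.astype(np.float32)
```

The returned array can be passed as `waveform` in the usage example, for instance `waveform = load_wav_mono_16k("sample.wav")`, where `sample.wav` is a placeholder path. For production use, a dedicated audio library with a filtered resampler is likely a better choice than the interpolation shown here.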