How to use KBLab/wav2vec2-large-voxrex with Transformers:

```python
# Use a pipeline as a high-level helper.
# Caution: this checkpoint has no CTC head (see the note below), so this
# speech-recognition pipeline is not expected to produce usable transcriptions.
from transformers import pipeline

pipe = pipeline("automatic-speech-recognition", model="KBLab/wav2vec2-large-voxrex")

# Load model directly
from transformers import AutoProcessor, AutoModelForPreTraining

processor = AutoProcessor.from_pretrained("KBLab/wav2vec2-large-voxrex")
model = AutoModelForPreTraining.from_pretrained("KBLab/wav2vec2-large-voxrex")
```
Wav2vec 2.0 large VoxRex (C)
Please note: The model hosted in this repository is a pretrained wav2vec2 model without a CTC head, so it cannot perform speech-to-text on its own. If you are interested in speech-to-text, see our fine-tuned version of this model at KBLab/wav2vec2-large-voxrex-swedish. The weights in this repository are from the pure acoustic model after unsupervised pretraining. This model is suitable for anyone interested in (i) continued wav2vec2 pretraining on their own unlabeled data, or (ii) using it as a feature extractor when fine-tuning for downstream tasks (e.g. training your own CTC head or an audio classifier).
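As an illustration of the feature-extractor use case, here is a minimal sketch rather than an official recipe. It makes several assumptions: that the repository ships a preprocessor config loadable with `AutoFeatureExtractor`, that audio is sampled at 16 kHz, and that the model uses the standard wav2vec 2.0 convolutional encoder layout (the kernel and stride values below are the usual defaults and are not read from this repository). The helper names `num_output_frames` and `extract_features` are hypothetical.

```python
# Standard wav2vec 2.0 feature-encoder layout (assumed, not read from this repo's config).
CONV_KERNELS = (10, 3, 3, 3, 3, 2, 2)
CONV_STRIDES = (5, 2, 2, 2, 2, 2, 2)


def num_output_frames(num_samples: int) -> int:
    """Estimate how many feature frames the conv encoder yields for a raw waveform."""
    length = num_samples
    for kernel, stride in zip(CONV_KERNELS, CONV_STRIDES):
        # Output length of an unpadded 1-D convolution.
        length = (length - kernel) // stride + 1
    return length


def extract_features(waveform, model_id="KBLab/wav2vec2-large-voxrex", sampling_rate=16_000):
    """Return contextual features of shape (batch, frames, hidden_size).

    `waveform` is a 1-D float array of mono audio. The 16 kHz default is an assumption.
    """
    # Imported lazily so the frame-count helper works without torch/transformers installed.
    import torch
    from transformers import AutoFeatureExtractor, AutoModel

    feature_extractor = AutoFeatureExtractor.from_pretrained(model_id)
    # AutoModel loads the bare encoder; the pretraining-only weights are skipped.
    model = AutoModel.from_pretrained(model_id)
    model.eval()

    inputs = feature_extractor(waveform, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.last_hidden_state
```

Calling `extract_features` on one second of 16 kHz audio should give roughly `num_output_frames(16000)` frames along the time axis. Those frame-level vectors are what a downstream CTC head or classifier would be trained on.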
Disclaimer: This is a work in progress.
Update 2022-01-08: Updated to the VoxRex-C version; use git to get the older (B) version.
Update 2022-05-16: Paper is here.
This model has been pretrained for 400,000 updates on the P4-10k corpus, which contains 10,000 hours of Swedish local public service radio as well as 1,500 hours of audiobooks and other speech from KB's collections.