How to use nguyenvulebinh/wav2vec2-noisy with Transformers:
# Load model directly
from transformers import AutoProcessor, AutoModelForPreTraining
processor = AutoProcessor.from_pretrained("nguyenvulebinh/wav2vec2-noisy")
model = AutoModelForPreTraining.from_pretrained("nguyenvulebinh/wav2vec2-noisy")

This is the base model, pretrained on augmented speech audio sampled at 16 kHz. The audio comes from the 960-hour LibriSpeech dataset, augmented with ambient noise and reverberation:
The ambient noise data comes from MUSAN and WHAM (189 hours in total, covering music, speech, and environmental noise). The reverberation data comes from Room RIR and BUT Speech@FIT (2650 room impulse response signals).
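The two kinds of augmentation described above can be illustrated with a minimal NumPy sketch. This is not the pipeline used to train the model: the card does not state the SNR range, the mixing order, or how RIRs were applied, so the helper names (`add_noise_at_snr`, `add_reverb`), the 10 dB SNR, and the synthetic signals below are all assumptions chosen for illustration.

```python
import numpy as np


def add_noise_at_snr(speech, noise, snr_db):
    """Mix noise into speech at a target signal-to-noise ratio in dB (hypothetical helper)."""
    # Tile the noise if it is shorter than the speech, then trim to the same length.
    if len(noise) < len(speech):
        reps = int(np.ceil(len(speech) / len(noise)))
        noise = np.tile(noise, reps)
    noise = noise[: len(speech)]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2) + 1e-12  # guard against all-zero noise

    # Scale the noise so that speech_power / scaled_noise_power == 10 ** (snr_db / 10).
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + scale * noise


def add_reverb(speech, rir):
    """Convolve speech with a room impulse response, keeping the input length (hypothetical helper)."""
    rir = rir / (np.sqrt(np.sum(rir ** 2)) + 1e-12)  # normalize the RIR to unit energy
    return np.convolve(speech, rir)[: len(speech)]


# Synthetic stand-ins for real data (assumptions, not samples from MUSAN, WHAM, or BUT Speech@FIT).
rng = np.random.default_rng(0)
sample_rate = 16000  # the model was pretrained on 16 kHz audio
t = np.arange(sample_rate) / sample_rate
speech = 0.1 * np.sin(2 * np.pi * 220 * t)          # 1 s tone standing in for speech
noise = rng.standard_normal(sample_rate // 2)        # 0.5 s of white noise
rir = np.exp(-np.arange(800) / 100.0) * rng.standard_normal(800)  # decaying synthetic RIR

# Apply reverberation first, then add noise at an assumed 10 dB SNR.
augmented = add_noise_at_snr(add_reverb(speech, rir), noise, snr_db=10.0)
```

With real data, `speech`, `noise`, and `rir` would be mono waveforms loaded at 16 kHz from the corresponding datasets; the order of the two steps here is a design choice of this sketch, not something the card specifies.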
The model parameters are made available for non-commercial use only under the terms of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license. You can find details at: https://creativecommons.org/licenses/by-nc/4.0/legalcode