Instructions to use microsoft/unispeech-sat-large-sd with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use microsoft/unispeech-sat-large-sd with Transformers:
```python
# Load model directly
from transformers import AutoProcessor, AutoModelForAudioFrameClassification

processor = AutoProcessor.from_pretrained("microsoft/unispeech-sat-large-sd")
model = AutoModelForAudioFrameClassification.from_pretrained("microsoft/unispeech-sat-large-sd")
```
- Notebooks
- Google Colab
- Kaggle
How to use microsoft/unispeech-sat-large-sd for diarization
#1
by shripadbhat - opened
I would like to use the model to diarize an audio file and get timestamps for each speaker. How can I achieve that?
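One possible approach, as a minimal sketch rather than an official recipe: the frame-classification head returns one logit per speaker for each output frame, so you can threshold those and merge consecutive active frames into timestamped segments. The helper names `predict_frame_labels` and `frames_to_segments` are hypothetical. The 0.5 threshold and the 20 ms frame step (a 320-sample stride at 16 kHz) are assumptions you should check against the model config.

```python
def predict_frame_labels(waveform, sampling_rate=16000):
    """Return per-frame 0/1 speaker activity for a mono waveform (list or 1-D array).

    Imports are local so the segment helper below works without torch installed.
    """
    import torch
    from transformers import AutoProcessor, AutoModelForAudioFrameClassification

    name = "microsoft/unispeech-sat-large-sd"
    processor = AutoProcessor.from_pretrained(name)
    model = AutoModelForAudioFrameClassification.from_pretrained(name)
    model.eval()

    inputs = processor(waveform, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():  # inference only, no gradients needed
        logits = model(**inputs).logits  # shape: (1, num_frames, num_speakers)

    # Assumption: 0.5 on the sigmoid output is a reasonable activity threshold.
    probabilities = torch.sigmoid(logits[0])
    return (probabilities > 0.5).long().tolist()


def frames_to_segments(labels, frame_duration=0.02):
    """Merge consecutive active frames into (speaker, start_sec, end_sec) tuples.

    labels: list of frames, each a list of 0/1 flags, one flag per speaker.
    frame_duration: seconds per output frame (assumed 20 ms here).
    """
    segments = []
    if not labels:
        return segments
    for speaker in range(len(labels[0])):
        start = None
        for i, frame in enumerate(labels):
            active = bool(frame[speaker])
            if active and start is None:
                start = i  # a segment opens on this frame
            elif not active and start is not None:
                segments.append((speaker, round(start * frame_duration, 3),
                                 round(i * frame_duration, 3)))
                start = None
        if start is not None:  # segment still open at the end of the audio
            segments.append((speaker, round(start * frame_duration, 3),
                             round(len(labels) * frame_duration, 3)))
    segments.sort(key=lambda seg: seg[1])  # order by start time
    return segments
```

You would call `frames_to_segments(predict_frame_labels(waveform))` on 16 kHz mono audio. Overlapping speech shows up as segments from different speakers that overlap in time.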
I've tried it even on small audio files and always get a CUDA out-of-memory (OOM) error. I think it may be scientifically sound, but it is not practical to use.
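If memory is the blocker, one thing that may help (a sketch under assumptions, not a confirmed fix for this error) is to run inference without gradient tracking and in fixed-length chunks, since attention memory grows with input length. The helper names `chunk_ranges` and `predict_in_chunks` are hypothetical, and the 10-second chunk size and 400-sample minimum tail are assumed values to tune for your GPU.

```python
def chunk_ranges(num_samples, sampling_rate=16000, chunk_seconds=10.0, min_samples=400):
    """Split [0, num_samples) into consecutive (start, end) sample ranges.

    A final range shorter than min_samples is merged into the previous one,
    on the assumption that very short inputs are too small for the model's
    convolutional front end.
    """
    chunk_len = int(chunk_seconds * sampling_rate)
    if chunk_len <= 0:
        raise ValueError("chunk_seconds * sampling_rate must be positive")
    ranges = [(start, min(start + chunk_len, num_samples))
              for start in range(0, num_samples, chunk_len)]
    if len(ranges) > 1 and ranges[-1][1] - ranges[-1][0] < min_samples:
        last_start, last_end = ranges.pop()
        ranges[-1] = (ranges[-1][0], last_end)  # absorb the short tail
    return ranges


def predict_in_chunks(waveform, processor, model, sampling_rate=16000, chunk_seconds=10.0):
    """Run the model chunk by chunk and concatenate per-frame 0/1 labels."""
    import torch  # local import so chunk_ranges works without torch installed

    model.eval()
    all_labels = []
    for start, end in chunk_ranges(len(waveform), sampling_rate, chunk_seconds):
        inputs = processor(waveform[start:end], sampling_rate=sampling_rate,
                           return_tensors="pt")
        inputs = {key: value.to(model.device) for key, value in inputs.items()}
        with torch.no_grad():  # avoids storing activations for backprop
            logits = model(**inputs).logits
        # Move results to CPU right away so GPU memory is released per chunk.
        all_labels.extend((torch.sigmoid(logits[0]) > 0.5).long().cpu().tolist())
    return all_labels
```

Frame timing across chunk boundaries is only approximate with this scheme, so treat segment edges near a boundary with some caution.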