Instructions to use microsoft/VibeVoice-ASR-HF with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use microsoft/VibeVoice-ASR-HF with Transformers:
# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("microsoft/VibeVoice-ASR-HF", dtype="auto")
- Notebooks
- Google Colab
- Kaggle
File size: 537 Bytes
{
  "audio_bos_token": "<|object_ref_start|>",
  "audio_duration_token": "<|AUDIO_DURATION|>",
  "audio_eos_token": "<|object_ref_end|>",
  "audio_token": "<|box_start|>",
  "feature_extractor": {
    "eps": 1e-06,
    "feature_extractor_type": "VibeVoiceAcousticTokenizerFeatureExtractor",
    "feature_size": 1,
    "normalize_audio": true,
    "padding_side": "right",
    "padding_value": 0.0,
    "return_attention_mask": true,
    "sampling_rate": 24000,
    "target_dB_FS": -25
  },
  "processor_class": "VibeVoiceAsrProcessor"
}
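The feature-extractor fields in this config can be read as a small preprocessing recipe: mono audio at 24 kHz, loudness-normalized toward -25 dBFS, right-padded with zeros, and returned with an attention mask. The sketch below is only an illustration of that reading and is not the library's implementation. It assumes that `normalize_audio` means RMS-based gain scaling with `eps` guarding against division by zero, and that the attention mask marks real samples with 1 and padding with 0. The helpers `normalize_rms` and `pad_right` are hypothetical names introduced here.

```python
import json
import math

# The processor config from this file, embedded so the sketch is self-contained.
CONFIG_JSON = """
{
  "audio_bos_token": "<|object_ref_start|>",
  "audio_duration_token": "<|AUDIO_DURATION|>",
  "audio_eos_token": "<|object_ref_end|>",
  "audio_token": "<|box_start|>",
  "feature_extractor": {
    "eps": 1e-06,
    "feature_extractor_type": "VibeVoiceAcousticTokenizerFeatureExtractor",
    "feature_size": 1,
    "normalize_audio": true,
    "padding_side": "right",
    "padding_value": 0.0,
    "return_attention_mask": true,
    "sampling_rate": 24000,
    "target_dB_FS": -25
  },
  "processor_class": "VibeVoiceAsrProcessor"
}
"""

config = json.loads(CONFIG_JSON)
fe = config["feature_extractor"]


def rms(samples):
    """Root-mean-square amplitude of a list of float samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def normalize_rms(samples, target_db_fs, eps):
    """ASSUMPTION: scale the waveform so its RMS level reaches target_db_fs."""
    gain = 10 ** (target_db_fs / 20) / (rms(samples) + eps)
    return [x * gain for x in samples]


def pad_right(samples, length, padding_value):
    """ASSUMPTION: right-pad to `length`; mask is 1 for real samples, 0 for padding."""
    missing = max(0, length - len(samples))
    padded = list(samples) + [padding_value] * missing
    mask = [1] * len(samples) + [0] * missing
    return padded, mask


# One second of a 440 Hz test tone at the configured sampling rate.
sr = fe["sampling_rate"]
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]

normalized = normalize_rms(tone, fe["target_dB_FS"], fe["eps"])
level_db = 20 * math.log10(rms(normalized))

# Pad the normalized tone to 1.5 seconds, as a batch with a longer clip might require.
padded, mask = pad_right(normalized, sr + sr // 2, fe["padding_value"])

print(f"level after normalization: {level_db:.2f} dBFS")
print(f"padded length: {len(padded)}, real samples: {sum(mask)}")
```

In practice the processor named in `processor_class` would handle these steps; the sketch is meant only to make the numeric fields concrete.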