frontierai committed on
Commit a7b3b7f · verified · 1 Parent(s): 53552c2

Update README.md

Files changed (1)
  1. README.md +6 -7
README.md CHANGED
@@ -64,17 +64,16 @@ library_name: transformers
 
  ## VibeVoice-ASR
  [![GitHub](https://img.shields.io/badge/GitHub-Repo-black?logo=github)](https://github.com/microsoft/VibeVoice)
- [![Live Playground](https://img.shields.io/badge/Live-Playground-green?logo=gradio)](https://aka.ms/vibevoice-asr
- [![Technical Report](https://arxiv.org/pdf/2601.18184)]
- [![Finetuning](https://github.com/microsoft/VibeVoice/blob/main/finetuning-asr/README.md)]
+ [![Live Playground](https://img.shields.io/badge/Live-Playground-green?logo=gradio)](https://aka.ms/vibevoice-asr)
+ [![Technical Report](https://img.shields.io/badge/arXiv-2601.18184-b31b1b?logo=arxiv)](https://arxiv.org/pdf/2601.18184)
 
  **VibeVoice-ASR** is a unified speech-to-text model designed to handle **60-minute long-form audio** in a single pass, generating structured transcriptions containing **Who (Speaker), When (Timestamps), and What (Content)**, with support for **Customized Hotwords** and over **50 languages**.
 
  ➡️ **Code:** [microsoft/VibeVoice](https://github.com/microsoft/VibeVoice)<br>
- ➡️ **Demo:** [VibeVoice-ASR-Demo](https://aka.ms/vibevoice-asr)
- ➡️ **Report:** [VibeVoice-ASR Technical Report](https://arxiv.org/pdf/2601.18184)
- ➡️ **Finetuning:** [Finetuning](https://github.com/microsoft/VibeVoice/blob/main/finetuning-asr/README.md)
- ➡️ **vLLM:** [vLLM-VibeVoice-ASR](https://github.com/microsoft/VibeVoice/blob/main/docs/vibevoice-vllm-asr.md)
+ ➡️ **Demo:** [VibeVoice-ASR-Demo](https://aka.ms/vibevoice-asr)<br>
+ ➡️ **Report:** [VibeVoice-ASR Technical Report](https://arxiv.org/pdf/2601.18184)<br>
+ ➡️ **Finetuning:** [Finetuning](https://github.com/microsoft/VibeVoice/blob/main/finetuning-asr/README.md)<br>
+ ➡️ **vLLM:** [vLLM-VibeVoice-ASR](https://github.com/microsoft/VibeVoice/blob/main/docs/vibevoice-vllm-asr.md)<br>
 
  <p align="left">
  <img src="figures/VibeVoice_ASR_archi.png" alt="VibeVoice-ASR Architecture" height="250px">
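
Review note: the removed badge lines were not in the complete linked-image form `[![alt](image-url)](target-url)`. One was missing its closing parenthesis, and the others had no link target after the image. The sketch below shows one way such lines could be caught before committing. It is not part of the repository: the regex and the helper name `is_well_formed_badge` are assumptions made for illustration, and the pattern only covers URLs without spaces or parentheses.

```python
import re

# Hypothetical checker, not part of the VibeVoice repository.
# Matches a complete linked-image badge: [![alt](image-url)](target-url)
# Assumption: URLs contain no whitespace and no parentheses.
BADGE = re.compile(r"^\[!\[[^\]]*\]\([^()\s]+\)\]\([^()\s]+\)$")


def is_well_formed_badge(line: str) -> bool:
    """Return True if the stripped line is a complete linked-image badge."""
    return BADGE.match(line.strip()) is not None


# Two of the badge lines removed by this commit.
OLD_BADGES = [
    "[![Live Playground](https://img.shields.io/badge/Live-Playground-green?logo=gradio)](https://aka.ms/vibevoice-asr",
    "[![Technical Report](https://arxiv.org/pdf/2601.18184)]",
]

# The badge lines added by this commit.
NEW_BADGES = [
    "[![Live Playground](https://img.shields.io/badge/Live-Playground-green?logo=gradio)](https://aka.ms/vibevoice-asr)",
    "[![Technical Report](https://img.shields.io/badge/arXiv-2601.18184-b31b1b?logo=arxiv)](https://arxiv.org/pdf/2601.18184)",
]

if __name__ == "__main__":
    for line in OLD_BADGES + NEW_BADGES:
        status = "ok" if is_well_formed_badge(line) else "malformed"
        print(f"{status}: {line}")
```

A check like this could run over only the lines that start with `[![`, so ordinary links and prose in the README are left alone.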