ASLP-lab committed on
Commit 71786c9 · verified · 1 Parent(s): 17193f3

Update README.md

Files changed (1)
  1. README.md +8 -21
README.md CHANGED
@@ -1,31 +1,18 @@
1
- ![WenetSpeech-Pipe Overview Image](https://huggingface.co/username/model_name/resolve/main/my_image.png)
2
 
3
- ## 👉🏻 CosyVoice 👈🏻
4
- **CosyVoice 2.0**: [Demos](https://funaudiollm.github.io/cosyvoice2/); [Paper](https://arxiv.org/abs/2412.10117); [Modelscope](https://www.modelscope.cn/studios/iic/CosyVoice2-0.5B); [HuggingFace](https://huggingface.co/spaces/FunAudioLLM/CosyVoice2-0.5B)
 
5
 
6
 
7
  ## Highlight🔥
8
 
9
  **CosyVoice 2.0** has been released! Compared to version 1.0, the new version offers more accurate, more stable, faster, and better speech generation capabilities.
10
- ### Multilingual
11
- - **Supported Language**: Chinese, English, Japanese, Korean, Chinese dialects (Cantonese, Sichuanese, Shanghainese, Tianjinese, Wuhanese, etc.)
12
- - **Crosslingual & Mixlingual**:Support zero-shot voice cloning for cross-lingual and code-switching scenarios.
13
- ### Ultra-Low Latency
14
- - **Bidirectional Streaming Support**: CosyVoice 2.0 integrates offline and streaming modeling technologies.
15
- - **Rapid First Packet Synthesis**: Achieves latency as low as 150ms while maintaining high-quality audio output.
16
- ### High Accuracy
17
- - **Improved Pronunciation**: Reduces pronunciation errors by 30% to 50% compared to CosyVoice 1.0.
18
- - **Benchmark Achievements**: Attains the lowest character error rate on the hard test set of the Seed-TTS evaluation set.
19
- ### Strong Stability
20
- - **Consistency in Timbre**: Ensures reliable voice consistency for zero-shot and cross-language speech synthesis.
21
- - **Cross-language Synthesis**: Marked improvements compared to version 1.0.
22
- ### Natural Experience
23
- - **Enhanced Prosody and Sound Quality**: Improved alignment of synthesized audio, raising MOS evaluation scores from 5.4 to 5.53.
24
- - **Emotional and Dialectal Flexibility**: Now supports more granular emotional controls and accent adjustments.
25
 
26
  ## Roadmap
27
 
28
- - [x] 2024/12
29
 
30
  - [x] 25hz cosyvoice 2.0 released
31
 
@@ -89,5 +76,5 @@ for i, j in enumerate(cosyvoice.inference_instruct2('收到好友从远方寄来
89
  torchaudio.save('instruct_{}.wav'.format(i), j['tts_speech'], cosyvoice.sample_rate)
90
  ```
91
 
92
- ## Disclaimer
93
- The content provided above is for academic purposes only and is intended to demonstrate technical capabilities. Some examples are sourced from the internet. If any content infringes on your rights, please contact us to request its removal.
 
1
+ ![WenetSpeech-Yue](https://huggingface.co/datasets/ASLP-lab/WenetSpeech-Yue/resolve/main/wenetspeech_pipe.svg)
2
 
3
+
4
+ ## 👉🏻 WenetSpeech-Yue 👈🏻
5
+ **WenetSpeech-Yue**: [Demos](https://aslp-lab.github.io/WenetSpeech-Yue/); [Paper](https://arxiv.org/abs/2509.03959); [Github](https://github.com/ASLP-lab/WenetSpeech-Yue); [HuggingFace](https://huggingface.co/datasets/ASLP-lab/WenetSpeech-Yue)
6
 
7
 
8
  ## Highlight🔥
9
 
10
  **CosyVoice 2.0** has been released! Compared to version 1.0, the new version offers more accurate, more stable, faster, and better speech generation capabilities.
11
+
12
 
13
  ## Roadmap
14
 
15
+ - [x] 2025/9
16
 
17
  - [x] 25hz cosyvoice 2.0 released
18
 
 
76
  torchaudio.save('instruct_{}.wav'.format(i), j['tts_speech'], cosyvoice.sample_rate)
77
  ```
78
 
79
+ ## Contact
80
+ If you are interested in leaving a message to our research team, feel free to email lhli@mail.nwpu.edu.cn or gzhao@mail.nwpu.edu.cn.