Metacebertrunk committed on
Commit f72431c · verified
1 Parent(s): 3ec18ce

Upload project info

Files changed (1)
  1. README.md +0 -8
README.md CHANGED
@@ -19,9 +19,6 @@
 
  </p>
 
- <p align="center">
- <a href="https://arxiv.org/pdf/2512.20151"><img src="QuarkAudio.jpg" width="70%" /></a>
- </p>
 
  ## Introduction
  This project contains a series of works on audio processing and generation (covering speech, music, and general audio events), intended to support reproducible research in the audio field. The goal of **QuarkAudio** is to explore a unified framework that handles **different audio processing and generation tasks**, including:
@@ -56,8 +53,3 @@ In addition to the frameworks for specific audio tasks, **QuarkAudio** also prov
  - **2025/12/24**: We release [***QuarkAudio***](https://github.com/alibaba/unified-audio), an Open-Source Project to Unify Audio Processing and Generation. [![arXiv](https://img.shields.io/badge/arXiv-Paper-COLOR.svg)](https://arxiv.org/pdf/2512.20151) The code is publicly available at [QuarkAudio-HCodec](https://github.com/alibaba/unified-audio/tree/main/QuarkAudio-HCodec), along with pretrained models and inference examples.
  - **2025/10/26**: We release [***UniTok-Audio***](https://github.com/alibaba/unified-audio). The system supports target speaker extraction, universal speech enhancement, speech restoration, voice conversion, language-queried audio source separation, and audio tokenization. [***demo***](https://alibaba.github.io/unified-audio/), [![arXiv](https://img.shields.io/badge/arXiv-Paper-COLOR.svg)](https://arxiv.org/abs/2510.26372) Code is coming soon.
  - **2025/09/22**: We release [***UniSE***](https://github.com/hyyan2k/UniSE), a foundation model for unified speech generation. The system supports target speaker extraction and universal speech enhancement. [***demo***](https://hyyan2k.github.io/UniSE/), [![arXiv](https://img.shields.io/badge/arXiv-Paper-COLOR.svg)](https://arxiv.org/abs/2510.20441) The code is publicly available at [QuarkAudio-UniSE](https://github.com/alibaba/unified-audio/tree/main/QuarkAudio-UniSE), along with pretrained models and inference examples.
- ## key Works
- ### UniSE
- [UniSE](https://github.com/alibaba/unified-audio/tree/main/QuarkAudio-UniSE): A Unified Framework for Decoder-Only Autoregressive LM-Based Speech Enhancement[![arXiv](https://img.shields.io/badge/arXiv-Paper-COLOR.svg)](https://arxiv.org/abs/2510.20441)
- supported tasks: **SR**, **TSE**, **SS**
-
 