nielsr HF Staff committed on
Commit d3c4965 · verified · 1 Parent(s): 13787d9

Add pipeline tag

This PR adds the `audio-to-audio` pipeline tag to the model metadata. This ensures the model is correctly categorized and discoverable under audio tasks on the Hugging Face Hub. It also maintains the existing documentation, including benchmarks and usage examples.
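
As a rough illustration of the metadata change described above, the sketch below sets a `pipeline_tag` key in a model card's YAML front matter. The helper name `add_pipeline_tag` and the sample card text are hypothetical and not part of this commit; the sketch uses plain text handling rather than a YAML parser, and assumes the front matter is delimited by `---` lines at the very top of the file.

```python
import re

# Hypothetical helper (not part of this commit): set `pipeline_tag` in the
# YAML front matter of a model card given as a string.
FRONT_MATTER = re.compile(r"\A---\n(.*?)^---\n", flags=re.DOTALL | re.MULTILINE)


def add_pipeline_tag(card_text: str, tag: str) -> str:
    """Return card_text with `pipeline_tag: <tag>` as the last front-matter key."""
    match = FRONT_MATTER.match(card_text)
    if match is None:
        # No front matter found: create one that holds only the tag.
        return f"---\npipeline_tag: {tag}\n---\n{card_text}"
    # Drop any existing top-level pipeline_tag so the key is not duplicated.
    lines = [
        line for line in match.group(1).splitlines()
        if not line.startswith("pipeline_tag:")
    ]
    lines.append(f"pipeline_tag: {tag}")
    body = card_text[match.end():]
    return "---\n" + "\n".join(lines) + "\n---\n" + body


# Illustrative card text, loosely modeled on the README front matter in this PR.
sample_card = "---\nlanguage:\n- en\n- zh\ntags:\n- audio\n- music\n---\n# Title\n"
print(add_pipeline_tag(sample_card, "audio-to-audio"))
```

Appending the key after the `tags` list keeps it at the top level of the YAML mapping, which matches where this PR places `pipeline_tag` in README.md.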

Files changed (1)
  1. README.md +8 -6
README.md CHANGED
@@ -1,9 +1,9 @@
 ---
-license: other
-license_name: license-term-of-universal-audio-tokenizer
 language:
 - en
 - zh
+license: other
+license_name: license-term-of-universal-audio-tokenizer
 tags:
 - audio
 - audio-tokenizer
@@ -11,11 +11,12 @@ tags:
 - speech
 - sound
 - music
+pipeline_tag: audio-to-audio
 ---
+
 # Universal Audio Tokenizer: Empowering Semantic Speech Tokenizers with General Audio Perception
 
-**Universal Audio Tokenizer** is a compact single-codebook audio tokenizer that unifies general audio perception and
-linguistic alignment for downstream Audio-LLMs.
+**Universal Audio Tokenizer** (UniAudio-Token) is a compact single-codebook audio tokenizer that unifies general audio perception and linguistic alignment for downstream Audio-LLMs.
 
 📄 [Paper](https://arxiv.org/abs/2605.31521) | 💻 [GitHub](https://github.com/Tencent/Universal_Audio_Tokenizer)
 
@@ -108,6 +109,7 @@ Also, you can directly run the inference code snippet below:
 ```python
 import os
 import torch
+from huggingface_hub import snapshot_download
 from transformers import WhisperFeatureExtractor
 from src.model.modeling_whisper import WhisperVQEncoder
 from src.model.flow_inference import AudioDecoder
@@ -167,7 +169,7 @@ Our Universal Audio Tokenizer achieves high-quality speech reconstruction with a
 
 ### Superior Downstream Audio-LLM Performance
 
-When integrated with the Qwen2.5 LLM backbone, our Universal Audio Tokenizer yields superior performance on a wide range of downstream audio understanding benchmarks and controllable TTS synthesis tasks, demonstrating its effectiveness as a unified audio input/output interface for Audio-LLMs.
+When integrated with the Qwen2.5 LLM backbone, our Universal Audio Tokenizer yields superior performance on a wide range of downstream audio understanding benchmarks and controllable TTS synthesis tasks.
 
 #### Audio Understanding
 
@@ -208,4 +210,4 @@ If you find our code or model useful for your research, please cite:
 
 ## License
 
-This project is licensed under the [License Term of Universal_Audio_Tokenizer](LICENSE).
+This project is licensed under the [License Term of Universal_Audio_Tokenizer](LICENSE).