yongjielv committed on
Commit 07288e3 · verified · 1 Parent(s): 97228a0

Update README.md

Files changed (1)
  1. README.md +6 -4
README.md CHANGED
@@ -1,5 +1,7 @@
 
 
  ## Key Features
- - 🚀 **Unified Representation:** A single semantic-acoustic unified representation for both understanding and generation tasks.
  - 🎧 **High-Fidelity Reconstruction:** Achieve high-fidelity audio generation by modeling continuous features with a VAE, minimizing information loss and preserving intricate acoustic textures.
  - 🌐 **Convolution-Free Efficiency:** Built on a pure causal transformer architecture, completely eliminating convolutional layers for superior efficiency and a simpler design.
 
@@ -124,7 +126,7 @@ torchaudio.save('./1089-134686-0000_reconstruct.wav', output_waveform.cpu()[0],
  <td align="center">0.91</td>
  </tr>
  <tr>
- <td align="left"><strong>Ming-UniAudio-Tokenizer(ours)</td>
  <td align="center">50</td>
  <td align="center"><b>4.21</b></td>
  <td align="center"><b>0.96</b></td>
@@ -189,7 +191,7 @@ torchaudio.save('./1089-134686-0000_reconstruct.wav', output_waveform.cpu()[0],
  <td>31.73</td>
  </tr>
  <tr>
- <td><strong>Ming-UniAudio(ours)</td>
  <td>2.84</td>
  <td>1.62</td>
  <td><strong>9.80</strong></td>
@@ -251,7 +253,7 @@ torchaudio.save('./1089-134686-0000_reconstruct.wav', output_waveform.cpu()[0],
  <td align="center">0.51</td>
  </tr>
  <tr>
- <td align="left"><strong>Ming-UniAudio(ours)</td>
  <td align="center"><b>0.95</b></td>
  <td align="center">0.70</td>
  <td align="center">1.85</td>
 
+ <p align="center">📑 <a href="">Technical Report</a>｜📖<a href="https://xqacmer.github.io/Ming-Unitok-Audio.github.io">Project Page</a> ｜🤗 <a href="https://huggingface.co/inclusionAI/MingTok-Audio">Hugging Face</a>| 🤖 <a href="https://modelscope.cn/models/inclusionAI/MingTok-Audio">ModelScope</a>
+
  ## Key Features
+ - 🚀 **Unified Representation:** A single semantic-acoustic unified continuous representation for both understanding and generation tasks.
  - 🎧 **High-Fidelity Reconstruction:** Achieve high-fidelity audio generation by modeling continuous features with a VAE, minimizing information loss and preserving intricate acoustic textures.
  - 🌐 **Convolution-Free Efficiency:** Built on a pure causal transformer architecture, completely eliminating convolutional layers for superior efficiency and a simpler design.
 
 
  <td align="center">0.91</td>
  </tr>
  <tr>
+ <td align="left"><strong>MingTok-Audio(ours)</td>
  <td align="center">50</td>
  <td align="center"><b>4.21</b></td>
  <td align="center"><b>0.96</b></td>
 
  <td>31.73</td>
  </tr>
  <tr>
+ <td><strong>Ming-UniAudio-16A3B(ours)</td>
  <td>2.84</td>
  <td>1.62</td>
  <td><strong>9.80</strong></td>
 
  <td align="center">0.51</td>
  </tr>
  <tr>
+ <td align="left"><strong>Ming-UniAudio-16A3B(ours)</td>
  <td align="center"><b>0.95</b></td>
  <td align="center">0.70</td>
  <td align="center">1.85</td>