first commit
README.md
CHANGED
@@ -10,7 +10,9 @@ tags:
**Auffusion** is a latent diffusion model (LDM) for text-to-audio (TTA) generation. It can generate realistic audio, including human sounds, animal sounds, natural and artificial sounds, and sound effects, from textual prompts. Auffusion adapts text-to-image (T2I) model frameworks to the TTA task, leveraging their inherent generative strengths and precise cross-modal alignment. Our objective and subjective evaluations demonstrate that Auffusion surpasses previous TTA approaches while using limited data and computational resources. We release our model, inference code, and pre-trained checkpoints for the research community.

📣 We are releasing **Auffusion-Full-no-adapter**, which was pre-trained on all datasets described in the paper and is intended to make audio manipulation easier.
+
📣 We are releasing **Auffusion-Full**, which was pre-trained on all datasets described in the paper.
+
📣 We are releasing **Auffusion**, which was pre-trained on **AudioCaps**.

## Auffusion Model Family