AudioLDM: Text-to-Audio Generation with Latent Diffusion Models
Paper • arXiv:2301.12503 • Published
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("haoheliu/AudioLDM-S-Full", torch_dtype=torch.float16)
# switch to "mps" for Apple devices
pipe = pipe.to("cuda")

prompt = "Techno music with a strong, upbeat tempo and high melodic riffs"
audio = pipe(prompt).audios[0]

Generate any audio from text using your imagination.
https://huggingface.co/spaces/haoheliu/audioldm-text-to-audio-generation
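To listen to the result, the returned waveform can be written to a WAV file. Below is a minimal standard-library sketch, assuming AudioLDM's 16 kHz mono output; the synthesized sine wave stands in for the `pipe(prompt).audios[0]` array so the snippet runs without the model:

```python
import math
import struct
import wave

# AudioLDM produces 16 kHz mono waveforms as floats in [-1, 1].
SAMPLE_RATE = 16000

# Placeholder for the pipeline output: one second of a 440 Hz sine tone.
audio = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(SAMPLE_RATE)]

with wave.open("output.wav", "wb") as f:
    f.setnchannels(1)            # mono
    f.setsampwidth(2)            # 16-bit PCM
    f.setframerate(SAMPLE_RATE)
    # scale floats to signed 16-bit integers and pack little-endian
    pcm = struct.pack(f"<{len(audio)}h", *(int(s * 32767) for s in audio))
    f.writeframes(pcm)
```

Any WAV-capable player (or `scipy.io.wavfile.write`, if SciPy is available) works equally well here.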
BibTeX:
@article{liu2023audioldm,
title={AudioLDM: Text-to-Audio Generation with Latent Diffusion Models},
author={Liu, Haohe and Chen, Zehua and Yuan, Yi and Mei, Xinhao and Liu, Xubo and Mandic, Danilo and Wang, Wenwu and Plumbley, Mark D},
journal={arXiv preprint arXiv:2301.12503},
year={2023}
}