Instructions for using cvssp/audioldm with libraries, inference providers, notebooks, and local apps. Follow the links below to get started.
- Libraries
  - Diffusers
How to use cvssp/audioldm with Diffusers:
```bash
pip install -U diffusers transformers accelerate
```
```python
import torch
from diffusers import DiffusionPipeline

# AudioLDM is a text-to-audio model; switch "cuda" to "mps" for Apple devices
pipe = DiffusionPipeline.from_pretrained("cvssp/audioldm", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

prompt = "Techno music with a strong, upbeat tempo and high melodic riffs"
audio = pipe(prompt, num_inference_steps=10, audio_length_in_s=5.0).audios[0]
```
The pipeline returns the waveform as a NumPy array rather than an image; a sketch showing how to write it to a WAV file follows the notebook links below.
- Notebooks
  - Google Colab
  - Kaggle
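Once the pipeline has run, the generated waveform can be written to disk. A minimal self-contained sketch, assuming scipy is installed and a CUDA device is available; it uses AudioLDMPipeline, the concrete class DiffusionPipeline resolves to for this repo, and the prompt and the output filename techno.wav are purely illustrative:

```python
import scipy.io.wavfile
import torch
from diffusers import AudioLDMPipeline

pipe = AudioLDMPipeline.from_pretrained("cvssp/audioldm", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

# Generate a 5-second clip; AudioLDM produces mono audio at 16 kHz.
audio = pipe(
    "Techno music with a strong, upbeat tempo and high melodic riffs",
    num_inference_steps=10,
    audio_length_in_s=5.0,
).audios[0]

# Write the float waveform as a WAV file at the model's 16 kHz sample rate.
scipy.io.wavfile.write("techno.wav", rate=16000, data=audio)
```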
Question regarding the pre-trained checkpoint
#1
opened by soujanyaporia
Hi,
Excellent work! Thanks. Can I ask how the checkpoints shared in this repo were obtained? Did you follow the complete process described in the paper?
Also, could you let me know if there is training code available?
Thanks.
-Soujanya
Hey @soujanyaporia - this checkpoint was obtained by converting the weights from the official checkpoint (under ckpt). The conversion code was written so that the results match the official implementation one-to-one: https://github.com/haoheliu/AudioLDM. Thus, the model is entirely equivalent to the one released with the AudioLDM paper.
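For anyone who wants to sanity-check that equivalence themselves, a minimal sketch, assuming the converted checkpoint and a recent diffusers install (the prompt and seed here are illustrative, not from the original conversion): generate with a fixed seed and compare the waveform against one produced by the official repo with the same prompt, seed, and step count. Note that exact agreement also requires both codebases to draw the same initial latents.

```python
import torch
from diffusers import AudioLDMPipeline

pipe = AudioLDMPipeline.from_pretrained("cvssp/audioldm")

# A fixed generator makes the run reproducible, so the output can be
# compared against the official implementation run with the same settings.
generator = torch.Generator(device="cpu").manual_seed(0)
audio = pipe(
    "a hammer hitting a wooden surface",  # illustrative prompt
    num_inference_steps=10,
    generator=generator,
).audios[0]
print(audio.shape)  # 16 kHz mono waveform as a NumPy array
```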