This model encodes audio files into 100-dimensional vectors. It was trained on a million Spotify playlists and tracks. The details can be found here.
To encode audio files, first install the package with
pip install audiodiffusion
and then run
from audiodiffusion.audio_encoder import AudioEncoder
audio_encoder = AudioEncoder.from_pretrained("teticio/audio-encoder")
# Pass a list of audio file paths; the path below is a placeholder.
embeddings = audio_encoder.encode(["path/to/audio_file.mp3"])
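One common use for embeddings like these is to compare tracks by cosine similarity. The following is a minimal sketch rather than part of the audiodiffusion package: `cosine_similarity` is a hypothetical helper, and the two vectors are made-up stand-ins for 100-dimensional encoder outputs, assuming each encoded file yields one flat vector of floats.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical stand-ins for two 100-dimensional embeddings. In practice
# these would be the encoder outputs for two audio files, converted to
# plain lists of floats (an assumption about the output format).
track_a = [1.0] * 100
track_b = [1.0] * 50 + [0.0] * 50

similarity = cosine_similarity(track_a, track_b)
print(f"cosine similarity: {similarity:.4f}")
```

Values close to 1 indicate vectors pointing in nearly the same direction. Whether that corresponds to perceptually similar tracks depends on how the model was trained, so treat the score as a heuristic.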