LTX2 Audio-to-Video diffusers pipeline

Load with diffusers

from diffusers import DiffusionPipeline
from diffusers.utils import load_image
import torch

# Load the base model together with the custom audio-to-video pipeline
pipe = DiffusionPipeline.from_pretrained(
    "Lightricks/LTX-2",
    custom_pipeline="multimodalart/ltx2-audio-to-video",
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

# This LoRA helps keep the camera steady for lip-sync purposes
pipe.load_lora_weights("Lightricks/LTX-2-19b-LoRA-Camera-Control-Static")

image = load_image("photo_2.jpeg")  # input image
audio = "your_audio.wav"  # path to the input audio file

# With return_dict=False the call is unpacked as a (video, audio) tuple;
# the outputs get their own names so the input path is not overwritten
video, audio_out = pipe(
    image=image,
    audio=audio,
    prompt="A person speaking, lips moving in sync with the words, talking head",
    num_frames=141,
    frame_rate=24.0,
    return_dict=False,
)
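
Choosing num_frames for your own audio

The example above uses num_frames=141 at frame_rate=24.0, which is about 5.9 seconds of video. If the video should span the whole audio clip, the frame count can be estimated from the WAV duration. The snippet below is a minimal sketch using only the Python standard library. The helper names frames_for_audio and clip_seconds are hypothetical and not part of the pipeline, and the pipeline may have its own restrictions on valid frame counts that are not covered here.

```python
import wave


def clip_seconds(num_frames: int, frame_rate: float = 24.0) -> float:
    """Length in seconds of a video with num_frames frames at frame_rate fps."""
    return num_frames / frame_rate


def frames_for_audio(wav_path: str, frame_rate: float = 24.0) -> int:
    """Estimate how many video frames are needed to span a WAV file.

    Hypothetical helper: it only converts duration to a frame count and
    does not check any constraint the pipeline itself may impose.
    """
    with wave.open(wav_path, "rb") as wav:
        duration_s = wav.getnframes() / wav.getframerate()
    # Round to the nearest whole frame, and never return fewer than one
    return max(1, round(duration_s * frame_rate))
```

The result could then be passed as num_frames in the pipe(...) call, for example num_frames=frames_for_audio("your_audio.wav", 24.0), keeping frame_rate the same in both places.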
