---
library_name: diffusers
pipeline_tag: text-to-video
base_model: Lightricks/LTX-2.3
tags:
- video-generation
- text-to-video
- ltx
- ltx-2
license: other
license_name: ltx-video-2-open-source-license
license_link: https://huggingface.co/Lightricks/LTX-2.3/blob/main/LICENSE
---

# LTX-2.3 (Diffusers)

Diffusers-format weights for [Lightricks/LTX-2.3](https://huggingface.co/Lightricks/LTX-2.3), a DiT-based foundation model that jointly generates synchronized video and audio.

A distilled variant (8 steps, CFG=1) is available at [`diffusers/LTX-2.3-Distilled-Diffusers`](https://huggingface.co/diffusers/LTX-2.3-Distilled-Diffusers).

## Usage

Requires a recent build of `diffusers` with LTX-2 support:

```bash
pip install -U git+https://github.com/huggingface/diffusers
```

### Text-to-video + audio

```python
import torch
from diffusers import LTX2Pipeline
from diffusers.pipelines.ltx2.export_utils import encode_video
from diffusers.pipelines.ltx2.utils import DEFAULT_NEGATIVE_PROMPT

pipe = LTX2Pipeline.from_pretrained(
    "diffusers/LTX-2.3-Diffusers", torch_dtype=torch.bfloat16
)
# Offload idle sub-models to the CPU to reduce peak GPU memory.
pipe.enable_model_cpu_offload()

prompt = "A flowing river in a forest at golden hour, gentle wind in the leaves."
frame_rate = 24.0

# With return_dict=False the pipeline returns a (video, audio) tuple.
video, audio = pipe(
    prompt=prompt,
    negative_prompt=DEFAULT_NEGATIVE_PROMPT,
    width=768,
    height=512,
    num_frames=121,
    frame_rate=frame_rate,
    num_inference_steps=30,
    guidance_scale=3.0,
    output_type="np",
    return_dict=False,
)

# Mux the first generated clip and its audio track into an MP4 file.
encode_video(
    video[0],
    fps=frame_rate,
    audio=audio[0].float().cpu(),
    audio_sample_rate=pipe.vocoder.config.output_sampling_rate,
    output_path="ltx2_t2v.mp4",
)
```

### First-last-frame-to-video (FLF2V)

```python
import torch
from diffusers import LTX2ConditionPipeline
from diffusers.pipelines.ltx2.export_utils import encode_video
from diffusers.pipelines.ltx2.pipeline_ltx2_condition import LTX2VideoCondition
from diffusers.pipelines.ltx2.utils import DEFAULT_NEGATIVE_PROMPT
from diffusers.utils import load_image

pipe = LTX2ConditionPipeline.from_pretrained(
    "diffusers/LTX-2.3-Diffusers", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()

first_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/flf2v_input_first_frame.png")
last_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/flf2v_input_last_frame.png")

# Pin the first and last frames; index=-1 refers to the final frame.
conditions = [
    LTX2VideoCondition(frames=first_image, index=0, strength=1.0),
    LTX2VideoCondition(frames=last_image, index=-1, strength=1.0),
]

prompt = "CG animation style, a small blue bird takes off from the ground, flapping its wings."
frame_rate = 24.0

# return_dict=False yields a tuple, so unpack it as in the text-to-video
# example (assumes this pipeline also returns video and audio).
video, audio = pipe(
    conditions=conditions,
    prompt=prompt,
    negative_prompt=DEFAULT_NEGATIVE_PROMPT,
    width=768,
    height=512,
    num_frames=121,
    frame_rate=frame_rate,
    num_inference_steps=40,
    guidance_scale=4.0,
    output_type="np",
    return_dict=False,
)

# Save the result the same way as the other examples.
encode_video(
    video[0],
    fps=frame_rate,
    audio=audio[0].float().cpu(),
    audio_sample_rate=pipe.vocoder.config.output_sampling_rate,
    output_path="ltx2_flf2v.mp4",
)
```

### IC-LoRA (camera control)

```python
import torch
from diffusers import LTX2InContextPipeline
from diffusers.pipelines.ltx2.export_utils import encode_video
from diffusers.pipelines.ltx2.utils import DEFAULT_NEGATIVE_PROMPT

pipe = LTX2InContextPipeline.from_pretrained(
    "diffusers/LTX-2.3-Diffusers", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()

# Load the camera-control LoRA and activate it at full strength.
pipe.load_lora_weights(
    "Lightricks/LTX-2-19b-LoRA-Camera-Control-Dolly-In",
    adapter_name="ic_lora",
    weight_name="ltx-2-19b-lora-camera-control-dolly-in.safetensors",
)
pipe.set_adapters("ic_lora", 1.0)

prompt = "A flowing river in a forest"
frame_rate = 24.0

video, audio = pipe(
    prompt=prompt,
    negative_prompt=DEFAULT_NEGATIVE_PROMPT,
    width=768,
    height=512,
    num_frames=121,
    frame_rate=frame_rate,
    num_inference_steps=30,
    guidance_scale=3.0,
    output_type="np",
    return_dict=False,
)

encode_video(
    video[0],
    fps=frame_rate,
    audio=audio[0].float().cpu(),
    audio_sample_rate=pipe.vocoder.config.output_sampling_rate,
    output_path="ltx2_ic_lora.mp4",
)
```

## Notes

- `width` and `height` must be divisible by 32; `num_frames` must equal `8k + 1` for an integer `k` (for example, 121 = 8 × 15 + 1).
- See the [Diffusers LTX-2 docs](https://huggingface.co/docs/diffusers/main/en/api/pipelines/ltx2) for multimodal guidance, prompt enhancement, and the upscaling/refinement pipeline.

## License

These weights are released under the [LTX Video 2 Open Source License](https://huggingface.co/Lightricks/LTX-2.3/blob/main/LICENSE).
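
## Appendix: checking input sizes

The size constraints listed under Notes (`width` and `height` divisible by 32, `num_frames` of the form `8k + 1`) can be checked before calling a pipeline. The sketch below is not part of `diffusers`; `is_valid_size` and `snap_size` are hypothetical helper names, and the lower bounds of 32 pixels and 1 frame are assumptions made for illustration rather than documented limits.

```python
from typing import Tuple


def is_valid_size(width: int, height: int, num_frames: int) -> bool:
    """Return True if the values satisfy the constraints from the Notes section."""
    # num_frames == 8k + 1 is equivalent to num_frames % 8 == 1.
    return width % 32 == 0 and height % 32 == 0 and num_frames % 8 == 1


def snap_size(width: int, height: int, num_frames: int) -> Tuple[int, int, int]:
    """Round each value down to the nearest one that satisfies the constraints.

    The minimums (32 pixels, 1 frame) are assumptions, not documented limits.
    """
    snapped_width = max(32, (width // 32) * 32)
    snapped_height = max(32, (height // 32) * 32)
    snapped_frames = max(1, ((num_frames - 1) // 8) * 8 + 1)
    return snapped_width, snapped_height, snapped_frames


if __name__ == "__main__":
    print(is_valid_size(768, 512, 121))
    print(snap_size(770, 500, 120))
```

The snapped values can then be passed as `width`, `height`, and `num_frames` in any of the examples above. Rounding down rather than up keeps the request within the size originally asked for, which is the more conservative choice for memory use.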