---
library_name: diffusers
pipeline_tag: image-to-video
base_model: Lightricks/LTX-2.3
tags:
- image-to-video
- text-to-video
- video-to-video
- image-text-to-video
- audio-to-video
- text-to-audio
- video-to-audio
- audio-to-audio
- text-to-audio-video
- image-to-audio-video
- image-text-to-audio-video
- ltx-2
- ltx-2-3
- ltx-video
- ltxv
- lightricks
license: other
license_name: ltx-video-2-open-source-license
license_link: https://huggingface.co/Lightricks/LTX-2.3/blob/main/LICENSE
---
# LTX-2.3 (Diffusers)
Diffusers-format weights for [Lightricks/LTX-2.3](https://huggingface.co/Lightricks/LTX-2.3), a Diffusion Transformer (DiT) foundation model that jointly generates synchronized video and audio.
A distilled variant (8 steps, CFG=1) is available at [`diffusers/LTX-2.3-Distilled-Diffusers`](https://huggingface.co/diffusers/LTX-2.3-Distilled-Diffusers).
## Usage
Requires a recent build of `diffusers` with LTX-2 support:
```bash
pip install -U git+https://github.com/huggingface/diffusers
```
### Text-to-video + audio
```python
import torch
from diffusers import LTX2Pipeline
from diffusers.pipelines.ltx2.export_utils import encode_video
from diffusers.pipelines.ltx2.utils import DEFAULT_NEGATIVE_PROMPT

pipe = LTX2Pipeline.from_pretrained(
    "diffusers/LTX-2.3-Diffusers", torch_dtype=torch.bfloat16
)
# Keep idle submodules on the CPU to lower peak GPU memory use.
pipe.enable_model_cpu_offload()

prompt = "A flowing river in a forest at golden hour, gentle wind in the leaves."
frame_rate = 24.0

# With return_dict=False the pipeline returns a (video, audio) tuple.
video, audio = pipe(
    prompt=prompt,
    negative_prompt=DEFAULT_NEGATIVE_PROMPT,
    width=768,
    height=512,
    num_frames=121,
    frame_rate=frame_rate,
    num_inference_steps=30,
    guidance_scale=3.0,
    output_type="np",
    return_dict=False,
)

# Write the first generated video and its audio track to a single MP4 file.
encode_video(
    video[0],
    fps=frame_rate,
    audio=audio[0].float().cpu(),
    audio_sample_rate=pipe.vocoder.config.output_sampling_rate,
    output_path="ltx2_t2v.mp4",
)
```
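With `num_frames=121` and `frame_rate=24.0`, the clip above runs for about 5 seconds (121 / 24). To target a different duration, the frame count still has to take the form `8k + 1` (see Notes). The sketch below shows that arithmetic. `frames_for_duration` is a hypothetical helper written for illustration, not part of `diffusers`, and clamping `k` to at least 1 is an assumption made here so the result is a video rather than a single frame.

```python
def frames_for_duration(seconds: float, frame_rate: float = 24.0) -> int:
    """Return the frame count of the form 8k + 1 closest to seconds * frame_rate.

    Hypothetical helper for illustration; not part of diffusers.
    """
    if seconds <= 0 or frame_rate <= 0:
        raise ValueError("seconds and frame_rate must be positive")
    target = seconds * frame_rate
    # Solve 8k + 1 ~= target for k, then round to the nearest integer.
    # Assumption: k is clamped to >= 1 so at least 9 frames are requested.
    k = max(1, round((target - 1) / 8))
    return 8 * k + 1


num_frames = frames_for_duration(5.0, 24.0)
print(num_frames, num_frames / 24.0)
```

The returned value can be passed as `num_frames` to the pipeline call; the actual clip length is `num_frames / frame_rate`, which may differ slightly from the requested duration.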
### First-last-frame-to-video (FLF2V)
```python
import torch
from diffusers import LTX2ConditionPipeline
from diffusers.pipelines.ltx2.export_utils import encode_video
from diffusers.pipelines.ltx2.pipeline_ltx2_condition import LTX2VideoCondition
from diffusers.pipelines.ltx2.utils import DEFAULT_NEGATIVE_PROMPT
from diffusers.utils import load_image

pipe = LTX2ConditionPipeline.from_pretrained(
    "diffusers/LTX-2.3-Diffusers", torch_dtype=torch.bfloat16
)
# Keep idle submodules on the CPU to lower peak GPU memory use.
pipe.enable_model_cpu_offload()

first_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/flf2v_input_first_frame.png")
last_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/flf2v_input_last_frame.png")

# Pin the first image to frame 0 and the last image to the final frame.
conditions = [
    LTX2VideoCondition(frames=first_image, index=0, strength=1.0),
    LTX2VideoCondition(frames=last_image, index=-1, strength=1.0),
]

prompt = "CG animation style, a small blue bird takes off from the ground, flapping its wings."
frame_rate = 24.0

# Assumption: like LTX2Pipeline, this pipeline returns a (video, audio)
# tuple when return_dict=False.
video, audio = pipe(
    conditions=conditions,
    prompt=prompt,
    negative_prompt=DEFAULT_NEGATIVE_PROMPT,
    width=768,
    height=512,
    num_frames=121,
    frame_rate=frame_rate,
    num_inference_steps=40,
    guidance_scale=4.0,
    output_type="np",
    return_dict=False,
)

# Save the result the same way as in the text-to-video example.
encode_video(
    video[0],
    fps=frame_rate,
    audio=audio[0].float().cpu(),
    audio_sample_rate=pipe.vocoder.config.output_sampling_rate,
    output_path="ltx2_flf2v.mp4",
)
```
### IC-LoRA (camera control)
```python
import torch
from diffusers import LTX2InContextPipeline
from diffusers.pipelines.ltx2.export_utils import encode_video
from diffusers.pipelines.ltx2.utils import DEFAULT_NEGATIVE_PROMPT

pipe = LTX2InContextPipeline.from_pretrained(
    "diffusers/LTX-2.3-Diffusers", torch_dtype=torch.bfloat16
)
# Keep idle submodules on the CPU to lower peak GPU memory use.
pipe.enable_model_cpu_offload()

# Load the camera-control LoRA and activate it at full strength.
pipe.load_lora_weights(
    "Lightricks/LTX-2-19b-LoRA-Camera-Control-Dolly-In",
    adapter_name="ic_lora",
    weight_name="ltx-2-19b-lora-camera-control-dolly-in.safetensors",
)
pipe.set_adapters("ic_lora", 1.0)

prompt = "A flowing river in a forest"
frame_rate = 24.0

# With return_dict=False the pipeline returns a (video, audio) tuple.
video, audio = pipe(
    prompt=prompt,
    negative_prompt=DEFAULT_NEGATIVE_PROMPT,
    width=768,
    height=512,
    num_frames=121,
    frame_rate=frame_rate,
    num_inference_steps=30,
    guidance_scale=3.0,
    output_type="np",
    return_dict=False,
)

# Write the first generated video and its audio track to a single MP4 file.
encode_video(
    video[0],
    fps=frame_rate,
    audio=audio[0].float().cpu(),
    audio_sample_rate=pipe.vocoder.config.output_sampling_rate,
    output_path="ltx2_ic_lora.mp4",
)
```
## Notes
- `width` and `height` must be divisible by 32; `num_frames` must be of the form `8k + 1` for an integer `k` (for example, 121 = 8 × 15 + 1).
- See the [Diffusers LTX-2 docs](https://huggingface.co/docs/diffusers/main/en/api/pipelines/ltx2) for multimodal guidance, prompt enhancement, and the upscaling/refinement pipeline.
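The shape constraints in the first note can be checked before a long generation starts. The sketch below is a hypothetical pre-flight check, not a `diffusers` API; the name `check_ltx2_shape` is made up for illustration, and the function encodes only the two rules stated above.

```python
def check_ltx2_shape(width: int, height: int, num_frames: int) -> None:
    """Raise ValueError if the arguments violate the constraints listed in Notes.

    Hypothetical helper for illustration; not part of diffusers.
    """
    problems = []
    if width <= 0 or width % 32 != 0:
        problems.append(f"width={width} is not a positive multiple of 32")
    if height <= 0 or height % 32 != 0:
        problems.append(f"height={height} is not a positive multiple of 32")
    # num_frames must be 8k + 1, i.e. (num_frames - 1) is a multiple of 8.
    if num_frames < 1 or (num_frames - 1) % 8 != 0:
        problems.append(f"num_frames={num_frames} is not of the form 8k + 1")
    if problems:
        raise ValueError("; ".join(problems))


# The values used throughout this card satisfy both rules.
check_ltx2_shape(width=768, height=512, num_frames=121)
```

Calling it right before `pipe(...)` reports every violated rule in a single error message, before any model weights are moved or any denoising steps run.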
## License
These weights are released under the [LTX Video 2 Open Source License](https://huggingface.co/Lightricks/LTX-2.3/blob/main/LICENSE).