Instructions to use Lightricks/LTX-2.3-nvfp4 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Diffusers
How to use Lightricks/LTX-2.3-nvfp4 with Diffusers:
    pip install -U diffusers transformers accelerate

    import torch
    from diffusers import DiffusionPipeline
    from diffusers.utils import load_image, export_to_video

    # switch to "mps" for apple devices
    pipe = DiffusionPipeline.from_pretrained("Lightricks/LTX-2.3-nvfp4", dtype=torch.bfloat16, device_map="cuda")
    pipe.to("cuda")

    prompt = "A man with short gray hair plays a red electric guitar."
    image = load_image(
        "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/guitar-man.png"
    )

    output = pipe(image=image, prompt=prompt).frames[0]
    export_to_video(output, "output.mp4")
- Notebooks
- Google Colab
- Kaggle
NVFP4 for 16GB of VRAM 5070ti?
#3
by hugless - opened
Hi all.
Is this model constructed in a way that requires it to be loaded into VRAM all at once, meaning it won't work on my kit? Thanks.
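For a rough sense of whether weights alone could fit in 16 GB, here is a back-of-the-envelope sketch. The parameter count is a hypothetical placeholder, not the actual size of this checkpoint. The 4.5 bits per weight for NVFP4 assumes 4-bit values plus one 8-bit scale shared by each block of 16 values. Activations, the text encoder, the VAE, and any layers left unquantized are not counted.

```python
def weight_gib(num_params: float, bits_per_param: float) -> float:
    """Size of the weights alone, in GiB, for a given storage width."""
    return num_params * bits_per_param / 8 / 2**30

# Hypothetical placeholder: substitute the real parameter count of the model.
HYPOTHETICAL_PARAMS = 20e9

# NVFP4: 4-bit elements plus one 8-bit scale per 16-element block (assumption
# stated above; the small per-tensor scale is ignored here).
NVFP4_BITS = 4 + 8 / 16

for name, bits in [("BF16", 16), ("FP8", 8), ("NVFP4", NVFP4_BITS)]:
    print(f"{name:>5}: {weight_gib(HYPOTHETICAL_PARAMS, bits):.1f} GiB of weights")
```

Whatever the real numbers are, the estimate only covers weights, so headroom for activations during video generation still has to come on top of it.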
Do you mean, is it the transformer/VAE/text projection all in one? Yes.
But there are a lot of tools to offload things; it all depends on what your "kit" is. A lot of people are running LTX-2.3 in 16 GB and even 12 GB of VRAM. I've even heard of a few pushing 8 GB, but yeah, that's gotta be slow.
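One example of that kind of tooling is the CPU offload hooks in Diffusers. This is a minimal sketch, not a tested recipe for this checkpoint: `load_with_offload` is a hypothetical helper name, and whether the NVFP4 weights work with these hooks on a given GPU is an assumption. `enable_model_cpu_offload()` and `enable_sequential_cpu_offload()` are existing Diffusers pipeline methods and need `accelerate` installed.

```python
def load_with_offload(model_id: str = "Lightricks/LTX-2.3-nvfp4", sequential: bool = False):
    """Load a pipeline and keep most of it in system RAM until it is needed."""
    # Imported lazily so the helper can be defined without the libraries present.
    import torch
    from diffusers import DiffusionPipeline

    # torch_dtype is the long-standing keyword; the page's own snippet passes dtype=.
    # No device_map and no pipe.to("cuda") here: the offload hooks manage placement.
    pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)

    if sequential:
        # Moves individual submodules to the GPU one at a time: lowest VRAM, slowest.
        pipe.enable_sequential_cpu_offload()
    else:
        # Moves one whole component (text encoder, transformer, VAE) at a time.
        pipe.enable_model_cpu_offload()
    return pipe
```

Calling `load_with_offload(sequential=True)` trades speed for memory, which matches the "that's gotta be slow" caveat for very small cards.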
Probably necro-threading, but the FP8 model is higher quality compared to the NVFP4 model. There is noticeable pixelation in the NVFP4 model.
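To illustrate why a 4-bit format generally rounds more coarsely than an 8-bit one, here is a toy comparison on random numbers. It is only a sketch of the number formats: it does not reproduce how either LTX checkpoint was quantized, it keeps the scales in full precision as a simplification, and it does not by itself establish the cause of the pixelation. The FP4 grid is the E2M1 value set; the FP8 grid is E4M3 with a maximum of 448.

```python
import random

# Non-negative magnitudes representable in FP4 E2M1.
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def fp8_e4m3_grid():
    """Enumerate the non-negative finite values of FP8 E4M3 (max 448)."""
    values = set()
    for exp in range(16):
        for man in range(8):
            if exp == 15 and man == 7:
                continue  # this bit pattern encodes NaN
            if exp == 0:
                values.add((man / 8) * 2.0 ** -6)  # zero and subnormals
            else:
                values.add((1 + man / 8) * 2.0 ** (exp - 7))
    return sorted(values)

FP8_GRID = fp8_e4m3_grid()

def round_to_grid(x, grid):
    """Round x to the nearest grid magnitude, keeping its sign."""
    nearest = min(grid, key=lambda g: abs(abs(x) - g))
    return nearest if x >= 0 else -nearest

def quantize(values, grid, block_size):
    """Scale each block so its largest magnitude hits the grid maximum, round, rescale."""
    out = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        amax = max(abs(v) for v in block) or 1.0  # avoid a zero scale
        scale = amax / grid[-1]
        out.extend(round_to_grid(v / scale, grid) * scale for v in block)
    return out

def mean_abs_error(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(4096)]
fp4_err = mean_abs_error(weights, quantize(weights, FP4_GRID, block_size=16))
fp8_err = mean_abs_error(weights, quantize(weights, FP8_GRID, block_size=len(weights)))
print(f"FP4, one scale per 16 values: mean abs error {fp4_err:.4f}")
print(f"FP8, one scale per tensor:    mean abs error {fp8_err:.4f}")
```

Even with a separate scale for every 16 values, the 4-bit grid has only eight magnitudes to choose from, so its rounding error on this toy data comes out larger than the 8-bit one.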