wm-serving Infer Ready
This repo is a slimmed inference-only export for internal visual quality testing.
Included
- packages/ltx-core
- packages/ltx-pipelines
- chunkwise_overlap8_blend_async_decode.py
- scripts/
- copied test_data/
Reused From Existing Repo
- checkpoints -> /mnt/data04/144632/xixu.hu@videorebirth.com/world/wm-serving/checkpoints
- distilled LoRA default: /mnt/data04/144632/xixu.hu@videorebirth.com/LTX-2/LTX-2/ltx-2-19b-distilled-lora-384.safetensors
Default Model Choices
- New DIT: checkpoints/0517-sfpp-4step-distil_generator_weights_step_04000_merged.safetensors
- Fine-tuned decoder: checkpoints/ltx2_vae_decoder_tune_lpips0.01_fdl10.1_lr1e-6_ckpt10800_vae_only.ckpt
- Spatial upsampler: checkpoints/latent_upsampler
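Because the checkpoints directory is reused from another location, it can help to confirm the default files resolve before launching a run. The following is a minimal sketch, assuming it is executed from the repo root; the helper name `missing_checkpoints` and the dictionary labels are hypothetical and not part of the repo.

```python
from pathlib import Path

# Default model files as listed above, relative to the repo root.
# The dictionary keys are illustrative labels only.
DEFAULT_CHECKPOINTS = {
    "new DIT": "checkpoints/0517-sfpp-4step-distil_generator_weights_step_04000_merged.safetensors",
    "fine-tuned decoder": "checkpoints/ltx2_vae_decoder_tune_lpips0.01_fdl10.1_lr1e-6_ckpt10800_vae_only.ckpt",
    "spatial upsampler": "checkpoints/latent_upsampler",
}


def missing_checkpoints(repo_root="."):
    """Return the labels of default checkpoints not found under repo_root.

    Path.exists() follows symlinks, so a dangling `checkpoints` link
    is reported as missing.
    """
    root = Path(repo_root)
    return [
        label
        for label, rel in DEFAULT_CHECKPOINTS.items()
        if not (root / rel).exists()
    ]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing default checkpoints:", ", ".join(missing))
    else:
        print("All default checkpoints found.")
```

An empty result only means the paths exist; it does not validate the file contents.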
Environment
cd /mnt/data04/144632/xixu.hu@videorebirth.com/world/wm-serving-infer-ready
. env.sh
env.sh is copied from the original repo and only handles repo-local dependency setup:
uv sync --frozen
source .venv/bin/activate
This infer-ready repo does not add an extra CUDA / NVIDIA installer script.
Three 1024 Modes
All three scripts default to:
- HEIGHT=1024
- WIDTH=1536
- copied test_data/
- new DIT
- 10800 VAE decoder
1. One-stage 1024
bash scripts/infer_ti2vid_one_stage_1024.sh
2. Two-stage refine 1024
Flow:
stage1 DIT -> upsampler -> stage2 refine -> 10800 decoder
bash scripts/infer_ti2vid_two_stage_refine_1024.sh
3. Two-stage direct decoder 1024
Flow:
stage1 DIT -> upsampler -> 10800 decoder
bash scripts/infer_ti2vid_two_stage_ft_decoder_1024.sh
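The three modes can be summarized as a small lookup table, which makes the difference between the two-stage variants explicit. This is only a sketch: the script paths and flow steps are copied from the text above, while the short mode keys and the helpers `describe` and `flow_difference` are hypothetical names introduced for illustration.

```python
# Script paths and flows come from the three modes described above.
# The mode keys are hypothetical labels chosen for this sketch.
MODES = {
    "one_stage": {
        "script": "scripts/infer_ti2vid_one_stage_1024.sh",
        "flow": None,  # no flow is spelled out for this mode
    },
    "two_stage_refine": {
        "script": "scripts/infer_ti2vid_two_stage_refine_1024.sh",
        "flow": ["stage1 DIT", "upsampler", "stage2 refine", "10800 decoder"],
    },
    "two_stage_ft_decoder": {
        "script": "scripts/infer_ti2vid_two_stage_ft_decoder_1024.sh",
        "flow": ["stage1 DIT", "upsampler", "10800 decoder"],
    },
}


def describe(mode):
    """Return a one-line summary: mode key, command, and flow if listed."""
    entry = MODES[mode]
    flow = " -> ".join(entry["flow"]) if entry["flow"] else "(flow not listed)"
    return f"{mode}: bash {entry['script']}  [{flow}]"


def flow_difference(a, b):
    """Return the steps in mode a's flow that are absent from mode b's."""
    flow_a = MODES[a]["flow"] or []
    flow_b = MODES[b]["flow"] or []
    return [step for step in flow_a if step not in flow_b]


if __name__ == "__main__":
    for name in MODES:
        print(describe(name))
    print(flow_difference("two_stage_refine", "two_stage_ft_decoder"))
```

Comparing the two listed flows this way shows that the refine variant differs from the direct-decoder variant only by the stage2 refine step.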
Batch Comparison
Run all three variants on the copied test_data/ cases:
bash scripts/batch_compare_1024_variants.sh
Outputs will be written under:
results/batch_compare_1024_<timestamp>/
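Since each batch run writes to a new timestamped directory, a helper for locating the most recent one can be handy. The sketch below picks the newest `batch_compare_1024_*` directory by modification time, which avoids assuming anything about the timestamp format. The function names are hypothetical, and the `.mp4` extension in `list_videos` is an assumption based on the example output path elsewhere in this README, not a documented property of the batch script.

```python
from pathlib import Path


def latest_batch_compare_dir(results_root="results"):
    """Return the most recently modified batch_compare_1024_* directory.

    Returns None when the results root or any matching directory is absent.
    """
    root = Path(results_root)
    if not root.is_dir():
        return None
    candidates = [p for p in root.glob("batch_compare_1024_*") if p.is_dir()]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


def list_videos(run_dir):
    """List .mp4 files under run_dir (the extension is an assumption)."""
    return sorted(Path(run_dir).rglob("*.mp4"))


if __name__ == "__main__":
    latest = latest_batch_compare_dir()
    if latest is None:
        print("No batch comparison results found.")
    else:
        print("Latest run:", latest)
        for video in list_videos(latest):
            print(" ", video)
```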
Notes
- scripts/batch_run_ti2vid_cases.sh now defaults to scripts/infer_ti2vid_one_stage_1024.sh
- Override any default with environment variables, for example:
GPU_ID=1 IMAGE_PATH=test_data/image/zelda_water.png \
OUTPUT_PATH=results/custom/zelda_water.mp4 \
bash scripts/infer_ti2vid_two_stage_ft_decoder_1024.sh
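The same override pattern can be driven from Python when looping over many inputs. The following is a minimal sketch, assuming the scripts read their settings from the environment as in the shell example above; `build_invocation` and `run_inference` are hypothetical helper names, and only the variable names GPU_ID, IMAGE_PATH, and OUTPUT_PATH are taken from this README.

```python
import os
import subprocess


def build_invocation(script, overrides, base_env=None):
    """Return (argv, env) for running `script` with environment overrides.

    Values are converted to strings because environment variables
    must be strings. The current environment is copied, not mutated.
    """
    env = dict(os.environ if base_env is None else base_env)
    env.update({key: str(value) for key, value in overrides.items()})
    return ["bash", script], env


def run_inference(script, **overrides):
    """Launch the script from the repo root and raise if it exits non-zero."""
    argv, env = build_invocation(script, overrides)
    return subprocess.run(argv, env=env, check=True)


if __name__ == "__main__":
    # Mirrors the shell example above; this only prints the command.
    argv, env = build_invocation(
        "scripts/infer_ti2vid_two_stage_ft_decoder_1024.sh",
        {
            "GPU_ID": 1,
            "IMAGE_PATH": "test_data/image/zelda_water.png",
            "OUTPUT_PATH": "results/custom/zelda_water.mp4",
        },
    )
    print(" ".join(argv), "with GPU_ID =", env["GPU_ID"])
```

Calling `run_inference(script, GPU_ID=1, ...)` would actually start the job, so it is left out of the demo block.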