
LTX-2 Image-to-Video Adapter LoRA

A high-rank LoRA adapter for LTX-Video 2 that substantially improves image-to-video generation quality. No complex workflows, no image preprocessing, no compression tricks -- just a direct image embedding pipeline that works.

What This Is

This LoRA was trained on 30,000 generated videos spanning a wide range of subjects, styles, and motion types. The result is a highly generalized adapter that strengthens LTX-2's ability to take a single image and produce coherent, high-fidelity video from it.

Key Specs

| Parameter | Value |
|---|---|
| Base Model | LTX-Video 2 |
| LoRA Rank | 256 |
| Training Set | ~30,000 generated videos |
| Training Scope | Visual only (no explicit audio training) |
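
To give a rough sense of what rank 256 implies, a LoRA pair adds two low-rank factors per adapted weight matrix, for r · (d_in + d_out) extra parameters each. The width below is a hypothetical illustration, not LTX-2's actual dimensions:

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA pair: A is (rank, d_in), B is (d_out, rank)."""
    return rank * d_in + d_out * rank

# Hypothetical transformer width; LTX-2's real layer sizes may differ.
d = 4096
full = d * d                          # parameters in the original d x d matrix
added = lora_param_count(d, d, 256)   # parameters added by a rank-256 adapter
print(f"LoRA adds {added:,} params vs. {full:,} in the full matrix")
```

Even at this high rank, the adapter stays a small fraction of the base weights, which is why it remains cheap to load alongside the checkpoint.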

What It Does

  • Improved image fidelity -- the generated video maintains stronger adherence to the source image with less drift or distortion across frames.
  • Better motion coherence -- subjects move more naturally and consistently throughout the clip.
  • Broader generalization -- performs well across diverse subjects and scenes without needing per-category tuning.
  • Zero-workflow overhead -- no ControlNet, no IP-Adapter stacking, no image manipulation required. Load the LoRA, attach an image embedding, prompt, and generate.

A Note on Audio

Audio was not explicitly trained into this LoRA. However, due to the nature of how LTX-2 handles its latent space, there are subtle shifts in audio output compared to the base model. This is a side effect of the training process, not an intentional feature.

Usage (ComfyUI)

  1. Place the LoRA file in your ComfyUI/models/loras/ directory.
  2. Add an LTX-2 model loader node and load the base LTX-2 checkpoint.
  3. Add a Load LoRA node and select this adapter.
  4. Connect an image embedding node with your source image.
  5. Add your text prompt and generate.

No additional nodes, preprocessing steps, or auxiliary models are needed.
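
Step 1 above is a single copy into the ComfyUI models tree. The filename here is a hypothetical stand-in; use whatever file you actually downloaded:

```shell
# Create the loras directory if it doesn't already exist.
mkdir -p ComfyUI/models/loras
# "ltx2_i2v_adapter.safetensors" is a placeholder name standing in for the
# downloaded LoRA file; substitute the real filename.
touch ltx2_i2v_adapter.safetensors
cp ltx2_i2v_adapter.safetensors ComfyUI/models/loras/
```

ComfyUI's Load LoRA node will then list the file in its dropdown after a refresh or restart.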

Examples

Three reference videos demonstrating the adapter's output quality are included in this repository.

Model Details

  • Architecture: LoRA (Low-Rank Adaptation) applied to LTX-Video 2's transformer layers
  • Rank 256 provides a high-capacity adaptation while remaining efficient to load and merge
  • Training data was intentionally diverse to avoid overfitting to any single domain, producing a general-purpose image-to-video adapter rather than a style-specific fine-tune
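
The merge mentioned above follows the standard LoRA update: the adapted weight is W' = W + (α/r) · B · A, where A and B are the two rank-r factors. A minimal numpy sketch with a hypothetical layer width (the rank matches this adapter):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 512, 256, 256.0     # d is a hypothetical width; r is the LoRA rank

W = rng.standard_normal((d, d))   # frozen base weight
A = rng.standard_normal((r, d))   # down-projection factor (trained)
B = np.zeros((d, r))              # up-projection factor, zero-initialized

# Merging folds the low-rank update into the base weight. With B still at its
# zero init (as before training), the merged weight equals the base weight.
W_merged = W + (alpha / r) * (B @ A)
print(np.allclose(W_merged, W))  # True: B @ A is all zeros here
```

After training, B is nonzero and the same one-line merge bakes the adaptation into the checkpoint, which is why rank-256 LoRAs remain cheap to load and merge despite their capacity.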

License

Please refer to the LTX-Video license for base model terms.