# LTX-2 Image-to-Video Adapter LoRA

A high-rank LoRA adapter for [LTX-Video 2](https://github.com/Lightricks/LTX-Video) that substantially improves image-to-video generation quality. No complex workflows, no image preprocessing, no compression tricks -- just a direct image embedding pipeline that works.

## What This Is

Out of the box, getting LTX-2 to reliably infer motion from a single image requires heavy workflow engineering -- ControlNet stacking, image preprocessing, latent manipulation, and careful node routing. The purpose of this LoRA is to eliminate that complexity entirely. It teaches the model to produce solid image-to-video results from a straightforward image embedding, no elaborate pipelines needed.

Trained on **~30,000 generated videos** spanning a wide range of subjects, styles, and motion types, the result is a highly generalized adapter that strengthens LTX-2's image-to-video capabilities without any of the typical workflow overhead.

### Key Specs

| Parameter | Value |
|-----------|-------|
| **Base Model** | LTX-Video 2 |
| **LoRA Rank** | 256 |
| **Training Set** | ~30,000 generated videos |
| **Training Scope** | Visual only (no explicit audio training) |

## What It Does

- **Improved image fidelity** -- the generated video maintains stronger adherence to the source image with less drift or distortion across frames.
- **Better motion coherence** -- subjects move more naturally and consistently throughout the clip.
- **Broader generalization** -- performs well across diverse subjects and scenes without needing per-category tuning.

### A Note on Audio

Audio was **not** explicitly trained into this LoRA. However, due to the nature of how LTX-2 handles its latent space, there are subtle shifts in audio output compared to the base model. This is a side effect of the training process, not an intentional feature.

## Usage (ComfyUI)

1. Place the LoRA file in your `ComfyUI/models/loras/` directory.
2. Add an **LTX-2** model loader node and load the base LTX-2 checkpoint.
3. Add a **Load LoRA** node and select this adapter.
4. Connect an **image embedding** node with your source image.
5. Add your text prompt and generate.

### Workflow

![ComfyUI Workflow](https://huggingface.co/MachineDelusions/LTX-2_Image2Video_Adapter_LoRa/resolve/main/assets/image.png)

## Examples

Reference videos demonstrating the adapter's output quality:

<video src="https://huggingface.co/MachineDelusions/LTX-2_Image2Video_Adapter_LoRa/resolve/main/assets/AnimateDiff_00740.mp4" autoplay loop muted playsinline></video>

<video src="https://huggingface.co/MachineDelusions/LTX-2_Image2Video_Adapter_LoRa/resolve/main/assets/AnimateDiff_00774.mp4" autoplay loop muted playsinline></video>

<video src="https://huggingface.co/MachineDelusions/LTX-2_Image2Video_Adapter_LoRa/resolve/main/assets/AnimateDiff_00777.mp4" autoplay loop muted playsinline></video>

<video src="https://huggingface.co/MachineDelusions/LTX-2_Image2Video_Adapter_LoRa/resolve/main/assets/AnimateDiff_00778.mp4" autoplay loop muted playsinline></video>

## Model Details

- **Architecture:** LoRA (Low-Rank Adaptation) applied to LTX-Video 2's transformer layers
- **Rank 256** provides a high-capacity adaptation while remaining efficient to load and merge
- **Training data** was intentionally diverse to avoid overfitting to any single domain, producing a general-purpose image-to-video adapter rather than a style-specific fine-tune
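
To make the rank figure concrete, here is a minimal sketch of the low-rank update LoRA applies to one weight matrix. The dimensions and the `scale` factor are illustrative assumptions, not values taken from this adapter; only the rank of 256 comes from the spec table above.

```python
# Illustrative LoRA update for a single weight matrix: W' = W + scale * (B @ A).
# d_out/d_in are hypothetical; rank 256 matches this adapter's spec.
import numpy as np

d_out, d_in, rank = 2048, 2048, 256
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in)).astype(np.float32)  # frozen base weight
A = rng.standard_normal((rank, d_in)).astype(np.float32)   # trained down-projection
B = np.zeros((d_out, rank), dtype=np.float32)              # trained up-projection (zero-init)
scale = 1.0                                                # LoRA strength (assumed)

# Merging is a single matmul and add, which is why a high-rank adapter
# still loads and merges cheaply at inference time.
W_merged = W + scale * (B @ A)

# The trainable update has (d_out + d_in) * rank parameters
# instead of the full d_out * d_in.
full_params = d_out * d_in
lora_params = (d_out + d_in) * rank
print(f"full: {full_params:,}  lora: {lora_params:,}  ratio: {lora_params / full_params:.2%}")
```

Because `B` starts at zero, the merged weight equals the base weight before training; the adapter only diverges from the base model as `B @ A` is learned.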

## License

Please refer to the [LTX-Video license](https://github.com/Lightricks/LTX-Video) for base model terms.