- en
license: other
license_name: ltx-2-community-license
license_link: https://github.com/Lightricks/LTX-2/blob/main/LICENSE
pipeline_tag: any-to-any
tags:
- ltx-video
# LTX-2 19B IC-LoRA Union Control

This is a unified control IC-LoRA trained on top of **LTX-2-19b**, enabling multiple control signals to be used for video generation from text and reference frames.
It was trained with reference latents downscaled by a factor of 2.

It is based on the [LTX-2](https://huggingface.co/papers/2601.03233) foundation model.
## What is In-Context LoRA (IC LoRA)?

IC LoRA enables conditioning video generation on reference video frames at inference time, allowing fine-grained video-to-video control on top of a text-to-video base model.
It also allows using an initial image for image-to-video generation, as well as producing audio-visual output.

## What is the Reference Downscale Factor?

IC LoRA uses a reference control signal, i.e. a video that is positionally aligned with the generated video and contains the reference context.
For added efficiency, the reference video can be smaller, so it consumes fewer tokens.
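To make the token savings concrete, here is a small arithmetic sketch. The patch size and resolutions below are illustrative assumptions, not LTX-2's actual tokenizer parameters:

```python
# Illustrative token arithmetic for a downscaled reference video.
# Patch size and resolutions are assumptions for this sketch only.

def latent_tokens(height, width, frames, patch=32):
    """Rough per-video token count: one token per spatial patch per frame."""
    return (height // patch) * (width // patch) * frames

out_h, out_w, frames = 1024, 1536, 97
factor = 2  # Reference Downscale Factor

output_tokens = latent_tokens(out_h, out_w, frames)
reference_tokens = latent_tokens(out_h // factor, out_w // factor, frames)

print(output_tokens, reference_tokens)  # 148992 37248
```

Halving both spatial dimensions leaves the reference with roughly a quarter of the output's token count.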
- **Base Model:** LTX-2-19b Video
- **Training Type:** IC LoRA
- **Control Type:** Union conditioning - Canny + Depth + Pose
- **Reference Downscale Factor:** 2 (reference resolution is 0.5x the output resolution)
### 🔌 Using in ComfyUI
1. Copy the LoRA weights into `models/loras`.
2. Use the official IC-LoRA workflow from the [LTX-2 ComfyUI repository](https://github.com/Lightricks/ComfyUI-LTXVideo/).
3. Make sure to use the nodes that support the Reference Downscale Factor: `LTXICLoRALoaderModelOnly` to load the LoRA and extract the downscale factor, and `LTXAddVideoICLoRAGuide` to add the downscaled reference latent as a guide.
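Outside ComfyUI, the downscaling step itself is simple to sketch. Here nearest-neighbor subsampling stands in for the proper resampling a real pipeline would use, and the `(frames, height, width, channels)` layout is an assumption:

```python
import numpy as np

def downscale_reference(video: np.ndarray, factor: int = 2) -> np.ndarray:
    """Nearest-neighbor spatial downscale of a (frames, height, width, channels) video.

    A stand-in for what the Reference Downscale Factor nodes do internally;
    real pipelines would use area or bilinear filtering.
    """
    return video[:, ::factor, ::factor, :]

ref = np.zeros((33, 512, 768, 3), dtype=np.uint8)
small = downscale_reference(ref, factor=2)
print(small.shape)  # (33, 256, 384, 3)
```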
## Dataset
  journal={arXiv preprint arXiv:2601.03233},
  year={2025}
}

@misc{LTXVideoTrainer2025,
  title={LTX-Video Community Trainer},
  author={Matan Ben Yosef and Naomi Ken Korem and Tavi Halperin},
  year={2025},
  publisher={GitHub},
}
```

## Acknowledgments