---
base_model:
- Lightricks/LTX-2
datasets:
- Lightricks/Canny-Control-Dataset
language:
- en
license: other
license_name: ltx-2-community-license
license_link: https://www.github.com/Lightricks/LTX-2/LICENSE
pipeline_tag: any-to-any
tags:
- ltx-video
- image-to-video
- text-to-video
pinned: true
---

# LTX-2 19B IC-LoRA Canny Control

This is a Canny control IC-LoRA trained on top of **LTX-2-19b**, enabling structure-preserving video generation from text and reference frames.

It is based on the [LTX-2](https://huggingface.co/papers/2601.03233) foundation model.

- **Paper:** [LTX-2: Efficient Joint Audio-Visual Foundation Model](https://huggingface.co/papers/2601.03233)
- **Code:** [GitHub Repository](https://github.com/Lightricks/LTX-2)
- **Project Page:** [LTX-2 Playground](https://app.ltx.studio/ltx-2-playground/i2v)

## What is In-Context LoRA (IC LoRA)?

IC LoRA enables conditioning video generation on reference video frames at inference time, allowing fine-grained video-to-video control on top of a text-to-video base model.

It also supports using an initial image for image-to-video, and generates audio-visual output.

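Canny control conditioning expects edge-map frames that follow the structure of the target video. Below is a minimal sketch of how such control frames could be prepared with OpenCV; the input path, output path, and Canny thresholds are illustrative assumptions, not values prescribed by this repository.

```python
# Sketch: turn a reference clip into a Canny edge-map video for conditioning.
# File names and thresholds are placeholders; tune them for your footage.
import cv2

cap = cv2.VideoCapture("reference.mp4")   # clip that provides the structure
writer = None

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, threshold1=100, threshold2=200)  # Canny edge map
    edges_bgr = cv2.cvtColor(edges, cv2.COLOR_GRAY2BGR)      # back to 3 channels
    if writer is None:
        h, w = edges_bgr.shape[:2]
        fps = cap.get(cv2.CAP_PROP_FPS) or 25.0              # fall back if FPS unknown
        writer = cv2.VideoWriter("canny_control.mp4",
                                 cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
    writer.write(edges_bgr)

cap.release()
if writer is not None:
    writer.release()
```

The resulting edge video can then be supplied as the reference input in the IC-LoRA workflow.
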
## Model Files

`ltx-2-19b-ic-lora-canny-control.safetensors`

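To sanity-check a download before wiring it into a workflow, the `safetensors` library can open the file and list its tensors. This is a minimal sketch; only the filename comes from this card, and it assumes PyTorch is installed for the `pt` framework.

```python
# Sketch: list a few tensor names and shapes from the LoRA weight file.
from safetensors import safe_open

path = "ltx-2-19b-ic-lora-canny-control.safetensors"
with safe_open(path, framework="pt") as f:
    keys = list(f.keys())
    print(f"{len(keys)} tensors in {path}")
    for key in keys[:5]:                          # preview the first few entries
        print(key, tuple(f.get_tensor(key).shape))
```
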
## License

See the [LTX-2-community-license](https://www.github.com/Lightricks/LTX-2/LICENSE) for full terms.

## Model Details

- **Base Model:** LTX-2-19b Video
- **Training Type:** IC LoRA
- **Control Type:** Canny edge conditioning

### 🔌 Using in ComfyUI

1. Copy the LoRA weights into `models/loras` (a download sketch follows this list).
2. Use the official IC-LoRA workflow from the [LTX-2 ComfyUI repository](https://github.com/Lightricks/ComfyUI-LTXVideo/).

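For step 1, the file can also be fetched programmatically with `huggingface_hub` rather than copied by hand. The sketch below is illustrative only: the `repo_id` is a placeholder for this model's actual repository, and `local_dir` should point at your own ComfyUI installation.

```python
# Sketch: download the LoRA straight into ComfyUI's models/loras folder.
from huggingface_hub import hf_hub_download

hf_hub_download(
    repo_id="Lightricks/LTX-2",                               # placeholder repo id, adjust
    filename="ltx-2-19b-ic-lora-canny-control.safetensors",   # file listed under Model Files
    local_dir="ComfyUI/models/loras",                         # path to your ComfyUI install
)
```
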
## Dataset

The model was trained using the [Lightricks/Canny-Control-Dataset](https://huggingface.co/datasets/Lightricks/Canny-Control-Dataset/).

## Citation

```bibtex
@article{hacohen2025ltx2,
  title={LTX-2: Efficient Joint Audio-Visual Foundation Model},
  author={HaCohen, Yoav and Brazowski, Benny and Chiprut, Nisan and Bitterman, Yaki and Kvochko, Andrew and Berkowitz, Avishai and Shalem, Daniel and Lifschitz, Daphna and Moshe, Dudu and Porat, Eitan and others},
  journal={arXiv preprint arXiv:2601.03233},
  year={2025}
}
```

## Acknowledgments

- Base model by **Lightricks**
- Training infrastructure: **LTX-2 Community Trainer**