ofirbibi committed · Commit 7ea9334 · 1 Parent(s): 026e35e

Update README.md

Files changed (1):
  1. README.md +117 -5
---
pipeline_tag: image-to-video
tags:
- image-to-video
- text-to-video
- video-to-video
- image-text-to-video
- audio-to-video
- text-to-audio
- video-to-audio
- audio-to-audio
- text-to-audio-video
- image-to-audio-video
- image-text-to-audio-video
- ltx-2
- ltx-video
- ltxv
- lightricks
pinned: true
language:
- en
- de
- es
- fr
- ja
- ko
- zh
- it
- pt
license: other
license_name: ltx-2-open-weights-license
license_link: https://static.lightricks.com/legal/ltx-2-open-weights-license-0.X.pdf
library_name: diffusers
demo: https://app.ltx.studio/ltx-2-playground/i2v
---

# LTX-2 Model Card
This model card covers the LTX-2 model; the codebase is available [here](https://github.com/Lightricks/LTX-2).

LTX-2 is a DiT-based audio-video foundation model designed to generate synchronized video and audio within a single model. It brings together the core building blocks of modern video generation, with open weights and a focus on practical, local execution.

<img src="./media/trailer.gif" alt="trailer" width="512">

# Model Checkpoints

| Name | Notes |
|--------------------------------|----------------------------------------------------------------------------------------------------------------|
| ltx-2-19b-dev | The full model, flexible and trainable in bf16 |
| ltx-2-19b-dev-fp8 | The full model in fp8 quantization |
| ltx-2-19b-dev-fp4 | The full model in nvfp4 quantization |
| ltx-2-19b-distilled | The distilled version of the full model, 8 steps, CFG=1 |
| ltx-2-19b-distilled-lora-384 | A LoRA version of the distilled model, applicable to the full model |
| ltx-2-spatial-upscaler-x2-1.0 | An x2 spatial upscaler for the LTX-2 latents, used in multi-stage (multiscale) pipelines for higher resolution |
| ltx-2-temporal-upscaler-x2-1.0 | An x2 temporal upscaler for the LTX-2 latents, used in multi-stage (multiscale) pipelines for higher FPS |

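As a rough illustration of what the x2 upscalers do dimensionally (illustrative only: the real upscalers are learned latent models, and the latent shape below is hypothetical, not the actual LTX-2 latent layout):

```python
import numpy as np

# Hypothetical latent tensor: (channels, frames, height, width)
latent = np.zeros((128, 8, 32, 32))

# An x2 spatial upscaler doubles the height and width of the latent grid
# (nearest-neighbor repeat here, purely to show the shape change)
spatial_x2 = latent.repeat(2, axis=2).repeat(2, axis=3)

# An x2 temporal upscaler doubles the number of latent frames
temporal_x2 = latent.repeat(2, axis=1)

print(spatial_x2.shape)   # (128, 8, 64, 64)
print(temporal_x2.shape)  # (128, 16, 32, 32)
```

This is why the multi-stage (multiscale) pipelines run a cheap low-resolution pass first, then upscale the latents before the final refinement.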
## Model Details
- **Developed by:** Lightricks
- **Model type:** Diffusion-based audio-video foundation model
- **Language(s):** English

# Online demo
LTX-2 is accessible right away via the following links:
- [LTX-Studio text-to-video](https://app.ltx.studio/ltx-2-playground/t2v)
- [LTX-Studio image-to-video](https://app.ltx.studio/ltx-2-playground/i2v)

# Run locally

## Direct use license
You can use the models - full, distilled, upscalers, and any derivatives of the models - for purposes permitted under the [license](https://static.lightricks.com/legal/ltx-2-open-weights-license-0.X.pdf).

## ComfyUI
We recommend using the built-in LTXVideo nodes, which can be found in the ComfyUI Manager.
For manual installation instructions, please refer to our [documentation site](https://docs.ltx.video/open-source-model/integration-tools/comfy-ui).

## PyTorch codebase

The [LTX-2 codebase](https://github.com/Lightricks/LTX-2) is a monorepo with several packages, from model definition in `ltx-core` to pipelines in `ltx-pipelines` and training capabilities in `ltx-trainer`.
The codebase was tested with Python >= 3.12 and CUDA > 12.7, and supports PyTorch ~= 2.7.

### Installation

```bash
git clone https://github.com/Lightricks/LTX-2.git
cd LTX-2

# From the repository root
uv sync
source .venv/bin/activate
```

### Inference

To use our model, please follow the instructions in our [ltx-pipelines](https://github.com/Lightricks/LTX-2/blob/main/packages/ltx-pipelines/README.md) package.

## Diffusers 🧨

LTX-2 is supported in the [Diffusers Python library](https://huggingface.co/docs/diffusers/main/en/index) for image-to-video generation.

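A minimal sketch of what image-to-video generation with Diffusers might look like. The pipeline class, checkpoint id, and parameter values below are assumptions based on Diffusers' existing LTX-Video integration; consult the Diffusers documentation for the exact LTX-2 API before relying on them.

```python
import torch
from diffusers import LTXImageToVideoPipeline  # assumed class; verify against the Diffusers docs for LTX-2
from diffusers.utils import export_to_video, load_image

# "Lightricks/LTX-2" is a placeholder checkpoint id, not a confirmed repo name
pipe = LTXImageToVideoPipeline.from_pretrained("Lightricks/LTX-2", torch_dtype=torch.bfloat16)
pipe.to("cuda")

image = load_image("input.png")
video = pipe(
    image=image,
    prompt="A calm lake at sunrise, gentle ripples, birds flying overhead",
    width=768,        # divisible by 32
    height=512,       # divisible by 32
    num_frames=121,   # 8 * 15 + 1
).frames[0]
export_to_video(video, "output.mp4", fps=24)
```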
## General tips
* Width and height must be divisible by 32, and the frame count must be of the form 8n + 1 (a multiple of 8, plus 1).
* If the resolution or frame count does not satisfy these constraints, the input should be padded with -1 and then cropped to the desired resolution and number of frames.
* For tips on writing effective prompts, please visit our [Prompting guide](https://ltx.video/blog/how-to-prompt-for-ltx-2).

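The dimension constraints above can be sketched as a small helper (hypothetical, for illustration; the actual pipelines handle padding and rounding internally):

```python
def round_to_valid(width: int, height: int, num_frames: int) -> tuple[int, int, int]:
    """Round generation settings up to the nearest values LTX-2 accepts:
    width and height divisible by 32, frame count of the form 8n + 1."""
    valid_w = ((width + 31) // 32) * 32
    valid_h = ((height + 31) // 32) * 32
    valid_f = ((max(num_frames - 1, 0) + 7) // 8) * 8 + 1
    return valid_w, valid_h, valid_f

print(round_to_valid(720, 480, 120))  # (736, 480, 121)
```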
### Limitations
- This model is not intended or able to provide factual information.
- As a statistical model, this checkpoint might amplify existing societal biases.
- The model may fail to generate videos that match the prompts perfectly.
- Prompt following is heavily influenced by the prompting style.
- The model may generate content that is inappropriate or offensive.
- When generating audio without speech, the audio may be of lower quality.

## Image-to-video examples
| | | |
|:---:|:---:|:---:|
| ![example1](./media/ltx-video_i2v_example_00001.gif) | ![example2](./media/ltx-video_i2v_example_00002.gif) | ![example3](./media/ltx-video_i2v_example_00003.gif) |
| ![example4](./media/ltx-video_i2v_example_00004.gif) | ![example5](./media/ltx-video_i2v_example_00005.gif) | ![example6](./media/ltx-video_i2v_example_00006.gif) |
| ![example7](./media/ltx-video_i2v_example_00007.gif) | ![example8](./media/ltx-video_i2v_example_00008.gif) | ![example9](./media/ltx-video_i2v_example_00009.gif) |