vidfom committed on
Commit 14b57af · verified · 1 Parent(s): 2bf0b9a

Upload folder using huggingface_hub

This view is limited to 50 files because it contains too many changes. See raw diff.
Files changed (50)
  1. .gitattributes +17 -0
  2. LTX-Video/.gitattributes +4 -0
  3. LTX-Video/.gitignore +166 -0
  4. LTX-Video/.pre-commit-config.yaml +16 -0
  5. LTX-Video/17433918265652166577684885641952.png +3 -0
  6. LTX-Video/LICENSE +201 -0
  7. LTX-Video/MODEL_DIR/.gitattributes +35 -0
  8. LTX-Video/MODEL_DIR/README.md +3 -0
  9. LTX-Video/MODEL_DIR/ltx-video-2b-v0.9.5.safetensors +3 -0
  10. LTX-Video/MODEL_DIR/model_index.json +24 -0
  11. LTX-Video/MODEL_DIR/scheduler/scheduler_config.json +16 -0
  12. LTX-Video/MODEL_DIR/t5xxl_fp16.safetensors +3 -0
  13. LTX-Video/MODEL_DIR/t5xxl_fp8_e4m3fn_scaled.safetensors +3 -0
  14. LTX-Video/MODEL_DIR/text_encoder/config.json +32 -0
  15. LTX-Video/MODEL_DIR/text_encoder/model-00001-of-00004.safetensors +3 -0
  16. LTX-Video/MODEL_DIR/text_encoder/model-00002-of-00004.safetensors +3 -0
  17. LTX-Video/MODEL_DIR/text_encoder/model-00003-of-00004.safetensors +3 -0
  18. LTX-Video/MODEL_DIR/text_encoder/model-00004-of-00004.safetensors +3 -0
  19. LTX-Video/MODEL_DIR/text_encoder/model.safetensors.index.json +226 -0
  20. LTX-Video/MODEL_DIR/tokenizer/added_tokens.json +102 -0
  21. LTX-Video/MODEL_DIR/tokenizer/special_tokens_map.json +125 -0
  22. LTX-Video/MODEL_DIR/tokenizer/spiece.model +3 -0
  23. LTX-Video/MODEL_DIR/tokenizer/tokenizer_config.json +940 -0
  24. LTX-Video/MODEL_DIR/transformer/config.json +19 -0
  25. LTX-Video/MODEL_DIR/transformer/diffusion_pytorch_model-00001-of-00002.safetensors +3 -0
  26. LTX-Video/MODEL_DIR/transformer/diffusion_pytorch_model-00002-of-00002.safetensors +3 -0
  27. LTX-Video/MODEL_DIR/transformer/diffusion_pytorch_model.safetensors.index.json +722 -0
  28. LTX-Video/MODEL_DIR/vae/config.json +32 -0
  29. LTX-Video/MODEL_DIR/vae/diffusion_pytorch_model.safetensors +3 -0
  30. LTX-Video/README.md +280 -0
  31. LTX-Video/__init__.py +0 -0
  32. LTX-Video/docs/_static/ltx-video_example_00001.gif +3 -0
  33. LTX-Video/docs/_static/ltx-video_example_00002.gif +3 -0
  34. LTX-Video/docs/_static/ltx-video_example_00003.gif +3 -0
  35. LTX-Video/docs/_static/ltx-video_example_00004.gif +3 -0
  36. LTX-Video/docs/_static/ltx-video_example_00005.gif +3 -0
  37. LTX-Video/docs/_static/ltx-video_example_00006.gif +3 -0
  38. LTX-Video/docs/_static/ltx-video_example_00007.gif +3 -0
  39. LTX-Video/docs/_static/ltx-video_example_00008.gif +3 -0
  40. LTX-Video/docs/_static/ltx-video_example_00009.gif +3 -0
  41. LTX-Video/docs/_static/ltx-video_example_00010.gif +3 -0
  42. LTX-Video/docs/_static/ltx-video_example_00011.gif +3 -0
  43. LTX-Video/docs/_static/ltx-video_example_00012.gif +3 -0
  44. LTX-Video/docs/_static/ltx-video_example_00013.gif +3 -0
  45. LTX-Video/docs/_static/ltx-video_example_00014.gif +3 -0
  46. LTX-Video/docs/_static/ltx-video_example_00015.gif +3 -0
  47. LTX-Video/docs/_static/ltx-video_example_00016.gif +3 -0
  48. LTX-Video/file_list.txt +46 -0
  49. LTX-Video/inference.py +758 -0
  50. LTX-Video/ltx_video.egg-info/PKG-INFO +305 -0
.gitattributes CHANGED
@@ -33,3 +33,20 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ LTX-Video/17433918265652166577684885641952.png filter=lfs diff=lfs merge=lfs -text
+ LTX-Video/docs/_static/ltx-video_example_00001.gif filter=lfs diff=lfs merge=lfs -text
+ LTX-Video/docs/_static/ltx-video_example_00002.gif filter=lfs diff=lfs merge=lfs -text
+ LTX-Video/docs/_static/ltx-video_example_00003.gif filter=lfs diff=lfs merge=lfs -text
+ LTX-Video/docs/_static/ltx-video_example_00004.gif filter=lfs diff=lfs merge=lfs -text
+ LTX-Video/docs/_static/ltx-video_example_00005.gif filter=lfs diff=lfs merge=lfs -text
+ LTX-Video/docs/_static/ltx-video_example_00006.gif filter=lfs diff=lfs merge=lfs -text
+ LTX-Video/docs/_static/ltx-video_example_00007.gif filter=lfs diff=lfs merge=lfs -text
+ LTX-Video/docs/_static/ltx-video_example_00008.gif filter=lfs diff=lfs merge=lfs -text
+ LTX-Video/docs/_static/ltx-video_example_00009.gif filter=lfs diff=lfs merge=lfs -text
+ LTX-Video/docs/_static/ltx-video_example_00010.gif filter=lfs diff=lfs merge=lfs -text
+ LTX-Video/docs/_static/ltx-video_example_00011.gif filter=lfs diff=lfs merge=lfs -text
+ LTX-Video/docs/_static/ltx-video_example_00012.gif filter=lfs diff=lfs merge=lfs -text
+ LTX-Video/docs/_static/ltx-video_example_00013.gif filter=lfs diff=lfs merge=lfs -text
+ LTX-Video/docs/_static/ltx-video_example_00014.gif filter=lfs diff=lfs merge=lfs -text
+ LTX-Video/docs/_static/ltx-video_example_00015.gif filter=lfs diff=lfs merge=lfs -text
+ LTX-Video/docs/_static/ltx-video_example_00016.gif filter=lfs diff=lfs merge=lfs -text
LTX-Video/.gitattributes ADDED
@@ -0,0 +1,4 @@
+ *.jpg filter=lfs diff=lfs merge=lfs -text
+ *.jpeg filter=lfs diff=lfs merge=lfs -text
+ *.png filter=lfs diff=lfs merge=lfs -text
+ *.gif filter=lfs diff=lfs merge=lfs -text
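The effect of these patterns can be illustrated with a small sketch (this uses Python's `fnmatch` as a rough stand-in for Git's matching; real gitattributes matching is done by Git itself and differs in detail, e.g. patterns can also match directory paths):

```python
from fnmatch import fnmatch

# Patterns mirroring the .gitattributes entries above. Files whose names
# match any pattern are stored as Git LFS pointers instead of raw blobs.
lfs_patterns = ["*.jpg", "*.jpeg", "*.png", "*.gif"]

def tracked_by_lfs(path: str) -> bool:
    """Return True if the file's basename matches any LFS-tracked pattern."""
    name = path.rsplit("/", 1)[-1]
    return any(fnmatch(name, pat) for pat in lfs_patterns)

print(tracked_by_lfs("LTX-Video/docs/_static/ltx-video_example_00001.gif"))  # True
print(tracked_by_lfs("LTX-Video/inference.py"))  # False
```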
LTX-Video/.gitignore ADDED
@@ -0,0 +1,166 @@
+ # Byte-compiled / optimized / DLL files
+ __pycache__/
+ *.py[cod]
+ *$py.class
+
+ # C extensions
+ *.so
+
+ # Distribution / packaging
+ .Python
+ build/
+ develop-eggs/
+ dist/
+ downloads/
+ eggs/
+ .eggs/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ wheels/
+ share/python-wheels/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+ MANIFEST
+
+ # PyInstaller
+ # Usually these files are written by a python script from a template
+ # before PyInstaller builds the exe, so as to inject date/other infos into it.
+ *.manifest
+ *.spec
+
+ # Installer logs
+ pip-log.txt
+ pip-delete-this-directory.txt
+
+ # Unit test / coverage reports
+ htmlcov/
+ .tox/
+ .nox/
+ .coverage
+ .coverage.*
+ .cache
+ nosetests.xml
+ coverage.xml
+ *.cover
+ *.py,cover
+ .hypothesis/
+ .pytest_cache/
+ cover/
+
+ # Translations
+ *.mo
+ *.pot
+
+ # Django stuff:
+ *.log
+ local_settings.py
+ db.sqlite3
+ db.sqlite3-journal
+
+ # Flask stuff:
+ instance/
+ .webassets-cache
+
+ # Scrapy stuff:
+ .scrapy
+
+ # Sphinx documentation
+ docs/_build/
+
+ # PyBuilder
+ .pybuilder/
+ target/
+
+ # Jupyter Notebook
+ .ipynb_checkpoints
+
+ # IPython
+ profile_default/
+ ipython_config.py
+
+ # pyenv
+ # For a library or package, you might want to ignore these files since the code is
+ # intended to run in multiple environments; otherwise, check them in:
+ # .python-version
+
+ # pipenv
+ # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
+ # However, in case of collaboration, if having platform-specific dependencies or dependencies
+ # having no cross-platform support, pipenv may install dependencies that don't work, or not
+ # install all needed dependencies.
+ #Pipfile.lock
+
+ # poetry
+ # Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
+ # This is especially recommended for binary packages to ensure reproducibility, and is more
+ # commonly ignored for libraries.
+ # https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
+ #poetry.lock
+
+ # pdm
+ # Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
+ #pdm.lock
+ # pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
+ # in version control.
+ # https://pdm.fming.dev/latest/usage/project/#working-with-version-control
+ .pdm.toml
+ .pdm-python
+ .pdm-build/
+
+ # PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
+ __pypackages__/
+
+ # Celery stuff
+ celerybeat-schedule
+ celerybeat.pid
+
+ # SageMath parsed files
+ *.sage.py
+
+ # Environments
+ .env
+ .venv
+ env/
+ venv/
+ ENV/
+ env.bak/
+ venv.bak/
+
+ # Spyder project settings
+ .spyderproject
+ .spyproject
+
+ # Rope project settings
+ .ropeproject
+
+ # mkdocs documentation
+ /site
+
+ # mypy
+ .mypy_cache/
+ .dmypy.json
+ dmypy.json
+
+ # Pyre type checker
+ .pyre/
+
+ # pytype static type analyzer
+ .pytype/
+
+ # Cython debug symbols
+ cython_debug/
+
+ # PyCharm
+ # JetBrains specific template is maintained in a separate JetBrains.gitignore that can
+ # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
+ # and can be added to the global gitignore or merged into this file. For a more nuclear
+ # option (not recommended) you can uncomment the following to ignore the entire idea folder.
+ .idea/
+
+ # From inference.py
+ outputs/
+ video_output_*.mp4
LTX-Video/.pre-commit-config.yaml ADDED
@@ -0,0 +1,16 @@
+ repos:
+   - repo: https://github.com/astral-sh/ruff-pre-commit
+     # Ruff version.
+     rev: v0.2.2
+     hooks:
+       # Run the linter.
+       - id: ruff
+         args: [--fix]  # Automatically fix issues if possible.
+         types: [python]  # Ensure it only runs on .py files.
+
+   - repo: https://github.com/psf/black
+     rev: 24.2.0  # Specify the version of Black you want
+     hooks:
+       - id: black
+         name: Black code formatter
+         language_version: python3  # Use the Python version you're targeting (e.g., 3.10)
LTX-Video/17433918265652166577684885641952.png ADDED

Git LFS Details

  • SHA256: bd153243d3794da5ad552ba853927d1a6a389d897032577272eb5dadaa60871d
  • Pointer size: 132 Bytes
  • Size of remote file: 3.14 MB
LTX-Video/LICENSE ADDED
@@ -0,0 +1,201 @@
+ Apache License
+ Version 2.0, January 2004
+ http://www.apache.org/licenses/
+
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+ 1. Definitions.
+
+ "License" shall mean the terms and conditions for use, reproduction,
+ and distribution as defined by Sections 1 through 9 of this document.
+
+ "Licensor" shall mean the copyright owner or entity authorized by
+ the copyright owner that is granting the License.
+
+ "Legal Entity" shall mean the union of the acting entity and all
+ other entities that control, are controlled by, or are under common
+ control with that entity. For the purposes of this definition,
+ "control" means (i) the power, direct or indirect, to cause the
+ direction or management of such entity, whether by contract or
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
+ outstanding shares, or (iii) beneficial ownership of such entity.
+
+ "You" (or "Your") shall mean an individual or Legal Entity
+ exercising permissions granted by this License.
+
+ "Source" form shall mean the preferred form for making modifications,
+ including but not limited to software source code, documentation
+ source, and configuration files.
+
+ "Object" form shall mean any form resulting from mechanical
+ transformation or translation of a Source form, including but
+ not limited to compiled object code, generated documentation,
+ and conversions to other media types.
+
+ "Work" shall mean the work of authorship, whether in Source or
+ Object form, made available under the License, as indicated by a
+ copyright notice that is included in or attached to the work
+ (an example is provided in the Appendix below).
+
+ "Derivative Works" shall mean any work, whether in Source or Object
+ form, that is based on (or derived from) the Work and for which the
+ editorial revisions, annotations, elaborations, or other modifications
+ represent, as a whole, an original work of authorship. For the purposes
+ of this License, Derivative Works shall not include works that remain
+ separable from, or merely link (or bind by name) to the interfaces of,
+ the Work and Derivative Works thereof.
+
+ "Contribution" shall mean any work of authorship, including
+ the original version of the Work and any modifications or additions
+ to that Work or Derivative Works thereof, that is intentionally
+ submitted to Licensor for inclusion in the Work by the copyright owner
+ or by an individual or Legal Entity authorized to submit on behalf of
+ the copyright owner. For the purposes of this definition, "submitted"
+ means any form of electronic, verbal, or written communication sent
+ to the Licensor or its representatives, including but not limited to
+ communication on electronic mailing lists, source code control systems,
+ and issue tracking systems that are managed by, or on behalf of, the
+ Licensor for the purpose of discussing and improving the Work, but
+ excluding communication that is conspicuously marked or otherwise
+ designated in writing by the copyright owner as "Not a Contribution."
+
+ "Contributor" shall mean Licensor and any individual or Legal Entity
+ on behalf of whom a Contribution has been received by Licensor and
+ subsequently incorporated within the Work.
+
+ 2. Grant of Copyright License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ copyright license to reproduce, prepare Derivative Works of,
+ publicly display, publicly perform, sublicense, and distribute the
+ Work and such Derivative Works in Source or Object form.
+
+ 3. Grant of Patent License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ (except as stated in this section) patent license to make, have made,
+ use, offer to sell, sell, import, and otherwise transfer the Work,
+ where such license applies only to those patent claims licensable
+ by such Contributor that are necessarily infringed by their
+ Contribution(s) alone or by combination of their Contribution(s)
+ with the Work to which such Contribution(s) was submitted. If You
+ institute patent litigation against any entity (including a
+ cross-claim or counterclaim in a lawsuit) alleging that the Work
+ or a Contribution incorporated within the Work constitutes direct
+ or contributory patent infringement, then any patent licenses
+ granted to You under this License for that Work shall terminate
+ as of the date such litigation is filed.
+
+ 4. Redistribution. You may reproduce and distribute copies of the
+ Work or Derivative Works thereof in any medium, with or without
+ modifications, and in Source or Object form, provided that You
+ meet the following conditions:
+
+ (a) You must give any other recipients of the Work or
+ Derivative Works a copy of this License; and
+
+ (b) You must cause any modified files to carry prominent notices
+ stating that You changed the files; and
+
+ (c) You must retain, in the Source form of any Derivative Works
+ that You distribute, all copyright, patent, trademark, and
+ attribution notices from the Source form of the Work,
+ excluding those notices that do not pertain to any part of
+ the Derivative Works; and
+
+ (d) If the Work includes a "NOTICE" text file as part of its
+ distribution, then any Derivative Works that You distribute must
+ include a readable copy of the attribution notices contained
+ within such NOTICE file, excluding those notices that do not
+ pertain to any part of the Derivative Works, in at least one
+ of the following places: within a NOTICE text file distributed
+ as part of the Derivative Works; within the Source form or
+ documentation, if provided along with the Derivative Works; or,
+ within a display generated by the Derivative Works, if and
+ wherever such third-party notices normally appear. The contents
+ of the NOTICE file are for informational purposes only and
+ do not modify the License. You may add Your own attribution
+ notices within Derivative Works that You distribute, alongside
+ or as an addendum to the NOTICE text from the Work, provided
+ that such additional attribution notices cannot be construed
+ as modifying the License.
+
+ You may add Your own copyright statement to Your modifications and
+ may provide additional or different license terms and conditions
+ for use, reproduction, or distribution of Your modifications, or
+ for any such Derivative Works as a whole, provided Your use,
+ reproduction, and distribution of the Work otherwise complies with
+ the conditions stated in this License.
+
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
+ any Contribution intentionally submitted for inclusion in the Work
+ by You to the Licensor shall be under the terms and conditions of
+ this License, without any additional terms or conditions.
+ Notwithstanding the above, nothing herein shall supersede or modify
+ the terms of any separate license agreement you may have executed
+ with Licensor regarding such Contributions.
+
+ 6. Trademarks. This License does not grant permission to use the trade
+ names, trademarks, service marks, or product names of the Licensor,
+ except as required for reasonable and customary use in describing the
+ origin of the Work and reproducing the content of the NOTICE file.
+
+ 7. Disclaimer of Warranty. Unless required by applicable law or
+ agreed to in writing, Licensor provides the Work (and each
+ Contributor provides its Contributions) on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied, including, without limitation, any warranties or conditions
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+ PARTICULAR PURPOSE. You are solely responsible for determining the
+ appropriateness of using or redistributing the Work and assume any
+ risks associated with Your exercise of permissions under this License.
+
+ 8. Limitation of Liability. In no event and under no legal theory,
+ whether in tort (including negligence), contract, or otherwise,
+ unless required by applicable law (such as deliberate and grossly
+ negligent acts) or agreed to in writing, shall any Contributor be
+ liable to You for damages, including any direct, indirect, special,
+ incidental, or consequential damages of any character arising as a
+ result of this License or out of the use or inability to use the
+ Work (including but not limited to damages for loss of goodwill,
+ work stoppage, computer failure or malfunction, or any and all
+ other commercial damages or losses), even if such Contributor
+ has been advised of the possibility of such damages.
+
+ 9. Accepting Warranty or Additional Liability. While redistributing
+ the Work or Derivative Works thereof, You may choose to offer,
+ and charge a fee for, acceptance of support, warranty, indemnity,
+ or other liability obligations and/or rights consistent with this
+ License. However, in accepting such obligations, You may act only
+ on Your own behalf and on Your sole responsibility, not on behalf
+ of any other Contributor, and only if You agree to indemnify,
+ defend, and hold each Contributor harmless for any liability
+ incurred by, or claims asserted against, such Contributor by reason
+ of your accepting any such warranty or additional liability.
+
+ END OF TERMS AND CONDITIONS
+
+ APPENDIX: How to apply the Apache License to your work.
+
+ To apply the Apache License to your work, attach the following
+ boilerplate notice, with the fields enclosed by brackets "[]"
+ replaced with your own identifying information. (Don't include
+ the brackets!) The text should be enclosed in the appropriate
+ comment syntax for the file format. We also recommend that a
+ file or class name and description of purpose be included on the
+ same "printed page" as the copyright notice for easier
+ identification within third-party archives.
+
+ Copyright [yyyy] [name of copyright owner]
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
LTX-Video/MODEL_DIR/.gitattributes ADDED
@@ -0,0 +1,35 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
LTX-Video/MODEL_DIR/README.md ADDED
@@ -0,0 +1,3 @@
+ ---
+ license: apache-2.0
+ ---
LTX-Video/MODEL_DIR/ltx-video-2b-v0.9.5.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:720d15c9f19f7d0f6b2a92bbbc34410e2cfb2f6856a100b38f734fbf973d4adf
+ size 6340729500
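These three lines are a Git LFS pointer file: the repository stores only the `version`/`oid`/`size` fields, while the ~6.3 GB of actual weights live in LFS storage. A minimal parser sketch:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its key/value fields.
    Each line is 'key value'; the value may itself contain no spaces."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# The pointer for ltx-video-2b-v0.9.5.safetensors, as shown above.
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:720d15c9f19f7d0f6b2a92bbbc34410e2cfb2f6856a100b38f734fbf973d4adf
size 6340729500
"""
info = parse_lfs_pointer(pointer)
print(info["size"])             # 6340729500
print(int(info["size"]) / 1e9)  # ≈6.34 GB
```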
LTX-Video/MODEL_DIR/model_index.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "_class_name": "LTXPipeline",
+   "_diffusers_version": "0.32.0.dev0",
+   "scheduler": [
+     "diffusers",
+     "FlowMatchEulerDiscreteScheduler"
+   ],
+   "text_encoder": [
+     "transformers",
+     "T5EncoderModel"
+   ],
+   "tokenizer": [
+     "transformers",
+     "T5Tokenizer"
+   ],
+   "transformer": [
+     "diffusers",
+     "LTXVideoTransformer3DModel"
+   ],
+   "vae": [
+     "diffusers",
+     "AutoencoderKLLTXVideo"
+   ]
+ }
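`model_index.json` is how diffusers-style loaders discover a pipeline's parts: each non-underscore key names a subfolder, and its value is the `(library, class)` pair to instantiate for that component. A small sketch of reading the mapping (the JSON is inlined as a string here purely for illustration):

```python
import json

# The model_index.json shown above, inlined for illustration.
model_index = json.loads("""
{
  "_class_name": "LTXPipeline",
  "_diffusers_version": "0.32.0.dev0",
  "scheduler": ["diffusers", "FlowMatchEulerDiscreteScheduler"],
  "text_encoder": ["transformers", "T5EncoderModel"],
  "tokenizer": ["transformers", "T5Tokenizer"],
  "transformer": ["diffusers", "LTXVideoTransformer3DModel"],
  "vae": ["diffusers", "AutoencoderKLLTXVideo"]
}
""")

# Keys starting with "_" are metadata; the rest map subfolder -> (library, class).
components = {k: v for k, v in model_index.items() if not k.startswith("_")}
for name, (library, cls) in sorted(components.items()):
    print(f"{name}: {library}.{cls}")
```

This is the lookup that `DiffusionPipeline.from_pretrained` performs before importing each class and loading its weights from the matching subfolder.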
LTX-Video/MODEL_DIR/scheduler/scheduler_config.json ADDED
@@ -0,0 +1,16 @@
+ {
+   "_class_name": "FlowMatchEulerDiscreteScheduler",
+   "_diffusers_version": "0.32.0.dev0",
+   "base_image_seq_len": 1024,
+   "base_shift": 0.95,
+   "invert_sigmas": false,
+   "max_image_seq_len": 4096,
+   "max_shift": 2.05,
+   "num_train_timesteps": 1000,
+   "shift": 1.0,
+   "shift_terminal": 0.1,
+   "use_beta_sigmas": false,
+   "use_dynamic_shifting": true,
+   "use_exponential_sigmas": false,
+   "use_karras_sigmas": false
+ }
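With `use_dynamic_shifting` enabled, the `base_shift`/`max_shift` and `base_image_seq_len`/`max_image_seq_len` fields define a per-resolution timestep shift. A minimal sketch, assuming the linear interpolation that diffusers-style flow-match pipelines commonly use to derive the shift parameter from token count (the exact call site lives in the pipeline code, not in this scheduler config):

```python
def calculate_shift(seq_len: int,
                    base_seq_len: int = 1024, max_seq_len: int = 4096,
                    base_shift: float = 0.95, max_shift: float = 2.05) -> float:
    """Linearly interpolate the timestep shift from the latent sequence length,
    using the line through (base_image_seq_len, base_shift) and
    (max_image_seq_len, max_shift). Sketch only; assumes diffusers' convention."""
    slope = (max_shift - base_shift) / (max_seq_len - base_seq_len)
    intercept = base_shift - slope * base_seq_len
    return seq_len * slope + intercept

print(calculate_shift(1024))  # ≈0.95 at the base sequence length
print(calculate_shift(4096))  # ≈2.05 at the max sequence length
```

Longer latent sequences (larger or longer videos) thus get a larger shift, pushing more denoising effort toward high-noise timesteps.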
LTX-Video/MODEL_DIR/t5xxl_fp16.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6e480b09fae049a72d2a8c5fbccb8d3e92febeb233bbe9dfe7256958a9167635
+ size 9787841024
LTX-Video/MODEL_DIR/t5xxl_fp8_e4m3fn_scaled.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a498f0485dc9536735258018417c3fd7758dc3bccc0a645feaa472b34955557a
+ size 5157348688
LTX-Video/MODEL_DIR/text_encoder/config.json ADDED
@@ -0,0 +1,32 @@
+ {
+   "_name_or_path": "google/t5-v1_1-xxl",
+   "architectures": [
+     "T5EncoderModel"
+   ],
+   "classifier_dropout": 0.0,
+   "d_ff": 10240,
+   "d_kv": 64,
+   "d_model": 4096,
+   "decoder_start_token_id": 0,
+   "dense_act_fn": "gelu_new",
+   "dropout_rate": 0.1,
+   "eos_token_id": 1,
+   "feed_forward_proj": "gated-gelu",
+   "initializer_factor": 1.0,
+   "is_encoder_decoder": true,
+   "is_gated_act": true,
+   "layer_norm_epsilon": 1e-06,
+   "model_type": "t5",
+   "num_decoder_layers": 24,
+   "num_heads": 64,
+   "num_layers": 24,
+   "output_past": true,
+   "pad_token_id": 0,
+   "relative_attention_max_distance": 128,
+   "relative_attention_num_buckets": 32,
+   "tie_word_embeddings": false,
+   "torch_dtype": "float32",
+   "transformers_version": "4.46.2",
+   "use_cache": true,
+   "vocab_size": 32128
+ }
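A quick consistency check: the config's dimensions, at `torch_dtype` float32 (4 bytes per parameter), should account for the `"total_size": 19049242624` bytes reported by the accompanying `model.safetensors.index.json`. The parameter names used here (attention q/k/v/o, gated wi_0/wi_1/wo, the block-0 relative-attention bias) follow the T5 encoder layout visible in that index's `weight_map`:

```python
# Dimensions from the T5 config above.
d_model, d_ff, d_kv = 4096, 10240, 64
num_heads, num_layers, vocab = 64, 24, 32128
rel_buckets = 32

inner = d_kv * num_heads                      # 4096: total projection width
attn = 4 * d_model * inner                    # q, k, v, o projections
ffn = 3 * d_model * d_ff                      # wi_0, wi_1 (gated-gelu), wo
block = attn + ffn + 2 * d_model              # plus two RMS layer norms per block
params = (vocab * d_model                     # token embedding (untied encoder copy)
          + num_layers * block
          + rel_buckets * num_heads           # relative_attention_bias (block 0 only)
          + d_model)                          # final encoder layer norm

print(params)      # 4762310656 parameters (~4.76B, i.e. T5-XXL's encoder half)
print(params * 4)  # 19049242624 bytes -> matches the shard index's total_size
```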
LTX-Video/MODEL_DIR/text_encoder/model-00001-of-00004.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7a68b2c8c080696a10109612a649bc69330991ecfea65930ccfdfbdb011f2686
+ size 4989319680
LTX-Video/MODEL_DIR/text_encoder/model-00002-of-00004.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b8ed6556d7507e38af5b428c605fb2a6f2bdb7e80bd481308b865f7a40c551ca
+ size 4999830656
LTX-Video/MODEL_DIR/text_encoder/model-00003-of-00004.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c831635f83041f83faf0024b39c6ecb21b45d70dd38a63ea5bac6c7c6e5e558c
+ size 4865612720
LTX-Video/MODEL_DIR/text_encoder/model-00004-of-00004.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:02a5f2d69205be92ad48fe5d712d38c2ff55627969116aeffc58bd75a28da468
+ size 4194506688
LTX-Video/MODEL_DIR/text_encoder/model.safetensors.index.json ADDED
@@ -0,0 +1,226 @@
+ {
+   "metadata": {
+     "total_size": 19049242624
+   },
+   "weight_map": {
+     "encoder.block.0.layer.0.SelfAttention.k.weight": "model-00001-of-00004.safetensors",
+     "encoder.block.0.layer.0.SelfAttention.o.weight": "model-00001-of-00004.safetensors",
+     "encoder.block.0.layer.0.SelfAttention.q.weight": "model-00001-of-00004.safetensors",
+     "encoder.block.0.layer.0.SelfAttention.relative_attention_bias.weight": "model-00001-of-00004.safetensors",
+     "encoder.block.0.layer.0.SelfAttention.v.weight": "model-00001-of-00004.safetensors",
+     "encoder.block.0.layer.0.layer_norm.weight": "model-00001-of-00004.safetensors",
+     "encoder.block.0.layer.1.DenseReluDense.wi_0.weight": "model-00001-of-00004.safetensors",
+     "encoder.block.0.layer.1.DenseReluDense.wi_1.weight": "model-00001-of-00004.safetensors",
+     "encoder.block.0.layer.1.DenseReluDense.wo.weight": "model-00001-of-00004.safetensors",
+     "encoder.block.0.layer.1.layer_norm.weight": "model-00001-of-00004.safetensors",
+     "encoder.block.1.layer.0.SelfAttention.k.weight": "model-00001-of-00004.safetensors",
+     "encoder.block.1.layer.0.SelfAttention.o.weight": "model-00001-of-00004.safetensors",
+     "encoder.block.1.layer.0.SelfAttention.q.weight": "model-00001-of-00004.safetensors",
+     "encoder.block.1.layer.0.SelfAttention.v.weight": "model-00001-of-00004.safetensors",
+     "encoder.block.1.layer.0.layer_norm.weight": "model-00001-of-00004.safetensors",
+     "encoder.block.1.layer.1.DenseReluDense.wi_0.weight": "model-00001-of-00004.safetensors",
+     "encoder.block.1.layer.1.DenseReluDense.wi_1.weight": "model-00001-of-00004.safetensors",
+     "encoder.block.1.layer.1.DenseReluDense.wo.weight": "model-00001-of-00004.safetensors",
+     "encoder.block.1.layer.1.layer_norm.weight": "model-00001-of-00004.safetensors",
+     "encoder.block.10.layer.0.SelfAttention.k.weight": "model-00002-of-00004.safetensors",
+     "encoder.block.10.layer.0.SelfAttention.o.weight": "model-00002-of-00004.safetensors",
+     "encoder.block.10.layer.0.SelfAttention.q.weight": "model-00002-of-00004.safetensors",
+     "encoder.block.10.layer.0.SelfAttention.v.weight": "model-00002-of-00004.safetensors",
+     "encoder.block.10.layer.0.layer_norm.weight": "model-00002-of-00004.safetensors",
+     "encoder.block.10.layer.1.DenseReluDense.wi_0.weight": "model-00002-of-00004.safetensors",
+     "encoder.block.10.layer.1.DenseReluDense.wi_1.weight": "model-00002-of-00004.safetensors",
+     "encoder.block.10.layer.1.DenseReluDense.wo.weight": "model-00002-of-00004.safetensors",
+     "encoder.block.10.layer.1.layer_norm.weight": "model-00002-of-00004.safetensors",
+     "encoder.block.11.layer.0.SelfAttention.k.weight": "model-00002-of-00004.safetensors",
+     "encoder.block.11.layer.0.SelfAttention.o.weight": "model-00002-of-00004.safetensors",
+     "encoder.block.11.layer.0.SelfAttention.q.weight": "model-00002-of-00004.safetensors",
+     "encoder.block.11.layer.0.SelfAttention.v.weight": "model-00002-of-00004.safetensors",
+     "encoder.block.11.layer.0.layer_norm.weight": "model-00002-of-00004.safetensors",
+     "encoder.block.11.layer.1.DenseReluDense.wi_0.weight": "model-00002-of-00004.safetensors",
+     "encoder.block.11.layer.1.DenseReluDense.wi_1.weight": "model-00002-of-00004.safetensors",
+     "encoder.block.11.layer.1.DenseReluDense.wo.weight": "model-00002-of-00004.safetensors",
+     "encoder.block.11.layer.1.layer_norm.weight": "model-00002-of-00004.safetensors",
+     "encoder.block.12.layer.0.SelfAttention.k.weight": "model-00002-of-00004.safetensors",
+     "encoder.block.12.layer.0.SelfAttention.o.weight": "model-00003-of-00004.safetensors",
+     "encoder.block.12.layer.0.SelfAttention.q.weight": "model-00002-of-00004.safetensors",
+     "encoder.block.12.layer.0.SelfAttention.v.weight": "model-00002-of-00004.safetensors",
+     "encoder.block.12.layer.0.layer_norm.weight": "model-00003-of-00004.safetensors",
+     "encoder.block.12.layer.1.DenseReluDense.wi_0.weight": "model-00003-of-00004.safetensors",
+     "encoder.block.12.layer.1.DenseReluDense.wi_1.weight": "model-00003-of-00004.safetensors",
+     "encoder.block.12.layer.1.DenseReluDense.wo.weight": "model-00003-of-00004.safetensors",
+     "encoder.block.12.layer.1.layer_norm.weight": "model-00003-of-00004.safetensors",
+     "encoder.block.13.layer.0.SelfAttention.k.weight": "model-00003-of-00004.safetensors",
+     "encoder.block.13.layer.0.SelfAttention.o.weight": "model-00003-of-00004.safetensors",
+     "encoder.block.13.layer.0.SelfAttention.q.weight": "model-00003-of-00004.safetensors",
+     "encoder.block.13.layer.0.SelfAttention.v.weight": "model-00003-of-00004.safetensors",
+     "encoder.block.13.layer.0.layer_norm.weight": "model-00003-of-00004.safetensors",
+     "encoder.block.13.layer.1.DenseReluDense.wi_0.weight": "model-00003-of-00004.safetensors",
+     "encoder.block.13.layer.1.DenseReluDense.wi_1.weight": "model-00003-of-00004.safetensors",
+     "encoder.block.13.layer.1.DenseReluDense.wo.weight": "model-00003-of-00004.safetensors",
+     "encoder.block.13.layer.1.layer_norm.weight": "model-00003-of-00004.safetensors",
+     "encoder.block.14.layer.0.SelfAttention.k.weight": "model-00003-of-00004.safetensors",
+     "encoder.block.14.layer.0.SelfAttention.o.weight": "model-00003-of-00004.safetensors",
+     "encoder.block.14.layer.0.SelfAttention.q.weight": "model-00003-of-00004.safetensors",
64
+ "encoder.block.14.layer.0.SelfAttention.v.weight": "model-00003-of-00004.safetensors",
65
+ "encoder.block.14.layer.0.layer_norm.weight": "model-00003-of-00004.safetensors",
66
+ "encoder.block.14.layer.1.DenseReluDense.wi_0.weight": "model-00003-of-00004.safetensors",
67
+ "encoder.block.14.layer.1.DenseReluDense.wi_1.weight": "model-00003-of-00004.safetensors",
68
+ "encoder.block.14.layer.1.DenseReluDense.wo.weight": "model-00003-of-00004.safetensors",
69
+ "encoder.block.14.layer.1.layer_norm.weight": "model-00003-of-00004.safetensors",
70
+ "encoder.block.15.layer.0.SelfAttention.k.weight": "model-00003-of-00004.safetensors",
71
+ "encoder.block.15.layer.0.SelfAttention.o.weight": "model-00003-of-00004.safetensors",
72
+ "encoder.block.15.layer.0.SelfAttention.q.weight": "model-00003-of-00004.safetensors",
73
+ "encoder.block.15.layer.0.SelfAttention.v.weight": "model-00003-of-00004.safetensors",
74
+ "encoder.block.15.layer.0.layer_norm.weight": "model-00003-of-00004.safetensors",
75
+ "encoder.block.15.layer.1.DenseReluDense.wi_0.weight": "model-00003-of-00004.safetensors",
76
+ "encoder.block.15.layer.1.DenseReluDense.wi_1.weight": "model-00003-of-00004.safetensors",
77
+ "encoder.block.15.layer.1.DenseReluDense.wo.weight": "model-00003-of-00004.safetensors",
78
+ "encoder.block.15.layer.1.layer_norm.weight": "model-00003-of-00004.safetensors",
79
+ "encoder.block.16.layer.0.SelfAttention.k.weight": "model-00003-of-00004.safetensors",
80
+ "encoder.block.16.layer.0.SelfAttention.o.weight": "model-00003-of-00004.safetensors",
81
+ "encoder.block.16.layer.0.SelfAttention.q.weight": "model-00003-of-00004.safetensors",
82
+ "encoder.block.16.layer.0.SelfAttention.v.weight": "model-00003-of-00004.safetensors",
83
+ "encoder.block.16.layer.0.layer_norm.weight": "model-00003-of-00004.safetensors",
84
+ "encoder.block.16.layer.1.DenseReluDense.wi_0.weight": "model-00003-of-00004.safetensors",
85
+ "encoder.block.16.layer.1.DenseReluDense.wi_1.weight": "model-00003-of-00004.safetensors",
86
+ "encoder.block.16.layer.1.DenseReluDense.wo.weight": "model-00003-of-00004.safetensors",
87
+ "encoder.block.16.layer.1.layer_norm.weight": "model-00003-of-00004.safetensors",
88
+ "encoder.block.17.layer.0.SelfAttention.k.weight": "model-00003-of-00004.safetensors",
89
+ "encoder.block.17.layer.0.SelfAttention.o.weight": "model-00003-of-00004.safetensors",
90
+ "encoder.block.17.layer.0.SelfAttention.q.weight": "model-00003-of-00004.safetensors",
91
+ "encoder.block.17.layer.0.SelfAttention.v.weight": "model-00003-of-00004.safetensors",
92
+ "encoder.block.17.layer.0.layer_norm.weight": "model-00003-of-00004.safetensors",
93
+ "encoder.block.17.layer.1.DenseReluDense.wi_0.weight": "model-00003-of-00004.safetensors",
94
+ "encoder.block.17.layer.1.DenseReluDense.wi_1.weight": "model-00003-of-00004.safetensors",
95
+ "encoder.block.17.layer.1.DenseReluDense.wo.weight": "model-00003-of-00004.safetensors",
96
+ "encoder.block.17.layer.1.layer_norm.weight": "model-00003-of-00004.safetensors",
97
+ "encoder.block.18.layer.0.SelfAttention.k.weight": "model-00003-of-00004.safetensors",
98
+ "encoder.block.18.layer.0.SelfAttention.o.weight": "model-00003-of-00004.safetensors",
99
+ "encoder.block.18.layer.0.SelfAttention.q.weight": "model-00003-of-00004.safetensors",
100
+ "encoder.block.18.layer.0.SelfAttention.v.weight": "model-00003-of-00004.safetensors",
101
+ "encoder.block.18.layer.0.layer_norm.weight": "model-00003-of-00004.safetensors",
102
+ "encoder.block.18.layer.1.DenseReluDense.wi_0.weight": "model-00003-of-00004.safetensors",
103
+ "encoder.block.18.layer.1.DenseReluDense.wi_1.weight": "model-00004-of-00004.safetensors",
104
+ "encoder.block.18.layer.1.DenseReluDense.wo.weight": "model-00004-of-00004.safetensors",
105
+ "encoder.block.18.layer.1.layer_norm.weight": "model-00004-of-00004.safetensors",
106
+ "encoder.block.19.layer.0.SelfAttention.k.weight": "model-00004-of-00004.safetensors",
107
+ "encoder.block.19.layer.0.SelfAttention.o.weight": "model-00004-of-00004.safetensors",
108
+ "encoder.block.19.layer.0.SelfAttention.q.weight": "model-00004-of-00004.safetensors",
109
+ "encoder.block.19.layer.0.SelfAttention.v.weight": "model-00004-of-00004.safetensors",
110
+ "encoder.block.19.layer.0.layer_norm.weight": "model-00004-of-00004.safetensors",
111
+ "encoder.block.19.layer.1.DenseReluDense.wi_0.weight": "model-00004-of-00004.safetensors",
112
+ "encoder.block.19.layer.1.DenseReluDense.wi_1.weight": "model-00004-of-00004.safetensors",
113
+ "encoder.block.19.layer.1.DenseReluDense.wo.weight": "model-00004-of-00004.safetensors",
114
+ "encoder.block.19.layer.1.layer_norm.weight": "model-00004-of-00004.safetensors",
115
+ "encoder.block.2.layer.0.SelfAttention.k.weight": "model-00001-of-00004.safetensors",
116
+ "encoder.block.2.layer.0.SelfAttention.o.weight": "model-00001-of-00004.safetensors",
117
+ "encoder.block.2.layer.0.SelfAttention.q.weight": "model-00001-of-00004.safetensors",
118
+ "encoder.block.2.layer.0.SelfAttention.v.weight": "model-00001-of-00004.safetensors",
119
+ "encoder.block.2.layer.0.layer_norm.weight": "model-00001-of-00004.safetensors",
120
+ "encoder.block.2.layer.1.DenseReluDense.wi_0.weight": "model-00001-of-00004.safetensors",
121
+ "encoder.block.2.layer.1.DenseReluDense.wi_1.weight": "model-00001-of-00004.safetensors",
122
+ "encoder.block.2.layer.1.DenseReluDense.wo.weight": "model-00001-of-00004.safetensors",
123
+ "encoder.block.2.layer.1.layer_norm.weight": "model-00001-of-00004.safetensors",
124
+ "encoder.block.20.layer.0.SelfAttention.k.weight": "model-00004-of-00004.safetensors",
125
+ "encoder.block.20.layer.0.SelfAttention.o.weight": "model-00004-of-00004.safetensors",
126
+ "encoder.block.20.layer.0.SelfAttention.q.weight": "model-00004-of-00004.safetensors",
127
+ "encoder.block.20.layer.0.SelfAttention.v.weight": "model-00004-of-00004.safetensors",
128
+ "encoder.block.20.layer.0.layer_norm.weight": "model-00004-of-00004.safetensors",
129
+ "encoder.block.20.layer.1.DenseReluDense.wi_0.weight": "model-00004-of-00004.safetensors",
130
+ "encoder.block.20.layer.1.DenseReluDense.wi_1.weight": "model-00004-of-00004.safetensors",
131
+ "encoder.block.20.layer.1.DenseReluDense.wo.weight": "model-00004-of-00004.safetensors",
132
+ "encoder.block.20.layer.1.layer_norm.weight": "model-00004-of-00004.safetensors",
133
+ "encoder.block.21.layer.0.SelfAttention.k.weight": "model-00004-of-00004.safetensors",
134
+ "encoder.block.21.layer.0.SelfAttention.o.weight": "model-00004-of-00004.safetensors",
135
+ "encoder.block.21.layer.0.SelfAttention.q.weight": "model-00004-of-00004.safetensors",
136
+ "encoder.block.21.layer.0.SelfAttention.v.weight": "model-00004-of-00004.safetensors",
137
+ "encoder.block.21.layer.0.layer_norm.weight": "model-00004-of-00004.safetensors",
138
+ "encoder.block.21.layer.1.DenseReluDense.wi_0.weight": "model-00004-of-00004.safetensors",
139
+ "encoder.block.21.layer.1.DenseReluDense.wi_1.weight": "model-00004-of-00004.safetensors",
140
+ "encoder.block.21.layer.1.DenseReluDense.wo.weight": "model-00004-of-00004.safetensors",
141
+ "encoder.block.21.layer.1.layer_norm.weight": "model-00004-of-00004.safetensors",
142
+ "encoder.block.22.layer.0.SelfAttention.k.weight": "model-00004-of-00004.safetensors",
143
+ "encoder.block.22.layer.0.SelfAttention.o.weight": "model-00004-of-00004.safetensors",
144
+ "encoder.block.22.layer.0.SelfAttention.q.weight": "model-00004-of-00004.safetensors",
145
+ "encoder.block.22.layer.0.SelfAttention.v.weight": "model-00004-of-00004.safetensors",
146
+ "encoder.block.22.layer.0.layer_norm.weight": "model-00004-of-00004.safetensors",
147
+ "encoder.block.22.layer.1.DenseReluDense.wi_0.weight": "model-00004-of-00004.safetensors",
148
+ "encoder.block.22.layer.1.DenseReluDense.wi_1.weight": "model-00004-of-00004.safetensors",
149
+ "encoder.block.22.layer.1.DenseReluDense.wo.weight": "model-00004-of-00004.safetensors",
150
+ "encoder.block.22.layer.1.layer_norm.weight": "model-00004-of-00004.safetensors",
151
+ "encoder.block.23.layer.0.SelfAttention.k.weight": "model-00004-of-00004.safetensors",
152
+ "encoder.block.23.layer.0.SelfAttention.o.weight": "model-00004-of-00004.safetensors",
153
+ "encoder.block.23.layer.0.SelfAttention.q.weight": "model-00004-of-00004.safetensors",
154
+ "encoder.block.23.layer.0.SelfAttention.v.weight": "model-00004-of-00004.safetensors",
155
+ "encoder.block.23.layer.0.layer_norm.weight": "model-00004-of-00004.safetensors",
156
+ "encoder.block.23.layer.1.DenseReluDense.wi_0.weight": "model-00004-of-00004.safetensors",
157
+ "encoder.block.23.layer.1.DenseReluDense.wi_1.weight": "model-00004-of-00004.safetensors",
158
+ "encoder.block.23.layer.1.DenseReluDense.wo.weight": "model-00004-of-00004.safetensors",
159
+ "encoder.block.23.layer.1.layer_norm.weight": "model-00004-of-00004.safetensors",
160
+ "encoder.block.3.layer.0.SelfAttention.k.weight": "model-00001-of-00004.safetensors",
161
+ "encoder.block.3.layer.0.SelfAttention.o.weight": "model-00001-of-00004.safetensors",
162
+ "encoder.block.3.layer.0.SelfAttention.q.weight": "model-00001-of-00004.safetensors",
163
+ "encoder.block.3.layer.0.SelfAttention.v.weight": "model-00001-of-00004.safetensors",
164
+ "encoder.block.3.layer.0.layer_norm.weight": "model-00001-of-00004.safetensors",
165
+ "encoder.block.3.layer.1.DenseReluDense.wi_0.weight": "model-00001-of-00004.safetensors",
166
+ "encoder.block.3.layer.1.DenseReluDense.wi_1.weight": "model-00001-of-00004.safetensors",
167
+ "encoder.block.3.layer.1.DenseReluDense.wo.weight": "model-00001-of-00004.safetensors",
168
+ "encoder.block.3.layer.1.layer_norm.weight": "model-00001-of-00004.safetensors",
169
+ "encoder.block.4.layer.0.SelfAttention.k.weight": "model-00001-of-00004.safetensors",
170
+ "encoder.block.4.layer.0.SelfAttention.o.weight": "model-00001-of-00004.safetensors",
171
+ "encoder.block.4.layer.0.SelfAttention.q.weight": "model-00001-of-00004.safetensors",
172
+ "encoder.block.4.layer.0.SelfAttention.v.weight": "model-00001-of-00004.safetensors",
173
+ "encoder.block.4.layer.0.layer_norm.weight": "model-00001-of-00004.safetensors",
174
+ "encoder.block.4.layer.1.DenseReluDense.wi_0.weight": "model-00001-of-00004.safetensors",
175
+ "encoder.block.4.layer.1.DenseReluDense.wi_1.weight": "model-00001-of-00004.safetensors",
176
+ "encoder.block.4.layer.1.DenseReluDense.wo.weight": "model-00001-of-00004.safetensors",
177
+ "encoder.block.4.layer.1.layer_norm.weight": "model-00001-of-00004.safetensors",
178
+ "encoder.block.5.layer.0.SelfAttention.k.weight": "model-00001-of-00004.safetensors",
179
+ "encoder.block.5.layer.0.SelfAttention.o.weight": "model-00001-of-00004.safetensors",
180
+ "encoder.block.5.layer.0.SelfAttention.q.weight": "model-00001-of-00004.safetensors",
181
+ "encoder.block.5.layer.0.SelfAttention.v.weight": "model-00001-of-00004.safetensors",
182
+ "encoder.block.5.layer.0.layer_norm.weight": "model-00001-of-00004.safetensors",
183
+ "encoder.block.5.layer.1.DenseReluDense.wi_0.weight": "model-00001-of-00004.safetensors",
184
+ "encoder.block.5.layer.1.DenseReluDense.wi_1.weight": "model-00001-of-00004.safetensors",
185
+ "encoder.block.5.layer.1.DenseReluDense.wo.weight": "model-00002-of-00004.safetensors",
186
+ "encoder.block.5.layer.1.layer_norm.weight": "model-00002-of-00004.safetensors",
187
+ "encoder.block.6.layer.0.SelfAttention.k.weight": "model-00002-of-00004.safetensors",
188
+ "encoder.block.6.layer.0.SelfAttention.o.weight": "model-00002-of-00004.safetensors",
189
+ "encoder.block.6.layer.0.SelfAttention.q.weight": "model-00002-of-00004.safetensors",
190
+ "encoder.block.6.layer.0.SelfAttention.v.weight": "model-00002-of-00004.safetensors",
191
+ "encoder.block.6.layer.0.layer_norm.weight": "model-00002-of-00004.safetensors",
192
+ "encoder.block.6.layer.1.DenseReluDense.wi_0.weight": "model-00002-of-00004.safetensors",
193
+ "encoder.block.6.layer.1.DenseReluDense.wi_1.weight": "model-00002-of-00004.safetensors",
194
+ "encoder.block.6.layer.1.DenseReluDense.wo.weight": "model-00002-of-00004.safetensors",
195
+ "encoder.block.6.layer.1.layer_norm.weight": "model-00002-of-00004.safetensors",
196
+ "encoder.block.7.layer.0.SelfAttention.k.weight": "model-00002-of-00004.safetensors",
197
+ "encoder.block.7.layer.0.SelfAttention.o.weight": "model-00002-of-00004.safetensors",
198
+ "encoder.block.7.layer.0.SelfAttention.q.weight": "model-00002-of-00004.safetensors",
199
+ "encoder.block.7.layer.0.SelfAttention.v.weight": "model-00002-of-00004.safetensors",
200
+ "encoder.block.7.layer.0.layer_norm.weight": "model-00002-of-00004.safetensors",
201
+ "encoder.block.7.layer.1.DenseReluDense.wi_0.weight": "model-00002-of-00004.safetensors",
202
+ "encoder.block.7.layer.1.DenseReluDense.wi_1.weight": "model-00002-of-00004.safetensors",
203
+ "encoder.block.7.layer.1.DenseReluDense.wo.weight": "model-00002-of-00004.safetensors",
204
+ "encoder.block.7.layer.1.layer_norm.weight": "model-00002-of-00004.safetensors",
205
+ "encoder.block.8.layer.0.SelfAttention.k.weight": "model-00002-of-00004.safetensors",
206
+ "encoder.block.8.layer.0.SelfAttention.o.weight": "model-00002-of-00004.safetensors",
207
+ "encoder.block.8.layer.0.SelfAttention.q.weight": "model-00002-of-00004.safetensors",
208
+ "encoder.block.8.layer.0.SelfAttention.v.weight": "model-00002-of-00004.safetensors",
209
+ "encoder.block.8.layer.0.layer_norm.weight": "model-00002-of-00004.safetensors",
210
+ "encoder.block.8.layer.1.DenseReluDense.wi_0.weight": "model-00002-of-00004.safetensors",
211
+ "encoder.block.8.layer.1.DenseReluDense.wi_1.weight": "model-00002-of-00004.safetensors",
212
+ "encoder.block.8.layer.1.DenseReluDense.wo.weight": "model-00002-of-00004.safetensors",
213
+ "encoder.block.8.layer.1.layer_norm.weight": "model-00002-of-00004.safetensors",
214
+ "encoder.block.9.layer.0.SelfAttention.k.weight": "model-00002-of-00004.safetensors",
215
+ "encoder.block.9.layer.0.SelfAttention.o.weight": "model-00002-of-00004.safetensors",
216
+ "encoder.block.9.layer.0.SelfAttention.q.weight": "model-00002-of-00004.safetensors",
217
+ "encoder.block.9.layer.0.SelfAttention.v.weight": "model-00002-of-00004.safetensors",
218
+ "encoder.block.9.layer.0.layer_norm.weight": "model-00002-of-00004.safetensors",
219
+ "encoder.block.9.layer.1.DenseReluDense.wi_0.weight": "model-00002-of-00004.safetensors",
220
+ "encoder.block.9.layer.1.DenseReluDense.wi_1.weight": "model-00002-of-00004.safetensors",
221
+ "encoder.block.9.layer.1.DenseReluDense.wo.weight": "model-00002-of-00004.safetensors",
222
+ "encoder.block.9.layer.1.layer_norm.weight": "model-00002-of-00004.safetensors",
223
+ "encoder.final_layer_norm.weight": "model-00004-of-00004.safetensors",
224
+ "shared.weight": "model-00001-of-00004.safetensors"
225
+ }
226
+ }
LTX-Video/MODEL_DIR/tokenizer/added_tokens.json ADDED
@@ -0,0 +1,102 @@
+ {
+ "<extra_id_0>": 32099,
+ "<extra_id_10>": 32089,
+ "<extra_id_11>": 32088,
+ "<extra_id_12>": 32087,
+ "<extra_id_13>": 32086,
+ "<extra_id_14>": 32085,
+ "<extra_id_15>": 32084,
+ "<extra_id_16>": 32083,
+ "<extra_id_17>": 32082,
+ "<extra_id_18>": 32081,
+ "<extra_id_19>": 32080,
+ "<extra_id_1>": 32098,
+ "<extra_id_20>": 32079,
+ "<extra_id_21>": 32078,
+ "<extra_id_22>": 32077,
+ "<extra_id_23>": 32076,
+ "<extra_id_24>": 32075,
+ "<extra_id_25>": 32074,
+ "<extra_id_26>": 32073,
+ "<extra_id_27>": 32072,
+ "<extra_id_28>": 32071,
+ "<extra_id_29>": 32070,
+ "<extra_id_2>": 32097,
+ "<extra_id_30>": 32069,
+ "<extra_id_31>": 32068,
+ "<extra_id_32>": 32067,
+ "<extra_id_33>": 32066,
+ "<extra_id_34>": 32065,
+ "<extra_id_35>": 32064,
+ "<extra_id_36>": 32063,
+ "<extra_id_37>": 32062,
+ "<extra_id_38>": 32061,
+ "<extra_id_39>": 32060,
+ "<extra_id_3>": 32096,
+ "<extra_id_40>": 32059,
+ "<extra_id_41>": 32058,
+ "<extra_id_42>": 32057,
+ "<extra_id_43>": 32056,
+ "<extra_id_44>": 32055,
+ "<extra_id_45>": 32054,
+ "<extra_id_46>": 32053,
+ "<extra_id_47>": 32052,
+ "<extra_id_48>": 32051,
+ "<extra_id_49>": 32050,
+ "<extra_id_4>": 32095,
+ "<extra_id_50>": 32049,
+ "<extra_id_51>": 32048,
+ "<extra_id_52>": 32047,
+ "<extra_id_53>": 32046,
+ "<extra_id_54>": 32045,
+ "<extra_id_55>": 32044,
+ "<extra_id_56>": 32043,
+ "<extra_id_57>": 32042,
+ "<extra_id_58>": 32041,
+ "<extra_id_59>": 32040,
+ "<extra_id_5>": 32094,
+ "<extra_id_60>": 32039,
+ "<extra_id_61>": 32038,
+ "<extra_id_62>": 32037,
+ "<extra_id_63>": 32036,
+ "<extra_id_64>": 32035,
+ "<extra_id_65>": 32034,
+ "<extra_id_66>": 32033,
+ "<extra_id_67>": 32032,
+ "<extra_id_68>": 32031,
+ "<extra_id_69>": 32030,
+ "<extra_id_6>": 32093,
+ "<extra_id_70>": 32029,
+ "<extra_id_71>": 32028,
+ "<extra_id_72>": 32027,
+ "<extra_id_73>": 32026,
+ "<extra_id_74>": 32025,
+ "<extra_id_75>": 32024,
+ "<extra_id_76>": 32023,
+ "<extra_id_77>": 32022,
+ "<extra_id_78>": 32021,
+ "<extra_id_79>": 32020,
+ "<extra_id_7>": 32092,
+ "<extra_id_80>": 32019,
+ "<extra_id_81>": 32018,
+ "<extra_id_82>": 32017,
+ "<extra_id_83>": 32016,
+ "<extra_id_84>": 32015,
+ "<extra_id_85>": 32014,
+ "<extra_id_86>": 32013,
+ "<extra_id_87>": 32012,
+ "<extra_id_88>": 32011,
+ "<extra_id_89>": 32010,
+ "<extra_id_8>": 32091,
+ "<extra_id_90>": 32009,
+ "<extra_id_91>": 32008,
+ "<extra_id_92>": 32007,
+ "<extra_id_93>": 32006,
+ "<extra_id_94>": 32005,
+ "<extra_id_95>": 32004,
+ "<extra_id_96>": 32003,
+ "<extra_id_97>": 32002,
+ "<extra_id_98>": 32001,
+ "<extra_id_99>": 32000,
+ "<extra_id_9>": 32090
+ }
LTX-Video/MODEL_DIR/tokenizer/special_tokens_map.json ADDED
@@ -0,0 +1,125 @@
+ {
+ "additional_special_tokens": [
+ "<extra_id_0>",
+ "<extra_id_1>",
+ "<extra_id_2>",
+ "<extra_id_3>",
+ "<extra_id_4>",
+ "<extra_id_5>",
+ "<extra_id_6>",
+ "<extra_id_7>",
+ "<extra_id_8>",
+ "<extra_id_9>",
+ "<extra_id_10>",
+ "<extra_id_11>",
+ "<extra_id_12>",
+ "<extra_id_13>",
+ "<extra_id_14>",
+ "<extra_id_15>",
+ "<extra_id_16>",
+ "<extra_id_17>",
+ "<extra_id_18>",
+ "<extra_id_19>",
+ "<extra_id_20>",
+ "<extra_id_21>",
+ "<extra_id_22>",
+ "<extra_id_23>",
+ "<extra_id_24>",
+ "<extra_id_25>",
+ "<extra_id_26>",
+ "<extra_id_27>",
+ "<extra_id_28>",
+ "<extra_id_29>",
+ "<extra_id_30>",
+ "<extra_id_31>",
+ "<extra_id_32>",
+ "<extra_id_33>",
+ "<extra_id_34>",
+ "<extra_id_35>",
+ "<extra_id_36>",
+ "<extra_id_37>",
+ "<extra_id_38>",
+ "<extra_id_39>",
+ "<extra_id_40>",
+ "<extra_id_41>",
+ "<extra_id_42>",
+ "<extra_id_43>",
+ "<extra_id_44>",
+ "<extra_id_45>",
+ "<extra_id_46>",
+ "<extra_id_47>",
+ "<extra_id_48>",
+ "<extra_id_49>",
+ "<extra_id_50>",
+ "<extra_id_51>",
+ "<extra_id_52>",
+ "<extra_id_53>",
+ "<extra_id_54>",
+ "<extra_id_55>",
+ "<extra_id_56>",
+ "<extra_id_57>",
+ "<extra_id_58>",
+ "<extra_id_59>",
+ "<extra_id_60>",
+ "<extra_id_61>",
+ "<extra_id_62>",
+ "<extra_id_63>",
+ "<extra_id_64>",
+ "<extra_id_65>",
+ "<extra_id_66>",
+ "<extra_id_67>",
+ "<extra_id_68>",
+ "<extra_id_69>",
+ "<extra_id_70>",
+ "<extra_id_71>",
+ "<extra_id_72>",
+ "<extra_id_73>",
+ "<extra_id_74>",
+ "<extra_id_75>",
+ "<extra_id_76>",
+ "<extra_id_77>",
+ "<extra_id_78>",
+ "<extra_id_79>",
+ "<extra_id_80>",
+ "<extra_id_81>",
+ "<extra_id_82>",
+ "<extra_id_83>",
+ "<extra_id_84>",
+ "<extra_id_85>",
+ "<extra_id_86>",
+ "<extra_id_87>",
+ "<extra_id_88>",
+ "<extra_id_89>",
+ "<extra_id_90>",
+ "<extra_id_91>",
+ "<extra_id_92>",
+ "<extra_id_93>",
+ "<extra_id_94>",
+ "<extra_id_95>",
+ "<extra_id_96>",
+ "<extra_id_97>",
+ "<extra_id_98>",
+ "<extra_id_99>"
+ ],
+ "eos_token": {
+ "content": "</s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "pad_token": {
+ "content": "<pad>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "unk_token": {
+ "content": "<unk>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ }
+ }
LTX-Video/MODEL_DIR/tokenizer/spiece.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d60acb128cf7b7f2536e8f38a5b18a05535c9e14c7a355904270e15b0945ea86
+ size 791656
LTX-Video/MODEL_DIR/tokenizer/tokenizer_config.json ADDED
@@ -0,0 +1,940 @@
+ {
+ "add_prefix_space": true,
+ "added_tokens_decoder": {
+ "0": {
+ "content": "<pad>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "1": {
+ "content": "</s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "2": {
+ "content": "<unk>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "32000": {
+ "content": "<extra_id_99>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32001": {
+ "content": "<extra_id_98>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32002": {
+ "content": "<extra_id_97>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32003": {
+ "content": "<extra_id_96>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32004": {
+ "content": "<extra_id_95>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32005": {
+ "content": "<extra_id_94>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32006": {
+ "content": "<extra_id_93>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32007": {
+ "content": "<extra_id_92>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32008": {
+ "content": "<extra_id_91>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32009": {
+ "content": "<extra_id_90>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32010": {
+ "content": "<extra_id_89>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32011": {
+ "content": "<extra_id_88>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32012": {
+ "content": "<extra_id_87>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32013": {
+ "content": "<extra_id_86>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32014": {
+ "content": "<extra_id_85>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32015": {
+ "content": "<extra_id_84>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32016": {
+ "content": "<extra_id_83>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32017": {
+ "content": "<extra_id_82>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32018": {
+ "content": "<extra_id_81>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32019": {
+ "content": "<extra_id_80>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32020": {
+ "content": "<extra_id_79>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32021": {
+ "content": "<extra_id_78>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32022": {
+ "content": "<extra_id_77>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32023": {
+ "content": "<extra_id_76>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32024": {
+ "content": "<extra_id_75>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32025": {
+ "content": "<extra_id_74>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32026": {
+ "content": "<extra_id_73>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32027": {
+ "content": "<extra_id_72>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32028": {
+ "content": "<extra_id_71>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32029": {
+ "content": "<extra_id_70>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32030": {
+ "content": "<extra_id_69>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32031": {
+ "content": "<extra_id_68>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32032": {
+ "content": "<extra_id_67>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32033": {
+ "content": "<extra_id_66>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32034": {
+ "content": "<extra_id_65>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32035": {
+ "content": "<extra_id_64>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32036": {
+ "content": "<extra_id_63>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32037": {
+ "content": "<extra_id_62>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32038": {
+ "content": "<extra_id_61>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32039": {
+ "content": "<extra_id_60>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32040": {
+ "content": "<extra_id_59>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32041": {
+ "content": "<extra_id_58>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32042": {
+ "content": "<extra_id_57>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32043": {
+ "content": "<extra_id_56>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32044": {
+ "content": "<extra_id_55>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
+ },
+ "32045": {
+ "content": "<extra_id_54>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": true,
+ "single_word": false,
+ "special": true
395
+ },
396
+ "32046": {
397
+ "content": "<extra_id_53>",
398
+ "lstrip": true,
399
+ "normalized": false,
400
+ "rstrip": true,
401
+ "single_word": false,
402
+ "special": true
403
+ },
404
+ "32047": {
405
+ "content": "<extra_id_52>",
406
+ "lstrip": true,
407
+ "normalized": false,
408
+ "rstrip": true,
409
+ "single_word": false,
410
+ "special": true
411
+ },
412
+ "32048": {
413
+ "content": "<extra_id_51>",
414
+ "lstrip": true,
415
+ "normalized": false,
416
+ "rstrip": true,
417
+ "single_word": false,
418
+ "special": true
419
+ },
420
+ "32049": {
421
+ "content": "<extra_id_50>",
422
+ "lstrip": true,
423
+ "normalized": false,
424
+ "rstrip": true,
425
+ "single_word": false,
426
+ "special": true
427
+ },
428
+ "32050": {
429
+ "content": "<extra_id_49>",
430
+ "lstrip": true,
431
+ "normalized": false,
432
+ "rstrip": true,
433
+ "single_word": false,
434
+ "special": true
435
+ },
436
+ "32051": {
437
+ "content": "<extra_id_48>",
438
+ "lstrip": true,
439
+ "normalized": false,
440
+ "rstrip": true,
441
+ "single_word": false,
442
+ "special": true
443
+ },
444
+ "32052": {
445
+ "content": "<extra_id_47>",
446
+ "lstrip": true,
447
+ "normalized": false,
448
+ "rstrip": true,
449
+ "single_word": false,
450
+ "special": true
451
+ },
452
+ "32053": {
453
+ "content": "<extra_id_46>",
454
+ "lstrip": true,
455
+ "normalized": false,
456
+ "rstrip": true,
457
+ "single_word": false,
458
+ "special": true
459
+ },
460
+ "32054": {
461
+ "content": "<extra_id_45>",
462
+ "lstrip": true,
463
+ "normalized": false,
464
+ "rstrip": true,
465
+ "single_word": false,
466
+ "special": true
467
+ },
468
+ "32055": {
469
+ "content": "<extra_id_44>",
470
+ "lstrip": true,
471
+ "normalized": false,
472
+ "rstrip": true,
473
+ "single_word": false,
474
+ "special": true
475
+ },
476
+ "32056": {
477
+ "content": "<extra_id_43>",
478
+ "lstrip": true,
479
+ "normalized": false,
480
+ "rstrip": true,
481
+ "single_word": false,
482
+ "special": true
483
+ },
484
+ "32057": {
485
+ "content": "<extra_id_42>",
486
+ "lstrip": true,
487
+ "normalized": false,
488
+ "rstrip": true,
489
+ "single_word": false,
490
+ "special": true
491
+ },
492
+ "32058": {
493
+ "content": "<extra_id_41>",
494
+ "lstrip": true,
495
+ "normalized": false,
496
+ "rstrip": true,
497
+ "single_word": false,
498
+ "special": true
499
+ },
500
+ "32059": {
501
+ "content": "<extra_id_40>",
502
+ "lstrip": true,
503
+ "normalized": false,
504
+ "rstrip": true,
505
+ "single_word": false,
506
+ "special": true
507
+ },
508
+ "32060": {
509
+ "content": "<extra_id_39>",
510
+ "lstrip": true,
511
+ "normalized": false,
512
+ "rstrip": true,
513
+ "single_word": false,
514
+ "special": true
515
+ },
516
+ "32061": {
517
+ "content": "<extra_id_38>",
518
+ "lstrip": true,
519
+ "normalized": false,
520
+ "rstrip": true,
521
+ "single_word": false,
522
+ "special": true
523
+ },
524
+ "32062": {
525
+ "content": "<extra_id_37>",
526
+ "lstrip": true,
527
+ "normalized": false,
528
+ "rstrip": true,
529
+ "single_word": false,
530
+ "special": true
531
+ },
532
+ "32063": {
533
+ "content": "<extra_id_36>",
534
+ "lstrip": true,
535
+ "normalized": false,
536
+ "rstrip": true,
537
+ "single_word": false,
538
+ "special": true
539
+ },
540
+ "32064": {
541
+ "content": "<extra_id_35>",
542
+ "lstrip": true,
543
+ "normalized": false,
544
+ "rstrip": true,
545
+ "single_word": false,
546
+ "special": true
547
+ },
548
+ "32065": {
549
+ "content": "<extra_id_34>",
550
+ "lstrip": true,
551
+ "normalized": false,
552
+ "rstrip": true,
553
+ "single_word": false,
554
+ "special": true
555
+ },
556
+ "32066": {
557
+ "content": "<extra_id_33>",
558
+ "lstrip": true,
559
+ "normalized": false,
560
+ "rstrip": true,
561
+ "single_word": false,
562
+ "special": true
563
+ },
564
+ "32067": {
565
+ "content": "<extra_id_32>",
566
+ "lstrip": true,
567
+ "normalized": false,
568
+ "rstrip": true,
569
+ "single_word": false,
570
+ "special": true
571
+ },
572
+ "32068": {
573
+ "content": "<extra_id_31>",
574
+ "lstrip": true,
575
+ "normalized": false,
576
+ "rstrip": true,
577
+ "single_word": false,
578
+ "special": true
579
+ },
580
+ "32069": {
581
+ "content": "<extra_id_30>",
582
+ "lstrip": true,
583
+ "normalized": false,
584
+ "rstrip": true,
585
+ "single_word": false,
586
+ "special": true
587
+ },
588
+ "32070": {
589
+ "content": "<extra_id_29>",
590
+ "lstrip": true,
591
+ "normalized": false,
592
+ "rstrip": true,
593
+ "single_word": false,
594
+ "special": true
595
+ },
596
+ "32071": {
597
+ "content": "<extra_id_28>",
598
+ "lstrip": true,
599
+ "normalized": false,
600
+ "rstrip": true,
601
+ "single_word": false,
602
+ "special": true
603
+ },
604
+ "32072": {
605
+ "content": "<extra_id_27>",
606
+ "lstrip": true,
607
+ "normalized": false,
608
+ "rstrip": true,
609
+ "single_word": false,
610
+ "special": true
611
+ },
612
+ "32073": {
613
+ "content": "<extra_id_26>",
614
+ "lstrip": true,
615
+ "normalized": false,
616
+ "rstrip": true,
617
+ "single_word": false,
618
+ "special": true
619
+ },
620
+ "32074": {
621
+ "content": "<extra_id_25>",
622
+ "lstrip": true,
623
+ "normalized": false,
624
+ "rstrip": true,
625
+ "single_word": false,
626
+ "special": true
627
+ },
628
+ "32075": {
629
+ "content": "<extra_id_24>",
630
+ "lstrip": true,
631
+ "normalized": false,
632
+ "rstrip": true,
633
+ "single_word": false,
634
+ "special": true
635
+ },
636
+ "32076": {
637
+ "content": "<extra_id_23>",
638
+ "lstrip": true,
639
+ "normalized": false,
640
+ "rstrip": true,
641
+ "single_word": false,
642
+ "special": true
643
+ },
644
+ "32077": {
645
+ "content": "<extra_id_22>",
646
+ "lstrip": true,
647
+ "normalized": false,
648
+ "rstrip": true,
649
+ "single_word": false,
650
+ "special": true
651
+ },
652
+ "32078": {
653
+ "content": "<extra_id_21>",
654
+ "lstrip": true,
655
+ "normalized": false,
656
+ "rstrip": true,
657
+ "single_word": false,
658
+ "special": true
659
+ },
660
+ "32079": {
661
+ "content": "<extra_id_20>",
662
+ "lstrip": true,
663
+ "normalized": false,
664
+ "rstrip": true,
665
+ "single_word": false,
666
+ "special": true
667
+ },
668
+ "32080": {
669
+ "content": "<extra_id_19>",
670
+ "lstrip": true,
671
+ "normalized": false,
672
+ "rstrip": true,
673
+ "single_word": false,
674
+ "special": true
675
+ },
676
+ "32081": {
677
+ "content": "<extra_id_18>",
678
+ "lstrip": true,
679
+ "normalized": false,
680
+ "rstrip": true,
681
+ "single_word": false,
682
+ "special": true
683
+ },
684
+ "32082": {
685
+ "content": "<extra_id_17>",
686
+ "lstrip": true,
687
+ "normalized": false,
688
+ "rstrip": true,
689
+ "single_word": false,
690
+ "special": true
691
+ },
692
+ "32083": {
693
+ "content": "<extra_id_16>",
694
+ "lstrip": true,
695
+ "normalized": false,
696
+ "rstrip": true,
697
+ "single_word": false,
698
+ "special": true
699
+ },
700
+ "32084": {
701
+ "content": "<extra_id_15>",
702
+ "lstrip": true,
703
+ "normalized": false,
704
+ "rstrip": true,
705
+ "single_word": false,
706
+ "special": true
707
+ },
708
+ "32085": {
709
+ "content": "<extra_id_14>",
710
+ "lstrip": true,
711
+ "normalized": false,
712
+ "rstrip": true,
713
+ "single_word": false,
714
+ "special": true
715
+ },
716
+ "32086": {
717
+ "content": "<extra_id_13>",
718
+ "lstrip": true,
719
+ "normalized": false,
720
+ "rstrip": true,
721
+ "single_word": false,
722
+ "special": true
723
+ },
724
+ "32087": {
725
+ "content": "<extra_id_12>",
726
+ "lstrip": true,
727
+ "normalized": false,
728
+ "rstrip": true,
729
+ "single_word": false,
730
+ "special": true
731
+ },
732
+ "32088": {
733
+ "content": "<extra_id_11>",
734
+ "lstrip": true,
735
+ "normalized": false,
736
+ "rstrip": true,
737
+ "single_word": false,
738
+ "special": true
739
+ },
740
+ "32089": {
741
+ "content": "<extra_id_10>",
742
+ "lstrip": true,
743
+ "normalized": false,
744
+ "rstrip": true,
745
+ "single_word": false,
746
+ "special": true
747
+ },
748
+ "32090": {
749
+ "content": "<extra_id_9>",
750
+ "lstrip": true,
751
+ "normalized": false,
752
+ "rstrip": true,
753
+ "single_word": false,
754
+ "special": true
755
+ },
756
+ "32091": {
757
+ "content": "<extra_id_8>",
758
+ "lstrip": true,
759
+ "normalized": false,
760
+ "rstrip": true,
761
+ "single_word": false,
762
+ "special": true
763
+ },
764
+ "32092": {
765
+ "content": "<extra_id_7>",
766
+ "lstrip": true,
767
+ "normalized": false,
768
+ "rstrip": true,
769
+ "single_word": false,
770
+ "special": true
771
+ },
772
+ "32093": {
773
+ "content": "<extra_id_6>",
774
+ "lstrip": true,
775
+ "normalized": false,
776
+ "rstrip": true,
777
+ "single_word": false,
778
+ "special": true
779
+ },
780
+ "32094": {
781
+ "content": "<extra_id_5>",
782
+ "lstrip": true,
783
+ "normalized": false,
784
+ "rstrip": true,
785
+ "single_word": false,
786
+ "special": true
787
+ },
788
+ "32095": {
789
+ "content": "<extra_id_4>",
790
+ "lstrip": true,
791
+ "normalized": false,
792
+ "rstrip": true,
793
+ "single_word": false,
794
+ "special": true
795
+ },
796
+ "32096": {
797
+ "content": "<extra_id_3>",
798
+ "lstrip": true,
799
+ "normalized": false,
800
+ "rstrip": true,
801
+ "single_word": false,
802
+ "special": true
803
+ },
804
+ "32097": {
805
+ "content": "<extra_id_2>",
806
+ "lstrip": true,
807
+ "normalized": false,
808
+ "rstrip": true,
809
+ "single_word": false,
810
+ "special": true
811
+ },
812
+ "32098": {
813
+ "content": "<extra_id_1>",
814
+ "lstrip": true,
815
+ "normalized": false,
816
+ "rstrip": true,
817
+ "single_word": false,
818
+ "special": true
819
+ },
820
+ "32099": {
821
+ "content": "<extra_id_0>",
822
+ "lstrip": true,
823
+ "normalized": false,
824
+ "rstrip": true,
825
+ "single_word": false,
826
+ "special": true
827
+ }
828
+ },
829
+ "additional_special_tokens": [
830
+ "<extra_id_0>",
831
+ "<extra_id_1>",
832
+ "<extra_id_2>",
833
+ "<extra_id_3>",
834
+ "<extra_id_4>",
835
+ "<extra_id_5>",
836
+ "<extra_id_6>",
837
+ "<extra_id_7>",
838
+ "<extra_id_8>",
839
+ "<extra_id_9>",
840
+ "<extra_id_10>",
841
+ "<extra_id_11>",
842
+ "<extra_id_12>",
843
+ "<extra_id_13>",
844
+ "<extra_id_14>",
845
+ "<extra_id_15>",
846
+ "<extra_id_16>",
847
+ "<extra_id_17>",
848
+ "<extra_id_18>",
849
+ "<extra_id_19>",
850
+ "<extra_id_20>",
851
+ "<extra_id_21>",
852
+ "<extra_id_22>",
853
+ "<extra_id_23>",
854
+ "<extra_id_24>",
855
+ "<extra_id_25>",
856
+ "<extra_id_26>",
857
+ "<extra_id_27>",
858
+ "<extra_id_28>",
859
+ "<extra_id_29>",
860
+ "<extra_id_30>",
861
+ "<extra_id_31>",
862
+ "<extra_id_32>",
863
+ "<extra_id_33>",
864
+ "<extra_id_34>",
865
+ "<extra_id_35>",
866
+ "<extra_id_36>",
867
+ "<extra_id_37>",
868
+ "<extra_id_38>",
869
+ "<extra_id_39>",
870
+ "<extra_id_40>",
871
+ "<extra_id_41>",
872
+ "<extra_id_42>",
873
+ "<extra_id_43>",
874
+ "<extra_id_44>",
875
+ "<extra_id_45>",
876
+ "<extra_id_46>",
877
+ "<extra_id_47>",
878
+ "<extra_id_48>",
879
+ "<extra_id_49>",
880
+ "<extra_id_50>",
881
+ "<extra_id_51>",
882
+ "<extra_id_52>",
883
+ "<extra_id_53>",
884
+ "<extra_id_54>",
885
+ "<extra_id_55>",
886
+ "<extra_id_56>",
887
+ "<extra_id_57>",
888
+ "<extra_id_58>",
889
+ "<extra_id_59>",
890
+ "<extra_id_60>",
891
+ "<extra_id_61>",
892
+ "<extra_id_62>",
893
+ "<extra_id_63>",
894
+ "<extra_id_64>",
895
+ "<extra_id_65>",
896
+ "<extra_id_66>",
897
+ "<extra_id_67>",
898
+ "<extra_id_68>",
899
+ "<extra_id_69>",
900
+ "<extra_id_70>",
901
+ "<extra_id_71>",
902
+ "<extra_id_72>",
903
+ "<extra_id_73>",
904
+ "<extra_id_74>",
905
+ "<extra_id_75>",
906
+ "<extra_id_76>",
907
+ "<extra_id_77>",
908
+ "<extra_id_78>",
909
+ "<extra_id_79>",
910
+ "<extra_id_80>",
911
+ "<extra_id_81>",
912
+ "<extra_id_82>",
913
+ "<extra_id_83>",
914
+ "<extra_id_84>",
915
+ "<extra_id_85>",
916
+ "<extra_id_86>",
917
+ "<extra_id_87>",
918
+ "<extra_id_88>",
919
+ "<extra_id_89>",
920
+ "<extra_id_90>",
921
+ "<extra_id_91>",
922
+ "<extra_id_92>",
923
+ "<extra_id_93>",
924
+ "<extra_id_94>",
925
+ "<extra_id_95>",
926
+ "<extra_id_96>",
927
+ "<extra_id_97>",
928
+ "<extra_id_98>",
929
+ "<extra_id_99>"
930
+ ],
931
+ "clean_up_tokenization_spaces": false,
932
+ "eos_token": "</s>",
933
+ "extra_ids": 100,
934
+ "legacy": true,
935
+ "model_max_length": 128,
936
+ "pad_token": "<pad>",
937
+ "sp_model_kwargs": {},
938
+ "tokenizer_class": "T5Tokenizer",
939
+ "unk_token": "<unk>"
940
+ }
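The `added_tokens_decoder` entries above follow T5's sentinel-token convention: the base SentencePiece vocab has 32000 entries, and the 100 `extra_ids` sentinels are appended in reverse order, so `<extra_id_0>` gets the highest id. A minimal pure-Python sketch of that mapping (no model files are read; the constants are taken from the config above):

```python
# Sketch: id layout of T5 sentinel ("extra_id") tokens as seen in the
# tokenizer_config.json above. Sentinels are numbered from the top of the
# vocab downward: <extra_id_0> -> 32099, <extra_id_75> -> 32024, etc.

BASE_VOCAB_SIZE = 32000  # size of the base T5 SentencePiece vocab
EXTRA_IDS = 100          # "extra_ids": 100 in the config

def sentinel_id(n: int) -> int:
    """Vocab id of <extra_id_n>; sentinels run in reverse order."""
    assert 0 <= n < EXTRA_IDS
    return BASE_VOCAB_SIZE + EXTRA_IDS - 1 - n

print(sentinel_id(0))   # 32099, matching the "32099" entry above
print(sentinel_id(75))  # 32024, matching the "32024" entry above
```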
LTX-Video/MODEL_DIR/transformer/config.json ADDED
@@ -0,0 +1,19 @@
+ {
+ "_class_name": "LTXVideoTransformer3DModel",
+ "_diffusers_version": "0.32.0.dev0",
+ "activation_fn": "gelu-approximate",
+ "attention_bias": true,
+ "attention_head_dim": 64,
+ "attention_out_bias": true,
+ "caption_channels": 4096,
+ "cross_attention_dim": 2048,
+ "in_channels": 128,
+ "norm_elementwise_affine": false,
+ "norm_eps": 1e-06,
+ "num_attention_heads": 32,
+ "num_layers": 28,
+ "out_channels": 128,
+ "patch_size": 1,
+ "patch_size_t": 1,
+ "qk_norm": "rms_norm_across_heads"
+ }
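A quick consistency check on the transformer config above: the attention inner dimension is `num_attention_heads * attention_head_dim`, which should line up with `cross_attention_dim` (2048). A small sketch using the values from the config (the dict below just copies those fields):

```python
# Sketch: sanity-check the LTXVideoTransformer3DModel config values above.
# The attention inner dim (heads * head_dim) should equal cross_attention_dim.

config = {
    "attention_head_dim": 64,
    "num_attention_heads": 32,
    "cross_attention_dim": 2048,
    "caption_channels": 4096,  # T5-XXL hidden size, projected down to 2048
}

inner_dim = config["num_attention_heads"] * config["attention_head_dim"]
print(inner_dim)  # 2048
assert inner_dim == config["cross_attention_dim"]
```

This is also why `caption_projection` exists in the weight map further down: the 4096-dim T5-XXL embeddings are projected into the transformer's 2048-dim space.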
LTX-Video/MODEL_DIR/transformer/diffusion_pytorch_model-00001-of-00002.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8acd3e0bda74f7434259a4543a324211ddd82580fcc727df236b2414591eadc8
+ size 4939189200
LTX-Video/MODEL_DIR/transformer/diffusion_pytorch_model-00002-of-00002.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:03b3c822c31e1a9e00f6f575aa1b6f3cc4cc3797f60dcced537c8600bf1e9019
+ size 2754433648
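The two `.safetensors` files above are committed as Git LFS pointer files: a short text stub carrying the spec version, the `sha256` object id, and the byte size, while the actual weights live in LFS storage. A minimal sketch of parsing such a pointer (the `parse_lfs_pointer` helper is illustrative, not part of any library; the sample text is the first pointer above):

```python
# Sketch: parse a git-lfs pointer file like the .safetensors entries above.
# Each pointer has three "key value" lines: version, oid, and size.

POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:8acd3e0bda74f7434259a4543a324211ddd82580fcc727df236b2414591eadc8
size 4939189200
"""

def parse_lfs_pointer(text: str) -> dict:
    """Split each line on the first space, then split the oid into algo:digest."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"version": fields["version"], "algo": algo,
            "digest": digest, "size": int(fields["size"])}

info = parse_lfs_pointer(POINTER)
print(info["size"])  # 4939189200 (~4.9 GB shard)
```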
LTX-Video/MODEL_DIR/transformer/diffusion_pytorch_model.safetensors.index.json ADDED
@@ -0,0 +1,722 @@
+ {
+ "metadata": {
+ "total_size": 7693541888
+ },
+ "weight_map": {
+ "caption_projection.linear_1.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "caption_projection.linear_1.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "caption_projection.linear_2.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "caption_projection.linear_2.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "proj_in.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "proj_in.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "proj_out.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "proj_out.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "scale_shift_table": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "time_embed.emb.timestep_embedder.linear_1.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "time_embed.emb.timestep_embedder.linear_1.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "time_embed.emb.timestep_embedder.linear_2.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "time_embed.emb.timestep_embedder.linear_2.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "time_embed.linear.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "time_embed.linear.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.0.attn1.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.0.attn1.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.0.attn1.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.0.attn1.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.0.attn1.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.0.attn1.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.0.attn1.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.0.attn1.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.0.attn1.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.0.attn1.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.0.attn2.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.0.attn2.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.0.attn2.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.0.attn2.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.0.attn2.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.0.attn2.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.0.attn2.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.0.attn2.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.0.attn2.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.0.attn2.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.0.ff.net.0.proj.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.0.ff.net.0.proj.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.0.ff.net.2.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.0.ff.net.2.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.0.scale_shift_table": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.1.attn1.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.1.attn1.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.1.attn1.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.1.attn1.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.1.attn1.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.1.attn1.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.1.attn1.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.1.attn1.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.1.attn1.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.1.attn1.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.1.attn2.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.1.attn2.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.1.attn2.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.1.attn2.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.1.attn2.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.1.attn2.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.1.attn2.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.1.attn2.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.1.attn2.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.1.attn2.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.1.ff.net.0.proj.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.1.ff.net.0.proj.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.1.ff.net.2.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.1.ff.net.2.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.1.scale_shift_table": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.10.attn1.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.10.attn1.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.10.attn1.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.10.attn1.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.10.attn1.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.10.attn1.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.10.attn1.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.10.attn1.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.10.attn1.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.10.attn1.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.10.attn2.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.10.attn2.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.10.attn2.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.10.attn2.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.10.attn2.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.10.attn2.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.10.attn2.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.10.attn2.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.10.attn2.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.10.attn2.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.10.ff.net.0.proj.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.10.ff.net.0.proj.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.10.ff.net.2.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.10.ff.net.2.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.10.scale_shift_table": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.11.attn1.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.11.attn1.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.11.attn1.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.11.attn1.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.11.attn1.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.11.attn1.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.11.attn1.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.11.attn1.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.11.attn1.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.11.attn1.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.11.attn2.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.11.attn2.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.11.attn2.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.11.attn2.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.11.attn2.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.11.attn2.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.11.attn2.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.11.attn2.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.11.attn2.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.11.attn2.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.11.ff.net.0.proj.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.11.ff.net.0.proj.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.11.ff.net.2.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.11.ff.net.2.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.11.scale_shift_table": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.12.attn1.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.12.attn1.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.12.attn1.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.12.attn1.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.12.attn1.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.12.attn1.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.12.attn1.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.12.attn1.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.12.attn1.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.12.attn1.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.12.attn2.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.12.attn2.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.12.attn2.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.12.attn2.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.12.attn2.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.12.attn2.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.12.attn2.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.12.attn2.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.12.attn2.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.12.attn2.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.12.ff.net.0.proj.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.12.ff.net.0.proj.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.12.ff.net.2.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.12.ff.net.2.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.12.scale_shift_table": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.13.attn1.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.13.attn1.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.13.attn1.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.13.attn1.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.13.attn1.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.13.attn1.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.13.attn1.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.13.attn1.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.13.attn1.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.13.attn1.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.13.attn2.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.13.attn2.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.13.attn2.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.13.attn2.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.13.attn2.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.13.attn2.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.13.attn2.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.13.attn2.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.13.attn2.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.13.attn2.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.13.ff.net.0.proj.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.13.ff.net.0.proj.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.13.ff.net.2.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.13.ff.net.2.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.13.scale_shift_table": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.14.attn1.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
172
+ "transformer_blocks.14.attn1.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
173
+ "transformer_blocks.14.attn1.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
174
+ "transformer_blocks.14.attn1.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
175
+ "transformer_blocks.14.attn1.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
176
+ "transformer_blocks.14.attn1.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
177
+ "transformer_blocks.14.attn1.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
178
+ "transformer_blocks.14.attn1.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
179
+ "transformer_blocks.14.attn1.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
180
+ "transformer_blocks.14.attn1.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
181
+ "transformer_blocks.14.attn2.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
182
+ "transformer_blocks.14.attn2.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
183
+ "transformer_blocks.14.attn2.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
184
+ "transformer_blocks.14.attn2.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
185
+ "transformer_blocks.14.attn2.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
186
+ "transformer_blocks.14.attn2.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
187
+ "transformer_blocks.14.attn2.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
188
+ "transformer_blocks.14.attn2.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
189
+ "transformer_blocks.14.attn2.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
190
+ "transformer_blocks.14.attn2.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
191
+ "transformer_blocks.14.ff.net.0.proj.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
192
+ "transformer_blocks.14.ff.net.0.proj.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
193
+ "transformer_blocks.14.ff.net.2.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
194
+ "transformer_blocks.14.ff.net.2.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
195
+ "transformer_blocks.14.scale_shift_table": "diffusion_pytorch_model-00001-of-00002.safetensors",
196
+ "transformer_blocks.15.attn1.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
197
+ "transformer_blocks.15.attn1.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
198
+ "transformer_blocks.15.attn1.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
199
+ "transformer_blocks.15.attn1.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
200
+ "transformer_blocks.15.attn1.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
201
+ "transformer_blocks.15.attn1.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
202
+ "transformer_blocks.15.attn1.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
203
+ "transformer_blocks.15.attn1.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
204
+ "transformer_blocks.15.attn1.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
205
+ "transformer_blocks.15.attn1.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
206
+ "transformer_blocks.15.attn2.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
207
+ "transformer_blocks.15.attn2.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
208
+ "transformer_blocks.15.attn2.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
209
+ "transformer_blocks.15.attn2.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
210
+ "transformer_blocks.15.attn2.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
211
+ "transformer_blocks.15.attn2.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
212
+ "transformer_blocks.15.attn2.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
213
+ "transformer_blocks.15.attn2.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
214
+ "transformer_blocks.15.attn2.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
215
+ "transformer_blocks.15.attn2.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
216
+ "transformer_blocks.15.ff.net.0.proj.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
217
+ "transformer_blocks.15.ff.net.0.proj.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
218
+ "transformer_blocks.15.ff.net.2.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
219
+ "transformer_blocks.15.ff.net.2.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
220
+ "transformer_blocks.15.scale_shift_table": "diffusion_pytorch_model-00001-of-00002.safetensors",
221
+ "transformer_blocks.16.attn1.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
222
+ "transformer_blocks.16.attn1.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
223
+ "transformer_blocks.16.attn1.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
224
+ "transformer_blocks.16.attn1.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
225
+ "transformer_blocks.16.attn1.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
226
+ "transformer_blocks.16.attn1.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
227
+ "transformer_blocks.16.attn1.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
228
+ "transformer_blocks.16.attn1.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
229
+ "transformer_blocks.16.attn1.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
230
+ "transformer_blocks.16.attn1.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
231
+ "transformer_blocks.16.attn2.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
232
+ "transformer_blocks.16.attn2.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
233
+ "transformer_blocks.16.attn2.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
234
+ "transformer_blocks.16.attn2.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
235
+ "transformer_blocks.16.attn2.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
236
+ "transformer_blocks.16.attn2.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
237
+ "transformer_blocks.16.attn2.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
238
+ "transformer_blocks.16.attn2.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
239
+ "transformer_blocks.16.attn2.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
240
+ "transformer_blocks.16.attn2.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
241
+ "transformer_blocks.16.ff.net.0.proj.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
242
+ "transformer_blocks.16.ff.net.0.proj.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
243
+ "transformer_blocks.16.ff.net.2.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
244
+ "transformer_blocks.16.ff.net.2.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
245
+ "transformer_blocks.16.scale_shift_table": "diffusion_pytorch_model-00001-of-00002.safetensors",
246
+ "transformer_blocks.17.attn1.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
247
+ "transformer_blocks.17.attn1.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
248
+ "transformer_blocks.17.attn1.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
249
+ "transformer_blocks.17.attn1.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
250
+ "transformer_blocks.17.attn1.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
251
+ "transformer_blocks.17.attn1.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
252
+ "transformer_blocks.17.attn1.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
253
+ "transformer_blocks.17.attn1.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
254
+ "transformer_blocks.17.attn1.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
255
+ "transformer_blocks.17.attn1.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
256
+ "transformer_blocks.17.attn2.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
257
+ "transformer_blocks.17.attn2.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
258
+ "transformer_blocks.17.attn2.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
259
+ "transformer_blocks.17.attn2.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
260
+ "transformer_blocks.17.attn2.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
261
+ "transformer_blocks.17.attn2.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
262
+ "transformer_blocks.17.attn2.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
263
+ "transformer_blocks.17.attn2.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
264
+ "transformer_blocks.17.attn2.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
265
+ "transformer_blocks.17.attn2.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
266
+ "transformer_blocks.17.ff.net.0.proj.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
267
+ "transformer_blocks.17.ff.net.0.proj.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
268
+ "transformer_blocks.17.ff.net.2.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
269
+ "transformer_blocks.17.ff.net.2.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
270
+ "transformer_blocks.17.scale_shift_table": "diffusion_pytorch_model-00001-of-00002.safetensors",
271
+ "transformer_blocks.18.attn1.norm_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
272
+ "transformer_blocks.18.attn1.norm_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
273
+ "transformer_blocks.18.attn1.to_k.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
274
+ "transformer_blocks.18.attn1.to_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
275
+ "transformer_blocks.18.attn1.to_out.0.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
276
+ "transformer_blocks.18.attn1.to_out.0.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
277
+ "transformer_blocks.18.attn1.to_q.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
278
+ "transformer_blocks.18.attn1.to_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
279
+ "transformer_blocks.18.attn1.to_v.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
280
+ "transformer_blocks.18.attn1.to_v.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
281
+ "transformer_blocks.18.attn2.norm_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
282
+ "transformer_blocks.18.attn2.norm_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
283
+ "transformer_blocks.18.attn2.to_k.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
284
+ "transformer_blocks.18.attn2.to_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
285
+ "transformer_blocks.18.attn2.to_out.0.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
286
+ "transformer_blocks.18.attn2.to_out.0.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
287
+ "transformer_blocks.18.attn2.to_q.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
288
+ "transformer_blocks.18.attn2.to_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
289
+ "transformer_blocks.18.attn2.to_v.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
290
+ "transformer_blocks.18.attn2.to_v.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
291
+ "transformer_blocks.18.ff.net.0.proj.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
292
+ "transformer_blocks.18.ff.net.0.proj.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
293
+ "transformer_blocks.18.ff.net.2.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
294
+ "transformer_blocks.18.ff.net.2.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
295
+ "transformer_blocks.18.scale_shift_table": "diffusion_pytorch_model-00002-of-00002.safetensors",
296
+ "transformer_blocks.19.attn1.norm_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
297
+ "transformer_blocks.19.attn1.norm_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
298
+ "transformer_blocks.19.attn1.to_k.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
299
+ "transformer_blocks.19.attn1.to_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
300
+ "transformer_blocks.19.attn1.to_out.0.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
301
+ "transformer_blocks.19.attn1.to_out.0.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
302
+ "transformer_blocks.19.attn1.to_q.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
303
+ "transformer_blocks.19.attn1.to_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
304
+ "transformer_blocks.19.attn1.to_v.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
305
+ "transformer_blocks.19.attn1.to_v.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
306
+ "transformer_blocks.19.attn2.norm_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
307
+ "transformer_blocks.19.attn2.norm_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
308
+ "transformer_blocks.19.attn2.to_k.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
309
+ "transformer_blocks.19.attn2.to_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
310
+ "transformer_blocks.19.attn2.to_out.0.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
311
+ "transformer_blocks.19.attn2.to_out.0.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
312
+ "transformer_blocks.19.attn2.to_q.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
313
+ "transformer_blocks.19.attn2.to_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
314
+ "transformer_blocks.19.attn2.to_v.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
315
+ "transformer_blocks.19.attn2.to_v.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
316
+ "transformer_blocks.19.ff.net.0.proj.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
317
+ "transformer_blocks.19.ff.net.0.proj.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
318
+ "transformer_blocks.19.ff.net.2.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
319
+ "transformer_blocks.19.ff.net.2.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
320
+ "transformer_blocks.19.scale_shift_table": "diffusion_pytorch_model-00002-of-00002.safetensors",
321
+ "transformer_blocks.2.attn1.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
322
+ "transformer_blocks.2.attn1.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
323
+ "transformer_blocks.2.attn1.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
324
+ "transformer_blocks.2.attn1.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
325
+ "transformer_blocks.2.attn1.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
326
+ "transformer_blocks.2.attn1.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
327
+ "transformer_blocks.2.attn1.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
328
+ "transformer_blocks.2.attn1.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
329
+ "transformer_blocks.2.attn1.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
330
+ "transformer_blocks.2.attn1.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
331
+ "transformer_blocks.2.attn2.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
332
+ "transformer_blocks.2.attn2.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
333
+ "transformer_blocks.2.attn2.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
334
+ "transformer_blocks.2.attn2.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
335
+ "transformer_blocks.2.attn2.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
336
+ "transformer_blocks.2.attn2.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
337
+ "transformer_blocks.2.attn2.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
338
+ "transformer_blocks.2.attn2.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
339
+ "transformer_blocks.2.attn2.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
340
+ "transformer_blocks.2.attn2.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
341
+ "transformer_blocks.2.ff.net.0.proj.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
342
+ "transformer_blocks.2.ff.net.0.proj.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
343
+ "transformer_blocks.2.ff.net.2.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
344
+ "transformer_blocks.2.ff.net.2.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
345
+ "transformer_blocks.2.scale_shift_table": "diffusion_pytorch_model-00001-of-00002.safetensors",
346
+ "transformer_blocks.20.attn1.norm_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
347
+ "transformer_blocks.20.attn1.norm_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
348
+ "transformer_blocks.20.attn1.to_k.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
349
+ "transformer_blocks.20.attn1.to_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
350
+ "transformer_blocks.20.attn1.to_out.0.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
351
+ "transformer_blocks.20.attn1.to_out.0.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
352
+ "transformer_blocks.20.attn1.to_q.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
353
+ "transformer_blocks.20.attn1.to_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
354
+ "transformer_blocks.20.attn1.to_v.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
355
+ "transformer_blocks.20.attn1.to_v.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
356
+ "transformer_blocks.20.attn2.norm_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
357
+ "transformer_blocks.20.attn2.norm_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
358
+ "transformer_blocks.20.attn2.to_k.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
359
+ "transformer_blocks.20.attn2.to_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
360
+ "transformer_blocks.20.attn2.to_out.0.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
361
+ "transformer_blocks.20.attn2.to_out.0.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
362
+ "transformer_blocks.20.attn2.to_q.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
363
+ "transformer_blocks.20.attn2.to_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
364
+ "transformer_blocks.20.attn2.to_v.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
365
+ "transformer_blocks.20.attn2.to_v.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
366
+ "transformer_blocks.20.ff.net.0.proj.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
367
+ "transformer_blocks.20.ff.net.0.proj.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
368
+ "transformer_blocks.20.ff.net.2.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
369
+ "transformer_blocks.20.ff.net.2.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
370
+ "transformer_blocks.20.scale_shift_table": "diffusion_pytorch_model-00002-of-00002.safetensors",
371
+ "transformer_blocks.21.attn1.norm_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
372
+ "transformer_blocks.21.attn1.norm_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
373
+ "transformer_blocks.21.attn1.to_k.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
374
+ "transformer_blocks.21.attn1.to_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
375
+ "transformer_blocks.21.attn1.to_out.0.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
376
+ "transformer_blocks.21.attn1.to_out.0.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
377
+ "transformer_blocks.21.attn1.to_q.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
378
+ "transformer_blocks.21.attn1.to_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
379
+ "transformer_blocks.21.attn1.to_v.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
380
+ "transformer_blocks.21.attn1.to_v.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
381
+ "transformer_blocks.21.attn2.norm_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
382
+ "transformer_blocks.21.attn2.norm_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
383
+ "transformer_blocks.21.attn2.to_k.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
384
+ "transformer_blocks.21.attn2.to_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
385
+ "transformer_blocks.21.attn2.to_out.0.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
386
+ "transformer_blocks.21.attn2.to_out.0.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
387
+ "transformer_blocks.21.attn2.to_q.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
388
+ "transformer_blocks.21.attn2.to_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
389
+ "transformer_blocks.21.attn2.to_v.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
390
+ "transformer_blocks.21.attn2.to_v.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
391
+ "transformer_blocks.21.ff.net.0.proj.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
392
+ "transformer_blocks.21.ff.net.0.proj.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
393
+ "transformer_blocks.21.ff.net.2.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
394
+ "transformer_blocks.21.ff.net.2.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
395
+ "transformer_blocks.21.scale_shift_table": "diffusion_pytorch_model-00002-of-00002.safetensors",
396
+ "transformer_blocks.22.attn1.norm_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
397
+ "transformer_blocks.22.attn1.norm_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
398
+ "transformer_blocks.22.attn1.to_k.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
399
+ "transformer_blocks.22.attn1.to_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
400
+ "transformer_blocks.22.attn1.to_out.0.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
401
+ "transformer_blocks.22.attn1.to_out.0.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
402
+ "transformer_blocks.22.attn1.to_q.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
403
+ "transformer_blocks.22.attn1.to_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
404
+ "transformer_blocks.22.attn1.to_v.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
405
+ "transformer_blocks.22.attn1.to_v.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
406
+ "transformer_blocks.22.attn2.norm_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
407
+ "transformer_blocks.22.attn2.norm_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
408
+ "transformer_blocks.22.attn2.to_k.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
409
+ "transformer_blocks.22.attn2.to_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
410
+ "transformer_blocks.22.attn2.to_out.0.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
411
+ "transformer_blocks.22.attn2.to_out.0.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
412
+ "transformer_blocks.22.attn2.to_q.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
413
+ "transformer_blocks.22.attn2.to_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
414
+ "transformer_blocks.22.attn2.to_v.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
415
+ "transformer_blocks.22.attn2.to_v.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
416
+ "transformer_blocks.22.ff.net.0.proj.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
417
+ "transformer_blocks.22.ff.net.0.proj.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
418
+ "transformer_blocks.22.ff.net.2.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
419
+ "transformer_blocks.22.ff.net.2.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
420
+ "transformer_blocks.22.scale_shift_table": "diffusion_pytorch_model-00002-of-00002.safetensors",
421
+ "transformer_blocks.23.attn1.norm_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
422
+ "transformer_blocks.23.attn1.norm_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
423
+ "transformer_blocks.23.attn1.to_k.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
424
+ "transformer_blocks.23.attn1.to_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
425
+ "transformer_blocks.23.attn1.to_out.0.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
426
+ "transformer_blocks.23.attn1.to_out.0.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
427
+ "transformer_blocks.23.attn1.to_q.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
428
+ "transformer_blocks.23.attn1.to_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
429
+ "transformer_blocks.23.attn1.to_v.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
430
+ "transformer_blocks.23.attn1.to_v.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
431
+ "transformer_blocks.23.attn2.norm_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
432
+ "transformer_blocks.23.attn2.norm_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
433
+ "transformer_blocks.23.attn2.to_k.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
434
+ "transformer_blocks.23.attn2.to_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
435
+ "transformer_blocks.23.attn2.to_out.0.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
436
+ "transformer_blocks.23.attn2.to_out.0.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
437
+ "transformer_blocks.23.attn2.to_q.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
438
+ "transformer_blocks.23.attn2.to_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
439
+ "transformer_blocks.23.attn2.to_v.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
440
+ "transformer_blocks.23.attn2.to_v.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
441
+ "transformer_blocks.23.ff.net.0.proj.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
442
+ "transformer_blocks.23.ff.net.0.proj.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
443
+ "transformer_blocks.23.ff.net.2.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.23.ff.net.2.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.23.scale_shift_table": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.24.attn1.norm_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.24.attn1.norm_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.24.attn1.to_k.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.24.attn1.to_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.24.attn1.to_out.0.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.24.attn1.to_out.0.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.24.attn1.to_q.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.24.attn1.to_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.24.attn1.to_v.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.24.attn1.to_v.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.24.attn2.norm_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.24.attn2.norm_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.24.attn2.to_k.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.24.attn2.to_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.24.attn2.to_out.0.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.24.attn2.to_out.0.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.24.attn2.to_q.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.24.attn2.to_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.24.attn2.to_v.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.24.attn2.to_v.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.24.ff.net.0.proj.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.24.ff.net.0.proj.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.24.ff.net.2.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.24.ff.net.2.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.24.scale_shift_table": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.25.attn1.norm_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.25.attn1.norm_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.25.attn1.to_k.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.25.attn1.to_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.25.attn1.to_out.0.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.25.attn1.to_out.0.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.25.attn1.to_q.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.25.attn1.to_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.25.attn1.to_v.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.25.attn1.to_v.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.25.attn2.norm_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.25.attn2.norm_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.25.attn2.to_k.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.25.attn2.to_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.25.attn2.to_out.0.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.25.attn2.to_out.0.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.25.attn2.to_q.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.25.attn2.to_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.25.attn2.to_v.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.25.attn2.to_v.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.25.ff.net.0.proj.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.25.ff.net.0.proj.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.25.ff.net.2.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.25.ff.net.2.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.25.scale_shift_table": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.26.attn1.norm_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.26.attn1.norm_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.26.attn1.to_k.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.26.attn1.to_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.26.attn1.to_out.0.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.26.attn1.to_out.0.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.26.attn1.to_q.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.26.attn1.to_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.26.attn1.to_v.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.26.attn1.to_v.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.26.attn2.norm_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.26.attn2.norm_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.26.attn2.to_k.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.26.attn2.to_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.26.attn2.to_out.0.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.26.attn2.to_out.0.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.26.attn2.to_q.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.26.attn2.to_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.26.attn2.to_v.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.26.attn2.to_v.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.26.ff.net.0.proj.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.26.ff.net.0.proj.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.26.ff.net.2.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.26.ff.net.2.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.26.scale_shift_table": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.27.attn1.norm_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.27.attn1.norm_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.27.attn1.to_k.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.27.attn1.to_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.27.attn1.to_out.0.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.27.attn1.to_out.0.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.27.attn1.to_q.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.27.attn1.to_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.27.attn1.to_v.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.27.attn1.to_v.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.27.attn2.norm_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.27.attn2.norm_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.27.attn2.to_k.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.27.attn2.to_k.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.27.attn2.to_out.0.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.27.attn2.to_out.0.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.27.attn2.to_q.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.27.attn2.to_q.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.27.attn2.to_v.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.27.attn2.to_v.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.27.ff.net.0.proj.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.27.ff.net.0.proj.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.27.ff.net.2.bias": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.27.ff.net.2.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.27.scale_shift_table": "diffusion_pytorch_model-00002-of-00002.safetensors",
+ "transformer_blocks.3.attn1.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.3.attn1.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.3.attn1.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.3.attn1.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.3.attn1.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.3.attn1.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.3.attn1.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.3.attn1.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.3.attn1.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.3.attn1.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.3.attn2.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.3.attn2.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.3.attn2.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.3.attn2.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.3.attn2.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.3.attn2.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.3.attn2.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.3.attn2.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.3.attn2.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.3.attn2.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.3.ff.net.0.proj.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.3.ff.net.0.proj.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.3.ff.net.2.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.3.ff.net.2.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.3.scale_shift_table": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.4.attn1.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.4.attn1.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.4.attn1.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.4.attn1.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.4.attn1.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.4.attn1.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.4.attn1.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.4.attn1.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.4.attn1.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.4.attn1.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.4.attn2.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.4.attn2.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.4.attn2.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.4.attn2.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.4.attn2.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.4.attn2.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.4.attn2.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.4.attn2.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.4.attn2.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.4.attn2.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.4.ff.net.0.proj.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.4.ff.net.0.proj.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.4.ff.net.2.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.4.ff.net.2.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.4.scale_shift_table": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.5.attn1.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.5.attn1.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.5.attn1.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.5.attn1.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.5.attn1.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.5.attn1.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.5.attn1.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.5.attn1.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.5.attn1.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.5.attn1.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.5.attn2.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.5.attn2.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.5.attn2.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.5.attn2.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.5.attn2.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.5.attn2.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.5.attn2.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.5.attn2.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.5.attn2.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.5.attn2.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.5.ff.net.0.proj.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.5.ff.net.0.proj.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.5.ff.net.2.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.5.ff.net.2.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.5.scale_shift_table": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.6.attn1.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.6.attn1.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.6.attn1.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.6.attn1.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.6.attn1.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.6.attn1.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.6.attn1.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.6.attn1.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.6.attn1.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.6.attn1.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.6.attn2.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.6.attn2.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.6.attn2.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.6.attn2.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.6.attn2.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.6.attn2.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.6.attn2.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.6.attn2.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.6.attn2.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.6.attn2.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.6.ff.net.0.proj.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.6.ff.net.0.proj.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.6.ff.net.2.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.6.ff.net.2.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.6.scale_shift_table": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.7.attn1.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.7.attn1.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.7.attn1.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.7.attn1.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.7.attn1.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.7.attn1.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.7.attn1.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.7.attn1.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.7.attn1.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.7.attn1.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.7.attn2.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.7.attn2.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.7.attn2.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.7.attn2.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.7.attn2.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.7.attn2.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.7.attn2.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.7.attn2.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.7.attn2.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.7.attn2.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.7.ff.net.0.proj.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.7.ff.net.0.proj.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.7.ff.net.2.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.7.ff.net.2.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.7.scale_shift_table": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.8.attn1.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.8.attn1.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.8.attn1.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.8.attn1.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.8.attn1.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.8.attn1.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.8.attn1.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.8.attn1.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.8.attn1.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.8.attn1.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.8.attn2.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.8.attn2.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.8.attn2.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.8.attn2.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.8.attn2.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.8.attn2.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.8.attn2.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.8.attn2.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.8.attn2.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.8.attn2.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.8.ff.net.0.proj.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.8.ff.net.0.proj.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.8.ff.net.2.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.8.ff.net.2.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.8.scale_shift_table": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.9.attn1.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.9.attn1.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.9.attn1.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.9.attn1.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.9.attn1.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.9.attn1.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.9.attn1.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.9.attn1.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.9.attn1.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.9.attn1.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.9.attn2.norm_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.9.attn2.norm_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.9.attn2.to_k.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.9.attn2.to_k.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.9.attn2.to_out.0.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.9.attn2.to_out.0.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.9.attn2.to_q.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.9.attn2.to_q.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.9.attn2.to_v.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.9.attn2.to_v.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.9.ff.net.0.proj.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.9.ff.net.0.proj.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.9.ff.net.2.bias": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.9.ff.net.2.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
+ "transformer_blocks.9.scale_shift_table": "diffusion_pytorch_model-00001-of-00002.safetensors"
+ }
+ }
LTX-Video/MODEL_DIR/vae/config.json ADDED
@@ -0,0 +1,32 @@
+ {
+ "_class_name": "AutoencoderKLLTXVideo",
+ "_diffusers_version": "0.32.0.dev0",
+ "block_out_channels": [
+ 128,
+ 256,
+ 512,
+ 512
+ ],
+ "decoder_causal": false,
+ "encoder_causal": true,
+ "in_channels": 3,
+ "latent_channels": 128,
+ "layers_per_block": [
+ 4,
+ 3,
+ 3,
+ 3,
+ 4
+ ],
+ "out_channels": 3,
+ "patch_size": 4,
+ "patch_size_t": 1,
+ "resnet_norm_eps": 1e-06,
+ "scaling_factor": 1.0,
+ "spatio_temporal_scaling": [
+ true,
+ true,
+ true,
+ false
+ ]
+ }
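For orientation, the VAE config above implies the autoencoder's overall compression factors. A back-of-envelope check (assuming, as is conventional for this kind of causal video VAE, that each `true` entry in `spatio_temporal_scaling` halves the spatial dimensions and the temporal dimension, on top of the `patch_size`/`patch_size_t` patchification):

```python
# Values copied from the vae/config.json above.
config = {
    "patch_size": 4,
    "patch_size_t": 1,
    "spatio_temporal_scaling": [True, True, True, False],
}

# Number of downsampling stages (the final `False` stage keeps resolution).
n_scaling = sum(config["spatio_temporal_scaling"])

spatial_factor = config["patch_size"] * 2 ** n_scaling    # per height and width axis
temporal_factor = config["patch_size_t"] * 2 ** n_scaling  # per frame axis

print(spatial_factor, temporal_factor)  # 32 8
```

That is, one latent cell (with 128 `latent_channels`) would cover a 32x32 pixel patch across 8 frames, consistent with the heavy spatio-temporal compression this model family is known for.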
LTX-Video/MODEL_DIR/vae/diffusion_pytorch_model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:265ca87cb5dff5e37f924286e957324e282fe7710a952a7dafc0df43883e2010
+ size 1676798532
LTX-Video/README.md ADDED
@@ -0,0 +1,280 @@
+ <div align="center">
+
+ # LTX-Video
+
+ This is the official repository for LTX-Video.
+
+ [Website](https://www.lightricks.com/ltxv) |
+ [Model](https://huggingface.co/Lightricks/LTX-Video) |
+ [Demo](https://app.ltx.studio/ltx-video) |
+ [Paper](https://arxiv.org/abs/2501.00103)
+
+ </div>
+
+ ## Table of Contents
+
+ - [Introduction](#introduction)
+ - [What's new](#news)
+ - [Quick Start Guide](#quick-start-guide)
+   - [Online demo](#online-demo)
+   - [Run locally](#run-locally)
+     - [Installation](#installation)
+     - [Inference](#inference)
+   - [ComfyUI Integration](#comfyui-integration)
+   - [Diffusers Integration](#diffusers-integration)
+ - [Model User Guide](#model-user-guide)
+ - [Community Contribution](#community-contribution)
+ - [Training](#training)
+ - [Join Us!](#join-us)
+ - [Acknowledgement](#acknowledgement)
+
+ # Introduction
+
+ LTX-Video is the first DiT-based video generation model that can generate high-quality videos in *real-time*.
+ It can generate 24 FPS videos at 768x512 resolution, faster than it takes to watch them.
+ The model is trained on a large-scale dataset of diverse videos and can generate high-resolution videos
+ with realistic and diverse content.
+
+ The model supports text-to-video, image-to-video, keyframe-based animation, video extension (both forward and backward), video-to-video transformations, and any combination of these features.
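The capabilities listed above can be exercised through the Diffusers integration. A minimal text-to-video sketch (assuming the `diffusers` `LTXPipeline` API and the `Lightricks/LTX-Video` checkpoint; the prompt and generation parameters are illustrative, and a CUDA GPU with sufficient memory is required):

```python
import torch
from diffusers import LTXPipeline
from diffusers.utils import export_to_video

# Load the pipeline from the Hugging Face Hub checkpoint (large download).
pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16)
pipe.to("cuda")

prompt = "A woman with long brown hair smiles at another woman, warm natural lighting."
video = pipe(
    prompt=prompt,
    width=768,
    height=512,
    num_frames=161,          # ~6.7 s at 24 FPS
    num_inference_steps=50,
).frames[0]

export_to_video(video, "output.mp4", fps=24)
```

Image-to-video and the other conditioning modes use sibling pipelines (e.g. an image-conditioned variant) with the same general call shape.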
39
+
40
+ | | | | |
41
+ |:---:|:---:|:---:|:---:|
42
+ | ![example1](./docs/_static/ltx-video_example_00001.gif)<br><details style="max-width: 300px; margin: auto;"><summary>A woman with long brown hair and light skin smiles at another woman...</summary>A woman with long brown hair and light skin smiles at another woman with long blonde hair. The woman with brown hair wears a black jacket and has a small, barely noticeable mole on her right cheek. The camera angle is a close-up, focused on the woman with brown hair's face. The lighting is warm and natural, likely from the setting sun, casting a soft glow on the scene. The scene appears to be real-life footage.</details> | ![example2](./docs/_static/ltx-video_example_00002.gif)<br><details style="max-width: 300px; margin: auto;"><summary>A woman walks away from a white Jeep parked on a city street at night...</summary>A woman walks away from a white Jeep parked on a city street at night, then ascends a staircase and knocks on a door. The woman, wearing a dark jacket and jeans, walks away from the Jeep parked on the left side of the street, her back to the camera; she walks at a steady pace, her arms swinging slightly by her sides; the street is dimly lit, with streetlights casting pools of light on the wet pavement; a man in a dark jacket and jeans walks past the Jeep in the opposite direction; the camera follows the woman from behind as she walks up a set of stairs towards a building with a green door; she reaches the top of the stairs and turns left, continuing to walk towards the building; she reaches the door and knocks on it with her right hand; the camera remains stationary, focused on the doorway; the scene is captured in real-life footage.</details> | ![example3](./docs/_static/ltx-video_example_00003.gif)<br><details style="max-width: 300px; margin: auto;"><summary>A woman with blonde hair styled up, wearing a black dress...</summary>A woman with blonde hair styled up, wearing a black dress with sequins and pearl earrings, looks down with a sad expression on her face. The camera remains stationary, focused on the woman's face. The lighting is dim, casting soft shadows on her face. The scene appears to be from a movie or TV show.</details> | ![example4](./docs/_static/ltx-video_example_00004.gif)<br><details style="max-width: 300px; margin: auto;"><summary>The camera pans over a snow-covered mountain range...</summary>The camera pans over a snow-covered mountain range, revealing a vast expanse of snow-capped peaks and valleys.The mountains are covered in a thick layer of snow, with some areas appearing almost white while others have a slightly darker, almost grayish hue. The peaks are jagged and irregular, with some rising sharply into the sky while others are more rounded. The valleys are deep and narrow, with steep slopes that are also covered in snow. The trees in the foreground are mostly bare, with only a few leaves remaining on their branches. The sky is overcast, with thick clouds obscuring the sun. The overall impression is one of peace and tranquility, with the snow-covered mountains standing as a testament to the power and beauty of nature.</details> |
+ | ![example5](./docs/_static/ltx-video_example_00005.gif)<br><details style="max-width: 300px; margin: auto;"><summary>A woman with light skin, wearing a blue jacket and a black hat...</summary>A woman with light skin, wearing a blue jacket and a black hat with a veil, looks down and to her right, then back up as she speaks; she has brown hair styled in an updo, light brown eyebrows, and is wearing a white collared shirt under her jacket; the camera remains stationary on her face as she speaks; the background is out of focus, but shows trees and people in period clothing; the scene is captured in real-life footage.</details> | ![example6](./docs/_static/ltx-video_example_00006.gif)<br><details style="max-width: 300px; margin: auto;"><summary>A man in a dimly lit room talks on a vintage telephone...</summary>A man in a dimly lit room talks on a vintage telephone, hangs up, and looks down with a sad expression. He holds the black rotary phone to his right ear with his right hand, his left hand holding a rocks glass with amber liquid. He wears a brown suit jacket over a white shirt, and a gold ring on his left ring finger. His short hair is neatly combed, and he has light skin with visible wrinkles around his eyes. The camera remains stationary, focused on his face and upper body. The room is dark, lit only by a warm light source off-screen to the left, casting shadows on the wall behind him. The scene appears to be from a movie.</details> | ![example7](./docs/_static/ltx-video_example_00007.gif)<br><details style="max-width: 300px; margin: auto;"><summary>A prison guard unlocks and opens a cell door...</summary>A prison guard unlocks and opens a cell door to reveal a young man sitting at a table with a woman. The guard, wearing a dark blue uniform with a badge on his left chest, unlocks the cell door with a key held in his right hand and pulls it open; he has short brown hair, light skin, and a neutral expression. The young man, wearing a black and white striped shirt, sits at a table covered with a white tablecloth, facing the woman; he has short brown hair, light skin, and a neutral expression. The woman, wearing a dark blue shirt, sits opposite the young man, her face turned towards him; she has short blonde hair and light skin. The camera remains stationary, capturing the scene from a medium distance, positioned slightly to the right of the guard. The room is dimly lit, with a single light fixture illuminating the table and the two figures. The walls are made of large, grey concrete blocks, and a metal door is visible in the background. The scene is captured in real-life footage.</details> | ![example8](./docs/_static/ltx-video_example_00008.gif)<br><details style="max-width: 300px; margin: auto;"><summary>A woman with blood on her face and a white tank top...</summary>A woman with blood on her face and a white tank top looks down and to her right, then back up as she speaks. She has dark hair pulled back, light skin, and her face and chest are covered in blood. The camera angle is a close-up, focused on the woman's face and upper torso. The lighting is dim and blue-toned, creating a somber and intense atmosphere. The scene appears to be from a movie or TV show.</details> |
44
+ | ![example9](./docs/_static/ltx-video_example_00009.gif)<br><details style="max-width: 300px; margin: auto;"><summary>A man with graying hair, a beard, and a gray shirt...</summary>A man with graying hair, a beard, and a gray shirt looks down and to his right, then turns his head to the left. The camera angle is a close-up, focused on the man's face. The lighting is dim, with a greenish tint. The scene appears to be real-life footage.</details> | ![example10](./docs/_static/ltx-video_example_00010.gif)<br><details style="max-width: 300px; margin: auto;"><summary>A clear, turquoise river flows through a rocky canyon...</summary>A clear, turquoise river flows through a rocky canyon, cascading over a small waterfall and forming a pool of water at the bottom. The river is the main focus of the scene, with its clear water reflecting the surrounding trees and rocks. The canyon walls are steep and rocky, with some vegetation growing on them. The trees are mostly pine trees, with their green needles contrasting with the brown and gray rocks. The overall tone of the scene is one of peace and tranquility.</details> | ![example11](./docs/_static/ltx-video_example_00011.gif)<br><details style="max-width: 300px; margin: auto;"><summary>A man in a suit enters a room and speaks to two women...</summary>A man in a suit enters a room and speaks to two women sitting on a couch. The man, wearing a dark suit with a gold tie, enters the room from the left and walks towards the center of the frame. He has short gray hair, light skin, and a serious expression. He places his right hand on the back of a chair as he approaches the couch. Two women are seated on a light-colored couch in the background. The woman on the left wears a light blue sweater and has short blonde hair. The woman on the right wears a white sweater and has short blonde hair. The camera remains stationary, focusing on the man as he enters the room.
The room is brightly lit, with warm tones reflecting off the walls and furniture. The scene appears to be from a film or television show.</details> | ![example12](./docs/_static/ltx-video_example_00012.gif)<br><details style="max-width: 300px; margin: auto;"><summary>The waves crash against the jagged rocks of the shoreline...</summary>The waves crash against the jagged rocks of the shoreline, sending spray high into the air. The rocks are a dark gray color, with sharp edges and deep crevices. The water is a clear blue-green, with white foam where the waves break against the rocks. The sky is a light gray, with a few white clouds dotting the horizon.</details> |
45
+ | ![example13](./docs/_static/ltx-video_example_00013.gif)<br><details style="max-width: 300px; margin: auto;"><summary>The camera pans across a cityscape of tall buildings...</summary>The camera pans across a cityscape of tall buildings with a circular building in the center. The camera moves from left to right, showing the tops of the buildings and the circular building in the center. The buildings are various shades of gray and white, and the circular building has a green roof. The camera angle is high, looking down at the city. The lighting is bright, with the sun shining from the upper left, casting shadows from the buildings. The scene is computer-generated imagery.</details> | ![example14](./docs/_static/ltx-video_example_00014.gif)<br><details style="max-width: 300px; margin: auto;"><summary>A man walks towards a window, looks out, and then turns around...</summary>A man walks towards a window, looks out, and then turns around. He has short, dark hair, dark skin, and is wearing a brown coat over a red and gray scarf. He walks from left to right towards a window, his gaze fixed on something outside. The camera follows him from behind at a medium distance. The room is brightly lit, with white walls and a large window covered by a white curtain. As he approaches the window, he turns his head slightly to the left, then back to the right. He then turns his entire body to the right, facing the window. The camera remains stationary as he stands in front of the window. The scene is captured in real-life footage.</details> | ![example15](./docs/_static/ltx-video_example_00015.gif)<br><details style="max-width: 300px; margin: auto;"><summary>Two police officers in dark blue uniforms and matching hats...</summary>Two police officers in dark blue uniforms and matching hats enter a dimly lit room through a doorway on the left side of the frame. 
The first officer, with short brown hair and a mustache, steps inside first, followed by his partner, who has a shaved head and a goatee. Both officers have serious expressions and maintain a steady pace as they move deeper into the room. The camera remains stationary, capturing them from a slightly low angle as they enter. The room has exposed brick walls and a corrugated metal ceiling, with a barred window visible in the background. The lighting is low-key, casting shadows on the officers' faces and emphasizing the grim atmosphere. The scene appears to be from a film or television show.</details> | ![example16](./docs/_static/ltx-video_example_00016.gif)<br><details style="max-width: 300px; margin: auto;"><summary>A woman with short brown hair, wearing a maroon sleeveless top...</summary>A woman with short brown hair, wearing a maroon sleeveless top and a silver necklace, walks through a room while talking, then a woman with pink hair and a white shirt appears in the doorway and yells. The first woman walks from left to right, her expression serious; she has light skin and her eyebrows are slightly furrowed. The second woman stands in the doorway, her mouth open in a yell; she has light skin and her eyes are wide. The room is dimly lit, with a bookshelf visible in the background. The camera follows the first woman as she walks, then cuts to a close-up of the second woman's face. The scene is captured in real-life footage.</details> |
46
+
47
+ # News
48
+
49
+ ## March 5th, 2025: New checkpoint v0.9.5
50
+ - New license for commercial use ([OpenRail-M](https://huggingface.co/Lightricks/LTX-Video/ltx-video-2b-v0.9.5.license.txt))
51
+ - Release a new checkpoint v0.9.5 with improved quality
52
+ - Support keyframes and video extension
53
+ - Support higher resolutions
54
+ - Improved prompt understanding
55
+ - Improved VAE
56
+ - New online web app in [LTX-Studio](https://app.ltx.studio/ltx-video)
57
+ - Automatic prompt enhancement
58
+
59
+ ## February 20th, 2025: More inference options
60
+ - Improve STG (Spatiotemporal Guidance) for LTX-Video
61
+ - Support MPS on macOS with PyTorch 2.3.0
62
+ - Add support for 8-bit model, LTX-VideoQ8
63
+ - Add TeaCache for LTX-Video
64
+ - Add [ComfyUI-LTXTricks](#comfyui-integration)
65
+ - Add Diffusion-Pipe
66
+
67
+ ## December 31st, 2024: Research paper
68
+ - Release the [research paper](https://arxiv.org/abs/2501.00103)
69
+
70
+ ## December 20th, 2024: New checkpoint v0.9.1
71
+ - Release a new checkpoint v0.9.1 with improved quality
72
+ - Support for STG / PAG
73
+ - Support loading checkpoints of LTX-Video in Diffusers format (conversion is done on-the-fly)
74
+ - Support offloading unused parts to CPU
75
+ - Support the new timestep-conditioned VAE decoder
76
+ - Reference contributions from the community in the readme file
77
+ - Relax transformers dependency
78
+
79
+ ## November 21st, 2024: Initial release v0.9.0
80
+ - Initial release of LTX-Video
81
+ - Support text-to-video and image-to-video generation
82
+
83
+ # Quick Start Guide
84
+
85
+ ## Online inference
86
+ The model is accessible right away via the following links:
87
+ - [LTX-Studio image-to-video](https://app.ltx.studio/ltx-video)
88
+ - [Fal.ai text-to-video](https://fal.ai/models/fal-ai/ltx-video)
89
+ - [Fal.ai image-to-video](https://fal.ai/models/fal-ai/ltx-video/image-to-video)
90
+ - [Replicate text-to-video and image-to-video](https://replicate.com/lightricks/ltx-video)
91
+
92
+ ## Run locally
93
+
94
+ ### Installation
95
+ The codebase was tested with Python 3.10.5, CUDA version 12.2, and supports PyTorch >= 2.1.2.
96
+ On macOS, MPS was tested with PyTorch 2.3.0; PyTorch == 2.3 or >= 2.6 should also be supported.
97
+
98
+ ```bash
99
+ git clone https://github.com/Lightricks/LTX-Video.git
100
+ cd LTX-Video
101
+
102
+ # create env
103
+ python -m venv env
104
+ source env/bin/activate
105
+ python -m pip install -e .\[inference-script\]
106
+ ```
107
+
108
+ Then, download the model from [Hugging Face](https://huggingface.co/Lightricks/LTX-Video)
109
+
110
+ ```python
111
+ from huggingface_hub import hf_hub_download
112
+
113
+ model_dir = 'MODEL_DIR' # The local directory to save downloaded checkpoint
114
+ hf_hub_download(repo_id="Lightricks/LTX-Video", filename="ltx-video-2b-v0.9.5.safetensors", local_dir=model_dir, local_dir_use_symlinks=False, repo_type='model')
115
+ ```
116
+
117
+ ### Inference
118
+
119
+ To use our model, please follow the inference code in [inference.py](./inference.py):
120
+
121
+ #### For text-to-video generation:
122
+
123
+ ```bash
124
+ python inference.py --ckpt_path 'PATH' --prompt "PROMPT" --height HEIGHT --width WIDTH --num_frames NUM_FRAMES --seed SEED
125
+ ```
126
+
127
+ #### For image-to-video generation:
128
+
129
+ ```bash
130
+ python inference.py --ckpt_path 'PATH' --prompt "PROMPT" --conditioning_media_paths IMAGE_PATH --conditioning_start_frames 0 --height HEIGHT --width WIDTH --num_frames NUM_FRAMES --seed SEED
131
+ ```
132
+
133
+ #### Extending a video:
134
+
135
+ ๐Ÿ“ **Note:** Input video segments must contain a multiple of 8 frames plus 1 (e.g., 9, 17, 25, etc.), and the target frame number should be a multiple of 8.
136
+
137
+
138
+ ```bash
139
+ python inference.py --ckpt_path 'PATH' --prompt "PROMPT" --conditioning_media_paths VIDEO_PATH --conditioning_start_frames START_FRAME --height HEIGHT --width WIDTH --num_frames NUM_FRAMES --seed SEED
140
+ ```
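The frame-count rule above can be checked before invoking the script. A small sketch for convenience (these helper names are hypothetical and not part of `inference.py`):

```python
def is_valid_segment_length(n: int) -> bool:
    """Conditioning video segments must contain 8*k + 1 frames (9, 17, 25, ...)."""
    return n >= 1 and n % 8 == 1

def next_valid_segment_length(n: int) -> int:
    """Smallest valid segment frame count greater than or equal to n."""
    # Ceiling-divide (n - 1) by 8, then map back onto the 8*k + 1 grid.
    return n if is_valid_segment_length(n) else -(-(n - 1) // 8) * 8 + 1
```

For example, a 16-frame clip would need to be extended (or trimmed) to 17 frames before being used as a conditioning segment.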
141
+
142
+ #### For video generation with multiple conditions:
143
+
144
+ You can now generate a video conditioned on a set of images and/or short video segments.
145
+ Simply provide a list of paths to the images or video segments you want to condition on, along with their target frame numbers in the generated video. You can also specify the conditioning strength for each item (default: 1.0).
146
+
147
+ ```bash
148
+ python inference.py --ckpt_path 'PATH' --prompt "PROMPT" --conditioning_media_paths IMAGE_OR_VIDEO_PATH_1 IMAGE_OR_VIDEO_PATH_2 --conditioning_start_frames TARGET_FRAME_1 TARGET_FRAME_2 --height HEIGHT --width WIDTH --num_frames NUM_FRAMES --seed SEED
149
+ ```
150
+
151
+ ## ComfyUI Integration
152
+ To use our model with ComfyUI, please follow the instructions at [https://github.com/Lightricks/ComfyUI-LTXVideo/](https://github.com/Lightricks/ComfyUI-LTXVideo/).
153
+
154
+ ## Diffusers Integration
155
+ To use our model with the Diffusers Python library, check out the [official documentation](https://huggingface.co/docs/diffusers/main/en/api/pipelines/ltx_video).
156
+
157
+ Diffusers also supports an 8-bit version of LTX-Video; [see details below](#ltx-videoq8).
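For reference, a minimal text-to-video sketch using the Diffusers pipeline. This assumes a recent `diffusers` release that ships `LTXPipeline` and a CUDA GPU with sufficient memory; see the linked documentation for the authoritative API:

```python
import torch
from diffusers import LTXPipeline
from diffusers.utils import export_to_video

# Load the pipeline in bfloat16 and move it to the GPU.
pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16)
pipe.to("cuda")

frames = pipe(
    prompt="A clear, turquoise river flows through a rocky canyon...",
    width=704,               # divisible by 32
    height=480,              # divisible by 32
    num_frames=121,          # 8*k + 1
    num_inference_steps=40,
    guidance_scale=3.0,      # recommended range is 3-3.5
).frames[0]
export_to_video(frames, "output.mp4", fps=24)
```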
158
+
159
+ # Model User Guide
160
+
161
+ ## 📝 Prompt Engineering
162
+
163
+ When writing prompts, focus on detailed, chronological descriptions of actions and scenes. Include specific movements, appearances, camera angles, and environmental details - all in a single flowing paragraph. Start directly with the action, and keep descriptions literal and precise. Think like a cinematographer describing a shot list. Keep within 200 words. For best results, build your prompts using this structure:
164
+
165
+ * Start with main action in a single sentence
166
+ * Add specific details about movements and gestures
167
+ * Describe character/object appearances precisely
168
+ * Include background and environment details
169
+ * Specify camera angles and movements
170
+ * Describe lighting and colors
171
+ * Note any changes or sudden events
172
+ * See [examples](#introduction) for more inspiration.
173
+
174
+ ### Automatic Prompt Enhancement
175
+ When using `inference.py`, short prompts (shorter than `prompt_enhancement_words_threshold` words) are automatically enhanced by a language model. This is supported for text-to-video and image-to-video (first-frame conditioning).
176
+
177
+ When using `LTXVideoPipeline` directly, you can enable prompt enhancement by setting `enhance_prompt=True`.
178
+
179
+ ## 🎮 Parameter Guide
180
+
181
+ * Resolution Preset: Higher resolutions for detailed scenes, lower for faster generation and simpler scenes. The model works on resolutions that are divisible by 32 and frame counts of the form 8 * k + 1 (e.g. 257). If the requested resolution or frame count does not satisfy these constraints, the input is padded with -1 and then cropped to the desired resolution and number of frames. The model works best at resolutions up to 720 x 1280 and frame counts up to 257.
182
+ * Seed: Save seed values to recreate specific styles or compositions you like
183
+ * Guidance Scale: 3-3.5 are the recommended values
184
+ * Inference Steps: More steps (40+) for quality, fewer steps (20-30) for speed
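The divisibility constraints above can be expressed as a small helper that rounds a requested shape up to the nearest supported values. A sketch for convenience (`snap_dimensions` is hypothetical; the pipeline itself handles padding and cropping internally):

```python
def snap_dimensions(height: int, width: int, num_frames: int) -> tuple:
    """Round height/width up to a multiple of 32 and num_frames up to 8*k + 1."""
    snapped_height = -(-height // 32) * 32  # ceiling division to a multiple of 32
    snapped_width = -(-width // 32) * 32
    snapped_frames = -(-(num_frames - 1) // 8) * 8 + 1  # ceiling onto the 8*k + 1 grid
    return snapped_height, snapped_width, snapped_frames
```

For example, a request of 500 x 700 with 100 frames snaps to 512 x 704 with 105 frames.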
185
+
186
+ ๐Ÿ“ For advanced parameters usage, please see `python inference.py --help`
187
+
188
+ ## Community Contribution
189
+
190
+ ### ComfyUI-LTXTricks 🛠️
191
+
192
+ A community project providing additional nodes for enhanced control over the LTX Video model. It includes implementations of advanced techniques like RF-Inversion, RF-Edit, FlowEdit, and more. These nodes enable workflows such as Image and Video to Video (I+V2V), enhanced sampling via Spatiotemporal Skip Guidance (STG), and interpolation with precise frame settings.
193
+
194
+ - **Repository:** [ComfyUI-LTXTricks](https://github.com/logtd/ComfyUI-LTXTricks)
195
+ - **Features:**
196
+ - 🔄 **RF-Inversion:** Implements [RF-Inversion](https://rf-inversion.github.io/) with an [example workflow here](https://github.com/logtd/ComfyUI-LTXTricks/blob/main/example_workflows/example_ltx_inversion.json).
197
+ - ✂️ **RF-Edit:** Implements [RF-Solver-Edit](https://github.com/wangjiangshan0725/RF-Solver-Edit) with an [example workflow here](https://github.com/logtd/ComfyUI-LTXTricks/blob/main/example_workflows/example_ltx_rf_edit.json).
198
+ - 🌊 **FlowEdit:** Implements [FlowEdit](https://github.com/fallenshock/FlowEdit) with an [example workflow here](https://github.com/logtd/ComfyUI-LTXTricks/blob/main/example_workflows/example_ltx_flow_edit.json).
199
+ - 🎥 **I+V2V:** Enables Video to Video with a reference image. [Example workflow](https://github.com/logtd/ComfyUI-LTXTricks/blob/main/example_workflows/example_ltx_iv2v.json).
200
+ - ✨ **Enhance:** Partial implementation of [STGuidance](https://junhahyung.github.io/STGuidance/). [Example workflow](https://github.com/logtd/ComfyUI-LTXTricks/blob/main/example_workflows/example_ltxv_stg.json).
201
+ - 🖼️ **Interpolation and Frame Setting:** Nodes for precise control of latents per frame. [Example workflow](https://github.com/logtd/ComfyUI-LTXTricks/blob/main/example_workflows/example_ltx_interpolation.json).
202
+
203
+
204
+ ### LTX-VideoQ8 🎱 <a id="ltx-videoq8"></a>
205
+
206
+ **LTX-VideoQ8** is an 8-bit optimized version of [LTX-Video](https://github.com/Lightricks/LTX-Video), designed for faster performance on NVIDIA Ada GPUs.
207
+
208
+ - **Repository:** [LTX-VideoQ8](https://github.com/KONAKONA666/LTX-Video)
209
+ - **Features:**
210
+ - 🚀 Up to 3X speed-up with no accuracy loss
211
+ - 🎥 Generate 720x480x121 videos in under a minute on RTX 4060 (8GB VRAM)
212
+ - 🛠️ Fine-tune 2B transformer models with precalculated latents
213
+ - **Community Discussion:** [Reddit Thread](https://www.reddit.com/r/StableDiffusion/comments/1h79ks2/fast_ltx_video_on_rtx_4060_and_other_ada_gpus/)
214
+ - **Diffusers integration:** A diffusers integration for the 8-bit model is already out! [Details here](https://github.com/sayakpaul/q8-ltx-video)
215
+
216
+
217
+ ### TeaCache for LTX-Video 🍵 <a id="TeaCache"></a>
218
+
219
+ **TeaCache** is a training-free caching approach that leverages timestep differences across model outputs to accelerate LTX-Video inference by up to 2x without significant visual quality degradation.
220
+
221
+ - **Repository:** [TeaCache4LTX-Video](https://github.com/ali-vilab/TeaCache/tree/main/TeaCache4LTX-Video)
222
+ - **Features:**
223
+ - 🚀 Speeds up LTX-Video inference.
224
+ - 📊 Adjustable trade-offs between speed (up to 2x) and visual quality using configurable parameters.
225
+ - 🛠️ No retraining required: Works directly with existing models.
226
+
227
+ ### Your Contribution
228
+
229
+ ...is welcome! If you have a project or tool that integrates with LTX-Video,
230
+ please let us know by opening an issue or pull request.
231
+
232
+ # Training
233
+
234
+ ## Diffusers
235
+
236
+ Diffusers implemented [LoRA support](https://github.com/huggingface/diffusers/pull/10228),
237
+ with a training script for fine-tuning.
238
+ More information and a training script are available in
239
+ [finetrainers](https://github.com/a-r-r-o-w/finetrainers?tab=readme-ov-file#training).
240
+
241
+ ## Diffusion-Pipe
242
+
243
+ An experimental training framework with pipeline parallelism, enabling fine-tuning of large models like **LTX-Video** across multiple GPUs.
244
+
245
+ - **Repository:** [Diffusion-Pipe](https://github.com/tdrussell/diffusion-pipe)
246
+ - **Features:**
247
+ - ๐Ÿ› ๏ธ Full fine-tune support for LTX-Video using LoRA
248
+ - ๐Ÿ“Š Useful metrics logged to Tensorboard
249
+ - ๐Ÿ”„ Training state checkpointing and resumption
250
+ - โšก Efficient pre-caching of latents and text embeddings for multi-GPU setups
251
+
252
+
253
+ # Join Us 🚀
254
+
255
+ Want to work on cutting-edge AI research and make a real impact on millions of users worldwide?
256
+
257
+ At **Lightricks**, an AI-first company, we're revolutionizing how visual content is created.
258
+
259
+ If you are passionate about AI, computer vision, and video generation, we would love to hear from you!
260
+
261
+ Please visit our [careers page](https://careers.lightricks.com/careers?query=&office=all&department=R%26D) for more information.
262
+
263
+ # Acknowledgement
264
+
265
+ We are grateful to the following awesome projects, which we drew on when implementing LTX-Video:
266
+ * [DiT](https://github.com/facebookresearch/DiT) and [PixArt-alpha](https://github.com/PixArt-alpha/PixArt-alpha): vision transformers for image generation.
267
+
268
+
269
+ ## Citation
270
+
271
+ 📄 Our tech report is out! If you find our work helpful, please ⭐️ star the repository and cite our paper.
272
+
273
+ ```
274
+ @article{HaCohen2024LTXVideo,
275
+ title={LTX-Video: Realtime Video Latent Diffusion},
276
+ author={HaCohen, Yoav and Chiprut, Nisan and Brazowski, Benny and Shalem, Daniel and Moshe, Dudu and Richardson, Eitan and Levin, Eran and Shiran, Guy and Zabari, Nir and Gordon, Ori and Panet, Poriya and Weissbuch, Sapir and Kulikov, Victor and Bitterman, Yaki and Melumian, Zeev and Bibi, Ofir},
277
+ journal={arXiv preprint arXiv:2501.00103},
278
+ year={2024}
279
+ }
280
+ ```
LTX-Video/__init__.py ADDED
File without changes
LTX-Video/docs/_static/ltx-video_example_00001.gif ADDED

Git LFS Details

  • SHA256: b679f14a09d2321b7e34b3ecd23bc01c2cfa75c8d4214a1e59af09826003e2ec
  • Pointer size: 132 Bytes
  • Size of remote file: 7.96 MB
LTX-Video/docs/_static/ltx-video_example_00002.gif ADDED

Git LFS Details

  • SHA256: 336f4baec79c1bd754c7c1bf3ac0792910cc85b6a3bde15fabeb0fb0f33299ff
  • Pointer size: 132 Bytes
  • Size of remote file: 7.9 MB
LTX-Video/docs/_static/ltx-video_example_00003.gif ADDED

Git LFS Details

  • SHA256: ab2cb063b872d487fbbab821de7fe8157e7f87af03bd780d55116cb98fc8fc45
  • Pointer size: 132 Bytes
  • Size of remote file: 4.43 MB
LTX-Video/docs/_static/ltx-video_example_00004.gif ADDED

Git LFS Details

  • SHA256: 0a599a641cc3367fab5a6dd75fc89be63208cc708a1173b2ce7bfeac7208f831
  • Pointer size: 132 Bytes
  • Size of remote file: 6.71 MB
LTX-Video/docs/_static/ltx-video_example_00005.gif ADDED

Git LFS Details

  • SHA256: 87fdb9556c1218db4b929994e9b807d1d63f4676defef5b418a4edb1ddaa8422
  • Pointer size: 132 Bytes
  • Size of remote file: 5.73 MB
LTX-Video/docs/_static/ltx-video_example_00006.gif ADDED

Git LFS Details

  • SHA256: f56f3dcc84a871ab4ef1510120f7a4586c7044c5609a897d8177ae8d52eb3eae
  • Pointer size: 132 Bytes
  • Size of remote file: 4.24 MB
LTX-Video/docs/_static/ltx-video_example_00007.gif ADDED

Git LFS Details

  • SHA256: a08a06681334856db516e969a9ae4290acfd7550f7b970331e87d0223e282bcc
  • Pointer size: 132 Bytes
  • Size of remote file: 7.83 MB
LTX-Video/docs/_static/ltx-video_example_00008.gif ADDED

Git LFS Details

  • SHA256: 3242c65e11a40177c91b48d8ee18084dc4f907ffe5f11217c5f3e5aa2ca3fe36
  • Pointer size: 132 Bytes
  • Size of remote file: 6.23 MB
LTX-Video/docs/_static/ltx-video_example_00009.gif ADDED

Git LFS Details

  • SHA256: aa1e0a2ba75c6bda530a798e8aaeb3edc19413970b99d2a67b79839cd14f2fe5
  • Pointer size: 132 Bytes
  • Size of remote file: 6.39 MB
LTX-Video/docs/_static/ltx-video_example_00010.gif ADDED

Git LFS Details

  • SHA256: bcf1e084e936a75eaae73a29f60935c469b1fc34eb3f5ad89483e88b3a2eaffe
  • Pointer size: 132 Bytes
  • Size of remote file: 6.19 MB
LTX-Video/docs/_static/ltx-video_example_00011.gif ADDED

Git LFS Details

  • SHA256: 3e3d04f5763ecb416b3b80c3488e48c49991d80661c94e8f08dddd7b890b1b75
  • Pointer size: 132 Bytes
  • Size of remote file: 5.35 MB
LTX-Video/docs/_static/ltx-video_example_00012.gif ADDED

Git LFS Details

  • SHA256: 39790832fd9bff62c99a799eb4843cf99c9ab73c3f181656acbbd0d4ebf7f471
  • Pointer size: 132 Bytes
  • Size of remote file: 7.47 MB
LTX-Video/docs/_static/ltx-video_example_00013.gif ADDED

Git LFS Details

  • SHA256: aa7eb790b43f8a55c01d1fbed4c7a7f657fb2ca78a9685833cf9cb558d2002c1
  • Pointer size: 132 Bytes
  • Size of remote file: 9.02 MB
LTX-Video/docs/_static/ltx-video_example_00014.gif ADDED

Git LFS Details

  • SHA256: 4f7afc4b498a927dcc4e1492548db5c32fa76d117e0410d11e1e0b1929153e54
  • Pointer size: 132 Bytes
  • Size of remote file: 7.43 MB
LTX-Video/docs/_static/ltx-video_example_00015.gif ADDED

Git LFS Details

  • SHA256: d897c9656e0cba89512ab9d2cbe2d2c0f2ddf907dcab5f7eadab4b96b1cb1efe
  • Pointer size: 132 Bytes
  • Size of remote file: 6.56 MB
LTX-Video/docs/_static/ltx-video_example_00016.gif ADDED

Git LFS Details

  • SHA256: c74f35e37bba01817ca4ac01dd9195863100eb83e7cb73bbea2b53e0f69a8628
  • Pointer size: 132 Bytes
  • Size of remote file: 7.41 MB
LTX-Video/file_list.txt ADDED
@@ -0,0 +1,46 @@
1
+ https://huggingface.co/Isi99999/LTX-Video/resolve/main/.gitattributes
2
+ out=MODEL_DIR/.gitattributes
3
+ https://huggingface.co/Isi99999/LTX-Video/resolve/main/README.md
4
+ out=MODEL_DIR/README.md
5
+ https://huggingface.co/Isi99999/LTX-Video/resolve/main/ltx-video-2b-v0.9.5.safetensors
6
+ out=MODEL_DIR/ltx-video-2b-v0.9.5.safetensors
7
+ https://huggingface.co/Isi99999/LTX-Video/resolve/main/model_index.json
8
+ out=MODEL_DIR/model_index.json
9
+ https://huggingface.co/Isi99999/LTX-Video/resolve/main/scheduler/scheduler_config.json
10
+ out=MODEL_DIR/scheduler/scheduler_config.json
11
+ https://huggingface.co/Isi99999/LTX-Video/resolve/main/t5xxl_fp16.safetensors
12
+ out=MODEL_DIR/t5xxl_fp16.safetensors
13
+ https://huggingface.co/Isi99999/LTX-Video/resolve/main/t5xxl_fp8_e4m3fn_scaled.safetensors
14
+ out=MODEL_DIR/t5xxl_fp8_e4m3fn_scaled.safetensors
15
+ https://huggingface.co/Isi99999/LTX-Video/resolve/main/text_encoder/config.json
16
+ out=MODEL_DIR/text_encoder/config.json
17
+ https://huggingface.co/Isi99999/LTX-Video/resolve/main/text_encoder/model-00001-of-00004.safetensors
18
+ out=MODEL_DIR/text_encoder/model-00001-of-00004.safetensors
19
+ https://huggingface.co/Isi99999/LTX-Video/resolve/main/text_encoder/model-00002-of-00004.safetensors
20
+ out=MODEL_DIR/text_encoder/model-00002-of-00004.safetensors
21
+ https://huggingface.co/Isi99999/LTX-Video/resolve/main/text_encoder/model-00003-of-00004.safetensors
22
+ out=MODEL_DIR/text_encoder/model-00003-of-00004.safetensors
23
+ https://huggingface.co/Isi99999/LTX-Video/resolve/main/text_encoder/model-00004-of-00004.safetensors
24
+ out=MODEL_DIR/text_encoder/model-00004-of-00004.safetensors
25
+ https://huggingface.co/Isi99999/LTX-Video/resolve/main/text_encoder/model.safetensors.index.json
26
+ out=MODEL_DIR/text_encoder/model.safetensors.index.json
27
+ https://huggingface.co/Isi99999/LTX-Video/resolve/main/tokenizer/added_tokens.json
28
+ out=MODEL_DIR/tokenizer/added_tokens.json
29
+ https://huggingface.co/Isi99999/LTX-Video/resolve/main/tokenizer/special_tokens_map.json
30
+ out=MODEL_DIR/tokenizer/special_tokens_map.json
31
+ https://huggingface.co/Isi99999/LTX-Video/resolve/main/tokenizer/spiece.model
32
+ out=MODEL_DIR/tokenizer/spiece.model
33
+ https://huggingface.co/Isi99999/LTX-Video/resolve/main/tokenizer/tokenizer_config.json
34
+ out=MODEL_DIR/tokenizer/tokenizer_config.json
35
+ https://huggingface.co/Isi99999/LTX-Video/resolve/main/transformer/config.json
36
+ out=MODEL_DIR/transformer/config.json
37
+ https://huggingface.co/Isi99999/LTX-Video/resolve/main/transformer/diffusion_pytorch_model-00001-of-00002.safetensors
38
+ out=MODEL_DIR/transformer/diffusion_pytorch_model-00001-of-00002.safetensors
39
+ https://huggingface.co/Isi99999/LTX-Video/resolve/main/transformer/diffusion_pytorch_model-00002-of-00002.safetensors
40
+ out=MODEL_DIR/transformer/diffusion_pytorch_model-00002-of-00002.safetensors
41
+ https://huggingface.co/Isi99999/LTX-Video/resolve/main/transformer/diffusion_pytorch_model.safetensors.index.json
42
+ out=MODEL_DIR/transformer/diffusion_pytorch_model.safetensors.index.json
43
+ https://huggingface.co/Isi99999/LTX-Video/resolve/main/vae/config.json
44
+ out=MODEL_DIR/vae/config.json
45
+ https://huggingface.co/Isi99999/LTX-Video/resolve/main/vae/diffusion_pytorch_model.safetensors
46
+ out=MODEL_DIR/vae/diffusion_pytorch_model.safetensors
LTX-Video/inference.py ADDED
@@ -0,0 +1,758 @@
import argparse
import os
import random
from datetime import datetime
from pathlib import Path
from typing import List, Optional, Union

import imageio
import numpy as np
import torch
from diffusers.utils import logging
from PIL import Image
from transformers import (
    T5EncoderModel,
    T5Tokenizer,
    AutoModelForCausalLM,
    AutoProcessor,
    AutoTokenizer,
)

from ltx_video.models.autoencoders.causal_video_autoencoder import (
    CausalVideoAutoencoder,
)
from ltx_video.models.transformers.symmetric_patchifier import SymmetricPatchifier
from ltx_video.models.transformers.transformer3d import Transformer3DModel
from ltx_video.pipelines.pipeline_ltx_video import ConditioningItem, LTXVideoPipeline
from ltx_video.schedulers.rf import RectifiedFlowScheduler
from ltx_video.utils.skip_layer_strategy import SkipLayerStrategy

MAX_HEIGHT = 720
MAX_WIDTH = 1280
MAX_NUM_FRAMES = 257

logger = logging.get_logger("LTX-Video")


def get_total_gpu_memory():
    if torch.cuda.is_available():
        total_memory = torch.cuda.get_device_properties(0).total_memory / (1024**3)
        return total_memory
    return 0


def get_device():
    if torch.cuda.is_available():
        return "cuda"
    elif torch.backends.mps.is_available():
        return "mps"
    return "cpu"


def load_image_to_tensor_with_resize_and_crop(
    image_input: Union[str, Image.Image],
    target_height: int = 512,
    target_width: int = 768,
) -> torch.Tensor:
    """Load and process an image into a tensor.

    Args:
        image_input: Either a file path (str) or a PIL Image object.
        target_height: Desired height of the output tensor.
        target_width: Desired width of the output tensor.
    """
    if isinstance(image_input, str):
        image = Image.open(image_input).convert("RGB")
    elif isinstance(image_input, Image.Image):
        image = image_input
    else:
        raise ValueError("image_input must be either a file path or a PIL Image object")

    # Center-crop to the target aspect ratio, then resize
    input_width, input_height = image.size
    aspect_ratio_target = target_width / target_height
    aspect_ratio_frame = input_width / input_height
    if aspect_ratio_frame > aspect_ratio_target:
        new_width = int(input_height * aspect_ratio_target)
        new_height = input_height
        x_start = (input_width - new_width) // 2
        y_start = 0
    else:
        new_width = input_width
        new_height = int(input_width / aspect_ratio_target)
        x_start = 0
        y_start = (input_height - new_height) // 2

    image = image.crop((x_start, y_start, x_start + new_width, y_start + new_height))
    image = image.resize((target_width, target_height))
    frame_tensor = torch.tensor(np.array(image)).permute(2, 0, 1).float()
    # Normalize pixel values from [0, 255] to [-1, 1]
    frame_tensor = (frame_tensor / 127.5) - 1.0
    # Create 5D tensor: (batch_size=1, channels=3, num_frames=1, height, width)
    return frame_tensor.unsqueeze(0).unsqueeze(2)


def calculate_padding(
    source_height: int, source_width: int, target_height: int, target_width: int
) -> tuple[int, int, int, int]:
    # Calculate total padding needed
    pad_height = target_height - source_height
    pad_width = target_width - source_width

    # Calculate padding for each side
    pad_top = pad_height // 2
    pad_bottom = pad_height - pad_top  # Handles odd padding
    pad_left = pad_width // 2
    pad_right = pad_width - pad_left  # Handles odd padding

    # Padding format is (left, right, top, bottom)
    padding = (pad_left, pad_right, pad_top, pad_bottom)
    return padding


def convert_prompt_to_filename(text: str, max_len: int = 20) -> str:
    # Remove non-letters and convert to lowercase
    clean_text = "".join(
        char.lower() for char in text if char.isalpha() or char.isspace()
    )

    # Split into words
    words = clean_text.split()

    # Build the result, stopping once adding the next word would exceed max_len
    result = []
    current_length = 0

    for word in words:
        new_length = current_length + len(word)

        if new_length <= max_len:
            result.append(word)
            current_length += len(word)
        else:
            break

    return "-".join(result)


# Generate a unique output filename
def get_unique_filename(
    base: str,
    ext: str,
    prompt: str,
    seed: int,
    resolution: tuple[int, int, int],
    dir: Path,
    endswith=None,
    index_range=1000,
) -> Path:
    base_filename = f"{base}_{convert_prompt_to_filename(prompt, max_len=30)}_{seed}_{resolution[0]}x{resolution[1]}x{resolution[2]}"
    for i in range(index_range):
        filename = dir / f"{base_filename}_{i}{endswith if endswith else ''}{ext}"
        if not os.path.exists(filename):
            return filename
    raise FileExistsError(
        f"Could not find a unique filename after {index_range} attempts."
    )


def seed_everything(seed: int):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed(seed)
    if torch.backends.mps.is_available():
        torch.mps.manual_seed(seed)


def main():
    parser = argparse.ArgumentParser(
        description="Load models from separate directories and run the pipeline."
    )

    # Directories
    parser.add_argument(
        "--ckpt_path",
        type=str,
        required=True,
        help="Path to a safetensors file that contains all model parts.",
    )
    parser.add_argument(
        "--output_path",
        type=str,
        default=None,
        help="Path to the folder to save the output video. If None, saves in the outputs/ directory.",
    )
    parser.add_argument("--seed", type=int, default=171198)

    # Pipeline parameters
    parser.add_argument(
        "--num_inference_steps", type=int, default=40, help="Number of inference steps"
    )
    parser.add_argument(
        "--num_images_per_prompt",
        type=int,
        default=1,
        help="Number of images per prompt",
    )
    parser.add_argument(
        "--guidance_scale",
        type=float,
        default=3,
        help="Guidance scale.",
    )
    parser.add_argument(
        "--stg_scale",
        type=float,
        default=1,
        help="Spatiotemporal guidance scale. 0 to disable STG.",
    )
    parser.add_argument(
        "--stg_rescale",
        type=float,
        default=0.7,
        help="Spatiotemporal guidance rescaling scale. 1 to disable rescale.",
    )
    parser.add_argument(
        "--stg_mode",
        type=str,
        default="attention_values",
        help="Spatiotemporal guidance mode. "
        "One of 'attention_values' (default), 'attention_skip', 'residual', or 'transformer_block'.",
    )
    parser.add_argument(
        "--stg_skip_layers",
        type=str,
        default="19",
        help="Layers to block for spatiotemporal guidance. Comma-separated list of integers.",
    )
    parser.add_argument(
        "--image_cond_noise_scale",
        type=float,
        default=0.15,
        help="Amount of noise to add to the conditioned image",
    )
    parser.add_argument(
        "--height",
        type=int,
        default=480,
        help="Height of the output video frames. Optional if an input image is provided.",
    )
    parser.add_argument(
        "--width",
        type=int,
        default=704,
        help="Width of the output video frames. If None, inferred from the input image.",
    )
    parser.add_argument(
        "--num_frames",
        type=int,
        default=121,
        help="Number of frames to generate in the output video",
    )
    parser.add_argument(
        "--frame_rate", type=int, default=25, help="Frame rate for the output video"
    )
    parser.add_argument(
        "--device",
        default=None,
        help="Device to run inference on. If not specified, automatically uses CUDA or MPS if available, else CPU.",
    )
    parser.add_argument(
        "--precision",
        choices=["bfloat16", "mixed_precision"],
        default="bfloat16",
        help="Precision for the transformer. Default is bfloat16; 'mixed_precision' enables mixed-precision inference.",
    )

    # VAE noise augmentation
    parser.add_argument(
        "--decode_timestep",
        type=float,
        default=0.025,
        help="Timestep for decoding noise",
    )
    parser.add_argument(
        "--decode_noise_scale",
        type=float,
        default=0.0125,
        help="Noise level for decoding noise",
    )

    # Prompts
    parser.add_argument(
        "--prompt",
        type=str,
        help="Text prompt to guide generation",
    )
    parser.add_argument(
        "--negative_prompt",
        type=str,
        default="worst quality, inconsistent motion, blurry, jittery, distorted",
        help="Negative prompt for undesired features",
    )

    parser.add_argument(
        "--low_vram",
        action="store_true",
        help="Reduce VRAM usage by keeping the text encoder and pipeline on the CPU.",
    )

    parser.add_argument(
        "--offload_to_cpu",
        action="store_true",
        help="Offload unnecessary computations to the CPU.",
    )

    parser.add_argument(
        "--text_encoder_model_name_or_path",
        type=str,
        default="PixArt-alpha/PixArt-XL-2-1024-MS",
        help="Local path or model identifier for both the tokenizer and text encoder. Defaults to a pretrained model on Hugging Face.",
    )

    # Conditioning arguments
    parser.add_argument(
        "--conditioning_media_paths",
        type=str,
        nargs="*",
        help="List of paths to conditioning media (images or videos). Each path will be used as a conditioning item.",
    )
    parser.add_argument(
        "--conditioning_strengths",
        type=float,
        nargs="*",
        help="List of conditioning strengths (between 0 and 1) for each conditioning item. Must match the number of conditioning items.",
    )
    parser.add_argument(
        "--conditioning_start_frames",
        type=int,
        nargs="*",
        help="List of frame indices where each conditioning item should be applied. Must match the number of conditioning items.",
    )
    parser.add_argument(
        "--sampler",
        type=str,
        choices=["uniform", "linear-quadratic"],
        default=None,
        help="Sampler to use for noise scheduling. Either 'uniform' or 'linear-quadratic'. If not specified, uses the sampler from the checkpoint.",
    )

    # Prompt enhancement
    parser.add_argument(
        "--prompt_enhancement_words_threshold",
        type=int,
        default=50,
        help="Enable prompt enhancement only if the input prompt has fewer words than this threshold. Set to 0 to disable enhancement completely.",
    )
    parser.add_argument(
        "--prompt_enhancer_image_caption_model_name_or_path",
        type=str,
        default="MiaoshouAI/Florence-2-large-PromptGen-v2.0",
        help="Path to the image caption model",
    )
    parser.add_argument(
        "--prompt_enhancer_llm_model_name_or_path",
        type=str,
        default="unsloth/Llama-3.2-3B-Instruct",
        help="Path to the LLM model. Default is Llama-3.2-3B-Instruct, but other Hugging Face models such as Llama-3.1-8B-Instruct can be used.",
    )

    args = parser.parse_args()
    logger.warning(f"Running generation with arguments: {args}")
    infer(**vars(args))


def create_ltx_video_pipeline(
    ckpt_path: str,
    precision: str,
    text_encoder_model_name_or_path: str,
    sampler: Optional[str] = None,
    device: Optional[str] = None,
    low_vram: bool = False,
    enhance_prompt: bool = False,
    prompt_enhancer_image_caption_model_name_or_path: Optional[str] = None,
    prompt_enhancer_llm_model_name_or_path: Optional[str] = None,
) -> LTXVideoPipeline:
    ckpt_path = Path(ckpt_path)
    assert os.path.exists(
        ckpt_path
    ), f"Ckpt path provided (--ckpt_path) {ckpt_path} does not exist"
    vae = CausalVideoAutoencoder.from_pretrained(ckpt_path)
    transformer = Transformer3DModel.from_pretrained(ckpt_path)

    # Use the constructor if a sampler is specified, otherwise use from_pretrained
    if sampler:
        scheduler = RectifiedFlowScheduler(
            sampler=("Uniform" if sampler.lower() == "uniform" else "LinearQuadratic")
        )
    else:
        scheduler = RectifiedFlowScheduler.from_pretrained(ckpt_path)

    text_encoder = T5EncoderModel.from_pretrained(
        text_encoder_model_name_or_path, subfolder="text_encoder"
    )
    patchifier = SymmetricPatchifier(patch_size=1)
    tokenizer = T5Tokenizer.from_pretrained(
        text_encoder_model_name_or_path, subfolder="tokenizer"
    )

    # In low-VRAM mode (or without CUDA), keep the text encoder on the CPU in bfloat16
    if torch.cuda.is_available() and not low_vram:
        text_encoder = text_encoder.to(device)
    else:
        text_encoder = text_encoder.to(dtype=torch.bfloat16, device="cpu")

    transformer = transformer.to(device)
    vae = vae.to(device)

    if enhance_prompt:
        prompt_enhancer_image_caption_model = AutoModelForCausalLM.from_pretrained(
            prompt_enhancer_image_caption_model_name_or_path, trust_remote_code=True
        )
        prompt_enhancer_image_caption_processor = AutoProcessor.from_pretrained(
            prompt_enhancer_image_caption_model_name_or_path, trust_remote_code=True
        )
        prompt_enhancer_llm_model = AutoModelForCausalLM.from_pretrained(
            prompt_enhancer_llm_model_name_or_path,
            torch_dtype="bfloat16",
        )
        prompt_enhancer_llm_tokenizer = AutoTokenizer.from_pretrained(
            prompt_enhancer_llm_model_name_or_path,
        )
    else:
        prompt_enhancer_image_caption_model = None
        prompt_enhancer_image_caption_processor = None
        prompt_enhancer_llm_model = None
        prompt_enhancer_llm_tokenizer = None

    vae = vae.to(torch.bfloat16)
    if precision == "bfloat16" and transformer.dtype != torch.bfloat16:
        transformer = transformer.to(torch.bfloat16)

    # Use submodels for the pipeline
    submodel_dict = {
        "transformer": transformer,
        "patchifier": patchifier,
        "text_encoder": text_encoder,
        "tokenizer": tokenizer,
        "scheduler": scheduler,
        "vae": vae,
        "prompt_enhancer_image_caption_model": prompt_enhancer_image_caption_model,
        "prompt_enhancer_image_caption_processor": prompt_enhancer_image_caption_processor,
        "prompt_enhancer_llm_model": prompt_enhancer_llm_model,
        "prompt_enhancer_llm_tokenizer": prompt_enhancer_llm_tokenizer,
    }

    pipeline = LTXVideoPipeline(**submodel_dict)
    if torch.cuda.is_available() and not low_vram:
        pipeline = pipeline.to("cuda")
    return pipeline


def infer(
    ckpt_path: str,
    output_path: Optional[str],
    seed: int,
    num_inference_steps: int,
    num_images_per_prompt: int,
    guidance_scale: float,
    stg_scale: float,
    stg_rescale: float,
    stg_mode: str,
    stg_skip_layers: str,
    image_cond_noise_scale: float,
    height: Optional[int],
    width: Optional[int],
    num_frames: int,
    frame_rate: int,
    precision: str,
    decode_timestep: float,
    decode_noise_scale: float,
    prompt: str,
    negative_prompt: str,
    low_vram: bool,
    offload_to_cpu: bool,
    text_encoder_model_name_or_path: str,
    conditioning_media_paths: Optional[List[str]] = None,
    conditioning_strengths: Optional[List[float]] = None,
    conditioning_start_frames: Optional[List[int]] = None,
    sampler: Optional[str] = None,
    device: Optional[str] = None,
    prompt_enhancement_words_threshold: int = 50,
    prompt_enhancer_image_caption_model_name_or_path: str = "MiaoshouAI/Florence-2-large-PromptGen-v2.0",
    prompt_enhancer_llm_model_name_or_path: str = "unsloth/Llama-3.2-3B-Instruct",
    **kwargs,
):
    # Backwards compatibility for the deprecated input_image_path argument
    if kwargs.get("input_image_path", None):
        logger.warning(
            "Please use conditioning_media_paths instead of input_image_path."
        )
        assert not conditioning_media_paths and not conditioning_start_frames
        conditioning_media_paths = [kwargs["input_image_path"]]
        conditioning_start_frames = [0]

    # Validate conditioning arguments
    if conditioning_media_paths:
        # Use default strengths of 1.0
        if not conditioning_strengths:
            conditioning_strengths = [1.0] * len(conditioning_media_paths)
        if not conditioning_start_frames:
            raise ValueError(
                "If `conditioning_media_paths` is provided, "
                "`conditioning_start_frames` must also be provided"
            )
        if len(conditioning_media_paths) != len(conditioning_strengths) or len(
            conditioning_media_paths
        ) != len(conditioning_start_frames):
            raise ValueError(
                "`conditioning_media_paths`, `conditioning_strengths`, "
                "and `conditioning_start_frames` must have the same length"
            )
        if any(s < 0 or s > 1 for s in conditioning_strengths):
            raise ValueError("All conditioning strengths must be between 0 and 1")
        if any(f < 0 or f >= num_frames for f in conditioning_start_frames):
            raise ValueError(
                f"All conditioning start frames must be between 0 and {num_frames-1}"
            )

    seed_everything(seed)
    if offload_to_cpu and not torch.cuda.is_available():
        logger.warning(
            "offload_to_cpu is set to True, but offloading will not occur since the model is already running on CPU."
        )
        offload_to_cpu = False
    else:
        offload_to_cpu = offload_to_cpu and get_total_gpu_memory() < 30

    output_dir = (
        Path(output_path)
        if output_path
        else Path(f"outputs/{datetime.today().strftime('%Y-%m-%d')}")
    )
    output_dir.mkdir(parents=True, exist_ok=True)

    # Adjust dimensions to be divisible by 32 and num_frames to be (N * 8 + 1)
    height_padded = ((height - 1) // 32 + 1) * 32
    width_padded = ((width - 1) // 32 + 1) * 32
    num_frames_padded = ((num_frames - 2) // 8 + 1) * 8 + 1

    padding = calculate_padding(height, width, height_padded, width_padded)

    logger.warning(
        f"Padded dimensions: {height_padded}x{width_padded}x{num_frames_padded}"
    )

    prompt_word_count = len(prompt.split())
    enhance_prompt = (
        prompt_enhancement_words_threshold > 0
        and prompt_word_count < prompt_enhancement_words_threshold
    )

    if prompt_enhancement_words_threshold > 0 and not enhance_prompt:
        logger.info(
            f"Prompt has {prompt_word_count} words, which exceeds the threshold of {prompt_enhancement_words_threshold}. Prompt enhancement disabled."
        )

    device = device or get_device()
    pipeline = create_ltx_video_pipeline(
        ckpt_path=ckpt_path,
        precision=precision,
        text_encoder_model_name_or_path=text_encoder_model_name_or_path,
        sampler=sampler,
        device=device,
        low_vram=low_vram,
        enhance_prompt=enhance_prompt,
        prompt_enhancer_image_caption_model_name_or_path=prompt_enhancer_image_caption_model_name_or_path,
        prompt_enhancer_llm_model_name_or_path=prompt_enhancer_llm_model_name_or_path,
    )

    conditioning_items = (
        prepare_conditioning(
            conditioning_media_paths=conditioning_media_paths,
            conditioning_strengths=conditioning_strengths,
            conditioning_start_frames=conditioning_start_frames,
            height=height,
            width=width,
            num_frames=num_frames,
            padding=padding,
            pipeline=pipeline,
        )
        if conditioning_media_paths
        else None
    )

    # Set spatiotemporal guidance
    skip_block_list = [int(x.strip()) for x in stg_skip_layers.split(",")]
    if stg_mode.lower() == "stg_av" or stg_mode.lower() == "attention_values":
        skip_layer_strategy = SkipLayerStrategy.AttentionValues
    elif stg_mode.lower() == "stg_as" or stg_mode.lower() == "attention_skip":
        skip_layer_strategy = SkipLayerStrategy.AttentionSkip
    elif stg_mode.lower() == "stg_r" or stg_mode.lower() == "residual":
        skip_layer_strategy = SkipLayerStrategy.Residual
    elif stg_mode.lower() == "stg_t" or stg_mode.lower() == "transformer_block":
        skip_layer_strategy = SkipLayerStrategy.TransformerBlock
    else:
        raise ValueError(f"Invalid spatiotemporal guidance mode: {stg_mode}")

    # Prepare input for the pipeline
    sample = {
        "prompt": prompt,
        "prompt_attention_mask": None,
        "negative_prompt": negative_prompt,
        "negative_prompt_attention_mask": None,
    }

    generator = torch.Generator(device=device).manual_seed(seed)

    images = pipeline(
        num_inference_steps=num_inference_steps,
        num_images_per_prompt=num_images_per_prompt,
        guidance_scale=guidance_scale,
        skip_layer_strategy=skip_layer_strategy,
        skip_block_list=skip_block_list,
        stg_scale=stg_scale,
        do_rescaling=stg_rescale != 1,
        rescaling_scale=stg_rescale,
        generator=generator,
        output_type="pt",
        callback_on_step_end=None,
        height=height_padded,
        width=width_padded,
        num_frames=num_frames_padded,
        frame_rate=frame_rate,
        **sample,
        conditioning_items=conditioning_items,
        is_video=True,
        vae_per_channel_normalize=True,
        image_cond_noise_scale=image_cond_noise_scale,
        decode_timestep=decode_timestep,
        decode_noise_scale=decode_noise_scale,
        mixed_precision=(precision == "mixed_precision"),
        offload_to_cpu=offload_to_cpu,
        device=device,
        enhance_prompt=enhance_prompt,
    ).images

    # Crop the padded images to the desired resolution and number of frames
    (pad_left, pad_right, pad_top, pad_bottom) = padding
    pad_bottom = -pad_bottom
    pad_right = -pad_right
    if pad_bottom == 0:
        pad_bottom = images.shape[3]
    if pad_right == 0:
        pad_right = images.shape[4]
    images = images[:, :, :num_frames, pad_top:pad_bottom, pad_left:pad_right]

    for i in range(images.shape[0]):
        # Take sample i from (B, C, F, H, W) and permute to (F, H, W, C)
        video_np = images[i].permute(1, 2, 3, 0).cpu().float().numpy()
        # Unnormalize images to the [0, 255] range
        video_np = (video_np * 255).astype(np.uint8)
        fps = frame_rate
        height, width = video_np.shape[1:3]
        # In case a single image is generated
        if video_np.shape[0] == 1:
            output_filename = get_unique_filename(
                f"image_output_{i}",
                ".png",
                prompt=prompt,
                seed=seed,
                resolution=(height, width, num_frames),
                dir=output_dir,
            )
            imageio.imwrite(output_filename, video_np[0])
        else:
            output_filename = get_unique_filename(
                f"video_output_{i}",
                ".mp4",
                prompt=prompt,
                seed=seed,
                resolution=(height, width, num_frames),
                dir=output_dir,
            )

            # Write video
            with imageio.get_writer(output_filename, fps=fps) as video:
                for frame in video_np:
                    video.append_data(frame)

    logger.warning(f"Output saved to {output_dir}")


def prepare_conditioning(
    conditioning_media_paths: List[str],
    conditioning_strengths: List[float],
    conditioning_start_frames: List[int],
    height: int,
    width: int,
    num_frames: int,
    padding: tuple[int, int, int, int],
    pipeline: LTXVideoPipeline,
) -> Optional[List[ConditioningItem]]:
    """Prepare conditioning items based on input media paths and their parameters.

    Args:
        conditioning_media_paths: List of paths to conditioning media (images or videos)
        conditioning_strengths: List of conditioning strengths for each media item
        conditioning_start_frames: List of frame indices where each item should be applied
        height: Height of the output frames
        width: Width of the output frames
        num_frames: Number of frames in the output video
        padding: Padding to apply to the frames
        pipeline: LTXVideoPipeline object used for condition video trimming

    Returns:
        A list of ConditioningItem objects.
    """
    conditioning_items = []
    for path, strength, start_frame in zip(
        conditioning_media_paths, conditioning_strengths, conditioning_start_frames
    ):
        # Check if the path points to an image or video
        is_video = any(
            path.lower().endswith(ext) for ext in [".mp4", ".avi", ".mov", ".mkv"]
        )

        if is_video:
            reader = imageio.get_reader(path)
            orig_num_input_frames = reader.count_frames()
            num_input_frames = pipeline.trim_conditioning_sequence(
                start_frame, orig_num_input_frames, num_frames
            )
            if num_input_frames < orig_num_input_frames:
                logger.warning(
                    f"Trimming conditioning video {path} from {orig_num_input_frames} to {num_input_frames} frames."
                )

            # Read and preprocess the relevant frames from the video file.
            frames = []
            for i in range(num_input_frames):
                frame = Image.fromarray(reader.get_data(i))
                frame_tensor = load_image_to_tensor_with_resize_and_crop(
                    frame, height, width
                )
                frame_tensor = torch.nn.functional.pad(frame_tensor, padding)
                frames.append(frame_tensor)
            reader.close()

            # Stack frames along the temporal dimension
            video_tensor = torch.cat(frames, dim=2)
            conditioning_items.append(
                ConditioningItem(video_tensor, start_frame, strength)
            )
        else:  # Input image
            frame_tensor = load_image_to_tensor_with_resize_and_crop(
                path, height, width
            )
            frame_tensor = torch.nn.functional.pad(frame_tensor, padding)
            conditioning_items.append(
                ConditioningItem(frame_tensor, start_frame, strength)
            )

    return conditioning_items


if __name__ == "__main__":
    main()
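As a quick sanity check on the padding rules above, the dimension rounding performed in `infer()` (heights and widths rounded up to multiples of 32, frame counts to the nearest value of the form `N * 8 + 1`) can be exercised in isolation. The helper below is a hypothetical restatement of those two formulas for illustration; it is not part of the script:

```python
# Standalone sketch of the dimension rounding used before generation:
# spatial dims are padded up to multiples of 32, and the frame count
# to the nearest value of the form N * 8 + 1 (e.g. 105, 121, 257).
def pad_dimensions(height: int, width: int, num_frames: int) -> tuple[int, int, int]:
    height_padded = ((height - 1) // 32 + 1) * 32
    width_padded = ((width - 1) // 32 + 1) * 32
    num_frames_padded = ((num_frames - 2) // 8 + 1) * 8 + 1
    return height_padded, width_padded, num_frames_padded


print(pad_dimensions(480, 704, 121))  # (480, 704, 121) -- the defaults are already aligned
print(pad_dimensions(500, 710, 100))  # (512, 736, 105)
```

Note that already-aligned inputs pass through unchanged, which is why the script's defaults (480x704, 121 frames) need no cropping afterwards.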
LTX-Video/ltx_video.egg-info/PKG-INFO ADDED
@@ -0,0 +1,305 @@
Metadata-Version: 2.4
Name: ltx-video
Version: 0.1.2
Summary: A package for LTX-Video model
Author-email: Sapir Weissbuch <sapir@lightricks.com>
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: torch>=2.1.0
Requires-Dist: diffusers>=0.28.2
Requires-Dist: transformers>=4.47.2
Requires-Dist: sentencepiece>=0.1.96
Requires-Dist: huggingface-hub~=0.25.2
Requires-Dist: einops
Requires-Dist: timm
Provides-Extra: inference-script
Requires-Dist: accelerate; extra == "inference-script"
Requires-Dist: matplotlib; extra == "inference-script"
Requires-Dist: imageio[ffmpeg]; extra == "inference-script"
Provides-Extra: test
Requires-Dist: pytest; extra == "test"
Dynamic: license-file

<div align="center">

# LTX-Video

This is the official repository for LTX-Video.

[Website](https://www.lightricks.com/ltxv) |
[Model](https://huggingface.co/Lightricks/LTX-Video) |
[Demo](https://app.ltx.studio/ltx-video) |
[Paper](https://arxiv.org/abs/2501.00103)

</div>

## Table of Contents

- [Introduction](#introduction)
- [What's new](#news)
- [Quick Start Guide](#quick-start-guide)
- [Online demo](#online-demo)
- [Run locally](#run-locally)
- [Installation](#installation)
- [Inference](#inference)
- [ComfyUI Integration](#comfyui-integration)
- [Diffusers Integration](#diffusers-integration)
- [Model User Guide](#model-user-guide)
- [Community Contribution](#community-contribution)
- [Training](#training)
- [Join Us!](#join-us)
- [Acknowledgement](#acknowledgement)

# Introduction

LTX-Video is the first DiT-based video generation model that can generate high-quality videos in *real-time*.
It can generate 24 FPS videos at 768x512 resolution, faster than it takes to watch them.
The model is trained on a large-scale dataset of diverse videos and can generate high-resolution videos
with realistic and diverse content.

The model supports text-to-image, image-to-video, keyframe-based animation, video extension (both forward and backward), video-to-video transformations, and any combination of these features.

| | | | |
|:---:|:---:|:---:|:---:|
| ![example1](./docs/_static/ltx-video_example_00001.gif)<br><details style="max-width: 300px; margin: auto;"><summary>A woman with long brown hair and light skin smiles at another woman...</summary>A woman with long brown hair and light skin smiles at another woman with long blonde hair. The woman with brown hair wears a black jacket and has a small, barely noticeable mole on her right cheek. The camera angle is a close-up, focused on the woman with brown hair's face. The lighting is warm and natural, likely from the setting sun, casting a soft glow on the scene. The scene appears to be real-life footage.</details> | ![example2](./docs/_static/ltx-video_example_00002.gif)<br><details style="max-width: 300px; margin: auto;"><summary>A woman walks away from a white Jeep parked on a city street at night...</summary>A woman walks away from a white Jeep parked on a city street at night, then ascends a staircase and knocks on a door. The woman, wearing a dark jacket and jeans, walks away from the Jeep parked on the left side of the street, her back to the camera; she walks at a steady pace, her arms swinging slightly by her sides; the street is dimly lit, with streetlights casting pools of light on the wet pavement; a man in a dark jacket and jeans walks past the Jeep in the opposite direction; the camera follows the woman from behind as she walks up a set of stairs towards a building with a green door; she reaches the top of the stairs and turns left, continuing to walk towards the building; she reaches the door and knocks on it with her right hand; the camera remains stationary, focused on the doorway; the scene is captured in real-life footage.</details> | ![example3](./docs/_static/ltx-video_example_00003.gif)<br><details style="max-width: 300px; margin: auto;"><summary>A woman with blonde hair styled up, wearing a black dress...</summary>A woman with blonde hair styled up, wearing a black dress with sequins and pearl earrings, looks down with a sad expression on her face. The camera remains stationary, focused on the woman's face. The lighting is dim, casting soft shadows on her face. The scene appears to be from a movie or TV show.</details> | ![example4](./docs/_static/ltx-video_example_00004.gif)<br><details style="max-width: 300px; margin: auto;"><summary>The camera pans over a snow-covered mountain range...</summary>The camera pans over a snow-covered mountain range, revealing a vast expanse of snow-capped peaks and valleys. The mountains are covered in a thick layer of snow, with some areas appearing almost white while others have a slightly darker, almost grayish hue. The peaks are jagged and irregular, with some rising sharply into the sky while others are more rounded. The valleys are deep and narrow, with steep slopes that are also covered in snow. The trees in the foreground are mostly bare, with only a few leaves remaining on their branches. The sky is overcast, with thick clouds obscuring the sun. The overall impression is one of peace and tranquility, with the snow-covered mountains standing as a testament to the power and beauty of nature.</details> |
| ![example5](./docs/_static/ltx-video_example_00005.gif)<br><details style="max-width: 300px; margin: auto;"><summary>A woman with light skin, wearing a blue jacket and a black hat...</summary>A woman with light skin, wearing a blue jacket and a black hat with a veil, looks down and to her right, then back up as she speaks; she has brown hair styled in an updo, light brown eyebrows, and is wearing a white collared shirt under her jacket; the camera remains stationary on her face as she speaks; the background is out of focus, but shows trees and people in period clothing; the scene is captured in real-life footage.</details> | ![example6](./docs/_static/ltx-video_example_00006.gif)<br><details style="max-width: 300px; margin: auto;"><summary>A man in a dimly lit room talks on a vintage telephone...</summary>A man in a dimly lit room talks on a vintage telephone, hangs up, and looks down with a sad expression. He holds the black rotary phone to his right ear with his right hand, his left hand holding a rocks glass with amber liquid. He wears a brown suit jacket over a white shirt, and a gold ring on his left ring finger. His short hair is neatly combed, and he has light skin with visible wrinkles around his eyes. The camera remains stationary, focused on his face and upper body. The room is dark, lit only by a warm light source off-screen to the left, casting shadows on the wall behind him. The scene appears to be from a movie.</details> | ![example7](./docs/_static/ltx-video_example_00007.gif)<br><details style="max-width: 300px; margin: auto;"><summary>A prison guard unlocks and opens a cell door...</summary>A prison guard unlocks and opens a cell door to reveal a young man sitting at a table with a woman. The guard, wearing a dark blue uniform with a badge on his left chest, unlocks the cell door with a key held in his right hand and pulls it open; he has short brown hair, light skin, and a neutral expression. The young man, wearing a black and white striped shirt, sits at a table covered with a white tablecloth, facing the woman; he has short brown hair, light skin, and a neutral expression. The woman, wearing a dark blue shirt, sits opposite the young man, her face turned towards him; she has short blonde hair and light skin. The camera remains stationary, capturing the scene from a medium distance, positioned slightly to the right of the guard. The room is dimly lit, with a single light fixture illuminating the table and the two figures. The walls are made of large, grey concrete blocks, and a metal door is visible in the background. The scene is captured in real-life footage.</details> | ![example8](./docs/_static/ltx-video_example_00008.gif)<br><details style="max-width: 300px; margin: auto;"><summary>A woman with blood on her face and a white tank top...</summary>A woman with blood on her face and a white tank top looks down and to her right, then back up as she speaks. She has dark hair pulled back, light skin, and her face and chest are covered in blood. The camera angle is a close-up, focused on the woman's face and upper torso. The lighting is dim and blue-toned, creating a somber and intense atmosphere. The scene appears to be from a movie or TV show.</details> |
69
+ | ![example9](./docs/_static/ltx-video_example_00009.gif)<br><details style="max-width: 300px; margin: auto;"><summary>A man with graying hair, a beard, and a gray shirt...</summary>A man with graying hair, a beard, and a gray shirt looks down and to his right, then turns his head to the left. The camera angle is a close-up, focused on the man's face. The lighting is dim, with a greenish tint. The scene appears to be real-life footage. Step</details> | ![example10](./docs/_static/ltx-video_example_00010.gif)<br><details style="max-width: 300px; margin: auto;"><summary>A clear, turquoise river flows through a rocky canyon...</summary>A clear, turquoise river flows through a rocky canyon, cascading over a small waterfall and forming a pool of water at the bottom.The river is the main focus of the scene, with its clear water reflecting the surrounding trees and rocks. The canyon walls are steep and rocky, with some vegetation growing on them. The trees are mostly pine trees, with their green needles contrasting with the brown and gray rocks. The overall tone of the scene is one of peace and tranquility.</details> | ![example11](./docs/_static/ltx-video_example_00011.gif)<br><details style="max-width: 300px; margin: auto;"><summary>A man in a suit enters a room and speaks to two women...</summary>A man in a suit enters a room and speaks to two women sitting on a couch. The man, wearing a dark suit with a gold tie, enters the room from the left and walks towards the center of the frame. He has short gray hair, light skin, and a serious expression. He places his right hand on the back of a chair as he approaches the couch. Two women are seated on a light-colored couch in the background. The woman on the left wears a light blue sweater and has short blonde hair. The woman on the right wears a white sweater and has short blonde hair. The camera remains stationary, focusing on the man as he enters the room. 
The room is brightly lit, with warm tones reflecting off the walls and furniture. The scene appears to be from a film or television show.</details> | ![example12](./docs/_static/ltx-video_example_00012.gif)<br><details style="max-width: 300px; margin: auto;"><summary>The waves crash against the jagged rocks of the shoreline...</summary>The waves crash against the jagged rocks of the shoreline, sending spray high into the air.The rocks are a dark gray color, with sharp edges and deep crevices. The water is a clear blue-green, with white foam where the waves break against the rocks. The sky is a light gray, with a few white clouds dotting the horizon.</details> |
70
+ | ![example13](./docs/_static/ltx-video_example_00013.gif)<br><details style="max-width: 300px; margin: auto;"><summary>The camera pans across a cityscape of tall buildings...</summary>The camera pans across a cityscape of tall buildings with a circular building in the center. The camera moves from left to right, showing the tops of the buildings and the circular building in the center. The buildings are various shades of gray and white, and the circular building has a green roof. The camera angle is high, looking down at the city. The lighting is bright, with the sun shining from the upper left, casting shadows from the buildings. The scene is computer-generated imagery.</details> | ![example14](./docs/_static/ltx-video_example_00014.gif)<br><details style="max-width: 300px; margin: auto;"><summary>A man walks towards a window, looks out, and then turns around...</summary>A man walks towards a window, looks out, and then turns around. He has short, dark hair, dark skin, and is wearing a brown coat over a red and gray scarf. He walks from left to right towards a window, his gaze fixed on something outside. The camera follows him from behind at a medium distance. The room is brightly lit, with white walls and a large window covered by a white curtain. As he approaches the window, he turns his head slightly to the left, then back to the right. He then turns his entire body to the right, facing the window. The camera remains stationary as he stands in front of the window. The scene is captured in real-life footage.</details> | ![example15](./docs/_static/ltx-video_example_00015.gif)<br><details style="max-width: 300px; margin: auto;"><summary>Two police officers in dark blue uniforms and matching hats...</summary>Two police officers in dark blue uniforms and matching hats enter a dimly lit room through a doorway on the left side of the frame. 
The first officer, with short brown hair and a mustache, steps inside first, followed by his partner, who has a shaved head and a goatee. Both officers have serious expressions and maintain a steady pace as they move deeper into the room. The camera remains stationary, capturing them from a slightly low angle as they enter. The room has exposed brick walls and a corrugated metal ceiling, with a barred window visible in the background. The lighting is low-key, casting shadows on the officers' faces and emphasizing the grim atmosphere. The scene appears to be from a film or television show.</details> | ![example16](./docs/_static/ltx-video_example_00016.gif)<br><details style="max-width: 300px; margin: auto;"><summary>A woman with short brown hair, wearing a maroon sleeveless top...</summary>A woman with short brown hair, wearing a maroon sleeveless top and a silver necklace, walks through a room while talking, then a woman with pink hair and a white shirt appears in the doorway and yells. The first woman walks from left to right, her expression serious; she has light skin and her eyebrows are slightly furrowed. The second woman stands in the doorway, her mouth open in a yell; she has light skin and her eyes are wide. The room is dimly lit, with a bookshelf visible in the background. The camera follows the first woman as she walks, then cuts to a close-up of the second woman's face. The scene is captured in real-life footage.</details> |

# News

## March 5th, 2025: New checkpoint v0.9.5
- New license for commercial use ([OpenRail-M](https://huggingface.co/Lightricks/LTX-Video/ltx-video-2b-v0.9.5.license.txt))
- Release of a new checkpoint v0.9.5 with improved quality
- Support for keyframes and video extension
- Support for higher resolutions
- Improved prompt understanding
- Improved VAE
- New online web app in [LTX-Studio](https://app.ltx.studio/ltx-video)
- Automatic prompt enhancement

## February 20th, 2025: More inference options
- Improved STG (Spatiotemporal Guidance) for LTX-Video
- Support for MPS on macOS with PyTorch 2.3.0
- Added support for an 8-bit model, LTX-VideoQ8
- Added TeaCache for LTX-Video
- Added [ComfyUI-LTXTricks](#comfyui-integration)
- Added Diffusion-Pipe

## December 31st, 2024: Research paper
- Release of the [research paper](https://arxiv.org/abs/2501.00103)

## December 20th, 2024: New checkpoint v0.9.1
- Release of a new checkpoint v0.9.1 with improved quality
- Support for STG / PAG
- Support for loading LTX-Video checkpoints in Diffusers format (conversion is done on the fly)
- Support for offloading unused parts to CPU
- Support for the new timestep-conditioned VAE decoder
- Community contributions referenced in the README
- Relaxed the transformers dependency

## November 21st, 2024: Initial release v0.9.0
- Initial release of LTX-Video
- Support for text-to-video and image-to-video generation

# Quick Start Guide

## Online inference
The model is accessible right away via the following links:
- [LTX-Studio image-to-video](https://app.ltx.studio/ltx-video)
- [Fal.ai text-to-video](https://fal.ai/models/fal-ai/ltx-video)
- [Fal.ai image-to-video](https://fal.ai/models/fal-ai/ltx-video/image-to-video)
- [Replicate text-to-video and image-to-video](https://replicate.com/lightricks/ltx-video)

## Run locally

### Installation
The codebase was tested with Python 3.10.5 and CUDA 12.2, and supports PyTorch >= 2.1.2.
On macOS, MPS was tested with PyTorch 2.3.0 and should work with PyTorch == 2.3 or >= 2.6.

```bash
git clone https://github.com/Lightricks/LTX-Video.git
cd LTX-Video

# create and activate a virtual environment
python -m venv env
source env/bin/activate
python -m pip install -e .\[inference-script\]
```

Then, download the model from [Hugging Face](https://huggingface.co/Lightricks/LTX-Video):

```python
from huggingface_hub import hf_hub_download

model_dir = 'MODEL_DIR'  # the local directory to save the downloaded checkpoint
hf_hub_download(repo_id="Lightricks/LTX-Video", filename="ltx-video-2b-v0.9.5.safetensors", local_dir=model_dir, local_dir_use_symlinks=False, repo_type='model')
```

### Inference

To use our model, please follow the inference code in [inference.py](./inference.py):

#### For text-to-video generation:

```bash
python inference.py --ckpt_path 'PATH' --prompt "PROMPT" --height HEIGHT --width WIDTH --num_frames NUM_FRAMES --seed SEED
```

#### For image-to-video generation:

```bash
python inference.py --ckpt_path 'PATH' --prompt "PROMPT" --conditioning_media_paths IMAGE_PATH --conditioning_start_frames 0 --height HEIGHT --width WIDTH --num_frames NUM_FRAMES --seed SEED
```

#### Extending a video:

📝 **Note:** Input video segments must contain a multiple of 8 frames plus 1 (e.g., 9, 17, 25, etc.), and the target frame number should be a multiple of 8.

```bash
python inference.py --ckpt_path 'PATH' --prompt "PROMPT" --conditioning_media_paths VIDEO_PATH --conditioning_start_frames START_FRAME --height HEIGHT --width WIDTH --num_frames NUM_FRAMES --seed SEED
```
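
The frame-count rules in the note above can be checked before launching a run. The sketch below is a hypothetical helper (not part of `inference.py`), mirroring the stated constraints:

```python
def valid_segment_length(num_frames: int) -> bool:
    """A conditioning video segment must contain 8*k + 1 frames (9, 17, 25, ...)."""
    return num_frames >= 9 and (num_frames - 1) % 8 == 0


def valid_target_frame(frame_idx: int) -> bool:
    """The target frame number for a segment should be a multiple of 8."""
    return frame_idx % 8 == 0
```

For example, a 17-frame segment placed at frame 24 satisfies both constraints, while a 16-frame segment does not.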

#### For video generation with multiple conditions:

You can now generate a video conditioned on a set of images and/or short video segments.
Simply provide a list of paths to the images or video segments you want to condition on, along with their target frame numbers in the generated video. You can also specify the conditioning strength for each item (default: 1.0).

```bash
python inference.py --ckpt_path 'PATH' --prompt "PROMPT" --conditioning_media_paths IMAGE_OR_VIDEO_PATH_1 IMAGE_OR_VIDEO_PATH_2 --conditioning_start_frames TARGET_FRAME_1 TARGET_FRAME_2 --height HEIGHT --width WIDTH --num_frames NUM_FRAMES --seed SEED
```

## ComfyUI Integration
To use our model with ComfyUI, please follow the instructions at [https://github.com/Lightricks/ComfyUI-LTXVideo/](https://github.com/Lightricks/ComfyUI-LTXVideo/).

## Diffusers Integration
To use our model with the Diffusers Python library, check out the [official documentation](https://huggingface.co/docs/diffusers/main/en/api/pipelines/ltx_video).

Diffusers also supports an 8-bit version of LTX-Video; [see details below](#ltx-videoq8).
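
As a quick orientation, a minimal text-to-video run with Diffusers looks roughly like the sketch below; the linked documentation is authoritative, and the prompt, resolution, and step count here are illustrative. A CUDA GPU with sufficient memory is assumed.

```python
import torch
from diffusers import LTXPipeline
from diffusers.utils import export_to_video

# Load the LTX-Video pipeline from the Hugging Face Hub (downloads weights on first use).
pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16)
pipe.to("cuda")

prompt = "A clear, turquoise river flows through a rocky canyon"
video = pipe(
    prompt=prompt,
    width=704,           # divisible by 32
    height=480,          # divisible by 32
    num_frames=161,      # of the form 8*n + 1
    num_inference_steps=50,
).frames[0]
export_to_video(video, "output.mp4", fps=24)
```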

# Model User Guide

## 📝 Prompt Engineering

When writing prompts, focus on detailed, chronological descriptions of actions and scenes. Include specific movements, appearances, camera angles, and environmental details, all in a single flowing paragraph. Start directly with the action, and keep descriptions literal and precise. Think like a cinematographer describing a shot list. Keep prompts within 200 words. For best results, build them using this structure:

* Start with the main action in a single sentence
* Add specific details about movements and gestures
* Describe character/object appearances precisely
* Include background and environment details
* Specify camera angles and movements
* Describe lighting and colors
* Note any changes or sudden events
* See the [examples](#introduction) for more inspiration

### Automatic Prompt Enhancement
When using `inference.py`, short prompts (below `prompt_enhancement_words_threshold` words) are automatically enhanced by a language model. This is supported for text-to-video and image-to-video (first-frame conditioning).

When using `LTXVideoPipeline` directly, you can enable prompt enhancement by setting `enhance_prompt=True`.
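
The thresholding described above amounts to a simple word count. The sketch below is a hypothetical illustration of that decision (the default value shown is illustrative, not the script's actual default; see `inference.py` for the real logic):

```python
def should_enhance_prompt(prompt: str, prompt_enhancement_words_threshold: int = 50) -> bool:
    # Prompts shorter than the threshold (in words) are sent to the
    # enhancement language model; longer prompts are used as-is.
    return len(prompt.split()) < prompt_enhancement_words_threshold
```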
203
+
204
+ ## ๐ŸŽฎ Parameter Guide
205
+
206
+ * Resolution Preset: Higher resolutions for detailed scenes, lower for faster generation and simpler scenes. The model works on resolutions that are divisible by 32 and number of frames that are divisible by 8 + 1 (e.g. 257). In case the resolution or number of frames are not divisible by 32 or 8 + 1, the input will be padded with -1 and then cropped to the desired resolution and number of frames. The model works best on resolutions under 720 x 1280 and number of frames below 257
207
+ * Seed: Save seed values to recreate specific styles or compositions you like
208
+ * Guidance Scale: 3-3.5 are the recommended values
209
+ * Inference Steps: More steps (40+) for quality, fewer steps (20-30) for speed
210
+
211
+ ๐Ÿ“ For advanced parameters usage, please see `python inference.py --help`

## Community Contribution

### ComfyUI-LTXTricks 🛠️

A community project providing additional nodes for enhanced control over the LTX Video model. It includes implementations of advanced techniques like RF-Inversion, RF-Edit, FlowEdit, and more. These nodes enable workflows such as Image and Video to Video (I+V2V), enhanced sampling via Spatiotemporal Skip Guidance (STG), and interpolation with precise frame settings.

- **Repository:** [ComfyUI-LTXTricks](https://github.com/logtd/ComfyUI-LTXTricks)
- **Features:**
  - 🔄 **RF-Inversion:** Implements [RF-Inversion](https://rf-inversion.github.io/) with an [example workflow here](https://github.com/logtd/ComfyUI-LTXTricks/blob/main/example_workflows/example_ltx_inversion.json).
  - ✂️ **RF-Edit:** Implements [RF-Solver-Edit](https://github.com/wangjiangshan0725/RF-Solver-Edit) with an [example workflow here](https://github.com/logtd/ComfyUI-LTXTricks/blob/main/example_workflows/example_ltx_rf_edit.json).
  - 🌊 **FlowEdit:** Implements [FlowEdit](https://github.com/fallenshock/FlowEdit) with an [example workflow here](https://github.com/logtd/ComfyUI-LTXTricks/blob/main/example_workflows/example_ltx_flow_edit.json).
  - 🎥 **I+V2V:** Enables Video to Video with a reference image. [Example workflow](https://github.com/logtd/ComfyUI-LTXTricks/blob/main/example_workflows/example_ltx_iv2v.json).
  - ✨ **Enhance:** Partial implementation of [STGuidance](https://junhahyung.github.io/STGuidance/). [Example workflow](https://github.com/logtd/ComfyUI-LTXTricks/blob/main/example_workflows/example_ltxv_stg.json).
  - 🖼️ **Interpolation and Frame Setting:** Nodes for precise control of latents per frame. [Example workflow](https://github.com/logtd/ComfyUI-LTXTricks/blob/main/example_workflows/example_ltx_interpolation.json).
228
+
229
+ ### LTX-VideoQ8 ๐ŸŽฑ <a id="ltx-videoq8"></a>
230
+
231
+ **LTX-VideoQ8** is an 8-bit optimized version of [LTX-Video](https://github.com/Lightricks/LTX-Video), designed for faster performance on NVIDIA ADA GPUs.
232
+
233
+ - **Repository:** [LTX-VideoQ8](https://github.com/KONAKONA666/LTX-Video)
234
+ - **Features:**
235
+ - ๐Ÿš€ Up to 3X speed-up with no accuracy loss
236
+ - ๐ŸŽฅ Generate 720x480x121 videos in under a minute on RTX 4060 (8GB VRAM)
237
+ - ๐Ÿ› ๏ธ Fine-tune 2B transformer models with precalculated latents
238
+ - **Community Discussion:** [Reddit Thread](https://www.reddit.com/r/StableDiffusion/comments/1h79ks2/fast_ltx_video_on_rtx_4060_and_other_ada_gpus/)
239
+ - **Diffusers integration:** A diffusers integration for the 8-bit model is already out! [Details here](https://github.com/sayakpaul/q8-ltx-video)
240
+

### TeaCache for LTX-Video 🍵 <a id="TeaCache"></a>

**TeaCache** is a training-free caching approach that leverages timestep differences across model outputs to accelerate LTX-Video inference by up to 2x without significant visual quality degradation.

- **Repository:** [TeaCache4LTX-Video](https://github.com/ali-vilab/TeaCache/tree/main/TeaCache4LTX-Video)
- **Features:**
  - 🚀 Speeds up LTX-Video inference.
  - 📊 Adjustable trade-offs between speed (up to 2x) and visual quality using configurable parameters.
  - 🛠️ No retraining required: works directly with existing models.
252
+ ### Your Contribution
253
+
254
+ ...is welcome! If you have a project or tool that integrates with LTX-Video,
255
+ please let us know by opening an issue or pull request.
256
+
257
+ # Training
258
+
259
+ ## Diffusers
260
+
261
+ Diffusers implemented [LoRA support](https://github.com/huggingface/diffusers/pull/10228),
262
+ with a training script for fine-tuning.
263
+ More information and training script in
264
+ [finetrainers](https://github.com/a-r-r-o-w/finetrainers?tab=readme-ov-file#training).
265
+
266
+ ## Diffusion-Pipe
267
+
268
+ An experimental training framework with pipeline parallelism, enabling fine-tuning of large models like **LTX-Video** across multiple GPUs.
269
+
270
+ - **Repository:** [Diffusion-Pipe](https://github.com/tdrussell/diffusion-pipe)
271
+ - **Features:**
272
+ - ๐Ÿ› ๏ธ Full fine-tune support for LTX-Video using LoRA
273
+ - ๐Ÿ“Š Useful metrics logged to Tensorboard
274
+ - ๐Ÿ”„ Training state checkpointing and resumption
275
+ - โšก Efficient pre-caching of latents and text embeddings for multi-GPU setups
276
+

# Join Us 🚀

Want to work on cutting-edge AI research and make a real impact on millions of users worldwide?

At **Lightricks**, an AI-first company, we're revolutionizing how visual content is created.

If you are passionate about AI, computer vision, and video generation, we would love to hear from you!

Please visit our [careers page](https://careers.lightricks.com/careers?query=&office=all&department=R%26D) for more information.

# Acknowledgement

We are grateful for the following awesome projects when implementing LTX-Video:
* [DiT](https://github.com/facebookresearch/DiT) and [PixArt-alpha](https://github.com/PixArt-alpha/PixArt-alpha): vision transformers for image generation.

## Citation

📄 Our tech report is out! If you find our work helpful, please ⭐️ star the repository and cite our paper.

```bibtex
@article{HaCohen2024LTXVideo,
  title={LTX-Video: Realtime Video Latent Diffusion},
  author={HaCohen, Yoav and Chiprut, Nisan and Brazowski, Benny and Shalem, Daniel and Moshe, Dudu and Richardson, Eitan and Levin, Eran and Shiran, Guy and Zabari, Nir and Gordon, Ori and Panet, Poriya and Weissbuch, Sapir and Kulikov, Victor and Bitterman, Yaki and Melumian, Zeev and Bibi, Ofir},
  journal={arXiv preprint arXiv:2501.00103},
  year={2024}
}
```