Works well

#1
by RuneXX - opened

Not sure if it's my imagination or not, but it seems to run a bit easier on the computer when split into more "regular" models.
And the workflow is slightly more familiar that way. Thanks for that ;-)

(I didn't render the video in full res, just made it for fun ;-))

Yeah, many people on a Discord server said it works better.

I took the WF from the video and got an error on the audio VAE. What could the error be? Both VAE files are located in the vae folder.

'VAE' object has no attribute 'latent_frequency_bins'
image

KJNodes is probably not updated; I had to add some code to allow loading the audio VAE with it.

@RuneXX you are here also, just want to say we won

Can someone share a working workflow please?

I'm also getting the error "'VAE' object has no attribute 'latent_frequency_bins'". I tried with the Load VAE node as well, but then I get a different error:

"raise RuntimeError("ERROR: VAE is invalid: None\n\nIf the VAE is from a checkpoint loader node your checkpoint does not contain a valid VAE.")
RuntimeError: ERROR: VAE is invalid: None"

image

Currently, for the audio VAE you can use the "LTXV Audio VAE Loader" node. Make sure to keep the audio VAE in the checkpoints folder for now; in a few days the normal VAE loader will work too.

Awesome, it works fine. Thanks!

Gotta check out what's going on ;-) hehe
And yes, it works pretty well and so fast. Pretty impressive ;-)

(And I thought I had no chance on a 3090, but it works like a charm ;-)

Also curious what's cooking at KJNodes ;-) Saw some LTX-related stuff.
We'll wait and see ;-)

Update KJNodes even if it says there is no update available; that solved the audio VAE problem for me.

Works great! Where's the non-distilled full version of the workflow?

Hi, can no one share a simple working workflow?
Thanks in advance.

Download one of the videos and drag and drop the video into ComfyUI. The workflows are embedded in the videos. Enjoy!

Awesome, works great too.

With the DISTILLED version I got it working most of the time, but with the NON-distilled versions it is not working. Am I obligated to use the LoRA in this case?

Just use the LoRA, since you save on disk space as well; for distillations I would always recommend LoRAs.

No no, my question is: if I'm using the NON-distilled version as the main model, do I need to use the distilled LoRA for the main model to work, or not?

Yes, it will work. Using the distilled LoRA on the dev version basically turns the dev model into the distilled model during inference.

But we also have the distilled main model. The point of using the NON-distilled model is to get better quality at 20 steps... it doesn't make sense. In that case I'd prefer to use the distilled model with no LoRA. I'm lost now...

Just use the distilled LoRA on the dev model if you want to generate fast. If you want higher quality, remove the LoRA.

Ah, now it makes sense. So it's not an obligation to use it to get a result; it's just optional then... OK, thanks.

The reason I am asking whether I need the LoRA or not is that after 20 steps of rendering I am not getting results like the ones here. I leave EVERYTHING at default when I drag and drop the video into ComfyUI; I don't change anything, just like RuneXX did, and THESE are my results...

Very strange then, since I'm using the same seed as RuneXX.

Something might be bugged in your WF. Try this one: https://github.com/Comfy-Org/workflow_templates/blob/main/templates/video_ltx2_i2v_distilled.json

But I am using the same workflow as RuneXX, the one from the video... it doesn't make sense... here, look again... And also, you are sharing a DISTILLED workflow... I want a NON-distilled WF in which I can use the Kijai models... do you have a non-distilled one?

https://github.com/Comfy-Org/workflow_templates/blob/main/templates/video_ltx2_i2v.json

Yes, those are the ones from the Comfy templates; then I need to change many nodes to the Kijai nodes, that's fine... Thanks for replying... those are the WFs I have been using since I mentioned the errors. No harm, have a great day...

You could also try these workflows, and simply swap out the model loading with the KJ model loader nodes, exactly the same way as in your existing workflows:
https://github.com/Lightricks/ComfyUI-LTXVideo/tree/master/example_workflows

I had quite nice results with these; they might work better. But the native ones also work for me.

(If you don't already have the ComfyUI-LTXVideo nodes, you need to install them first if you want to try those workflows.)

A little GGUF test run, works great ;-) Might help those who get OOM on lower-RAM systems.

NB! This support is not out yet, but will be soon: https://github.com/city96/ComfyUI-GGUF/pull/399

Sorry guys, but the Kijai version is very bad.

Back to the template fp8 version with the same prompt and seed:

But you are using KJ nodes also... and combining them with the native nodes? WTH? lol. But nice result...

The first one is with the distilled model, which is meant to be used with CFG 1 and a low step count like 6-8; the scheduler is also different.

I tried your prompt as T2V without an image, and it only gives a still image with a zoom-in effect. I had to mention that she is talking and says ... to make it work.

Because you are using the distilled model with CFG there...

I need to officially apologize! The Kijai version works, and the result is indeed good. CFG 1.0 and 4 steps work fine. Bloody fast: 23 s for 121 frames at 1280x704. The only things I changed are that I bypassed the 0.5 upscale node as well as the post-upscale group nodes, and additionally added a Basic Scheduler with beta57. Different, but still a good result.

For the static zoom image "issue", I think it all comes down to the prompt.
Apparently you should not prompt what you already see; it's not like image-generation AI.

Instead, focus on the sequence of actions that should happen.

Basically something like this:

  • Style first (cinematic, anime, etc.)
  • Set the scene (atmosphere, color tones, lighting, etc.)
  • ACTION (important): describe the sequence of things to happen, including dialogue
  • Camera (optional): how the camera should act/move
  • Audio (optional): background sounds, music, etc.

LTX-2 Prompting Tips (from the creators of LTX)
Core Actions: Describe events and actions as they occur over time
Audio: Describe sounds and dialogue needed for the scene
Reference Image: Do not repeat details already present
Consistency: Avoid instructions that do not match the reference image, as this will degrade results

I collected some of the LTX-2 prompt guides and instructions (if you use the prompt enhancer) here: https://github.com/kijai/ComfyUI-KJNodes/issues/489#issuecomment-3730593217

Hopefully that helps. It's just a theory, but I have not had a static image happen since ;-) (It could of course also be due to updated models, but prompting the "correct way" won't hurt.)
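The section ordering above can be sketched as a tiny helper. This is a hypothetical snippet (the function name and parameters are mine, not from ComfyUI or any LTX tooling) that just concatenates the sections in the recommended order, so you can keep the style/scene/action/camera/audio parts separate and reuse them:

```python
# Hypothetical helper: joins prompt sections in the recommended order
# (style -> scene -> action -> camera -> audio). Not part of any library.
def build_ltx2_prompt(style, scene, action, camera=None, audio=None):
    """Join the non-empty sections into one paragraph, one sentence each."""
    parts = [style, scene, action]
    if camera:
        parts.append(camera)
    if audio:
        parts.append(audio)
    # Ensure each section ends with exactly one period.
    return " ".join(p.strip().rstrip(".") + "." for p in parts)

prompt = build_ltx2_prompt(
    style="Cinematic, shallow depth of field",
    scene="A dim kitchen at night, warm tungsten light from a single lamp",
    action='A woman turns from the window, smiles, and says "We made it"',
    camera="Slow dolly-in toward her face",
    audio="Soft rain against the glass, a distant hum of traffic",
)
print(prompt)
```

Note the action section describes an event over time (turning, smiling, speaking) rather than restating what a reference image would already show, in line with the tips above.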
