Integrating IC Lora union control with Long Video Custom Audio loop
Hi,
Thank you for your amazing work. I am trying to add IC-LoRA union control to the Long Video Custom Audio loop workflow, but the DWPose skeleton itself gets sampled into the output. The character does eventually appear, so clearly I am doing something wrong.
Here's what I have done, if I remember correctly:
- Added the ic-lora loader node with the union control lora
- Added a control image input to the loop subgraph
- Inside the subgraph I scale the control image using ltx_latent_off (from ic lora loader) * 32 as in your other workflow with ic lora
- Added 🅛🅣🅧 Add Video IC-LoRA Guide after the 🅛🅣🅧 LTXV Add Latent Guide node.
- I call LTXVCropGuides on the IC-LoRA guide first, before calling it on the Add Latent Guide (I thought that would make sense because the IC-LoRA Guide was added after the Latent Guide node).
And I think those are the big changes I have done.
I wanted to go with the ic-lora guide from the first frame till the video end but wanted to experiment with the loop first for a proof of concept.
The use case for such a workflow is common I think.
You have a person talking and gesturing, and you want to transfer the audio and body movement to a new character in LTX. I think adding a character sheet guide at -1 and then cropping it would make sense, because in my experience the character doesn't stay consistent if he rotates his head or makes significant movements.
Any help is greatly appreciated.
Should be doable.
The "tricky" part is to match the control video input (pose video) to that of the generated video.
But that's basically just using the frames covered in, say, part one, and then in the extending group starting the pose video from where it ended in the previous part.
And do it all over again, as in the first part, just starting at the correct frame.
The long video workflows basically just generate a video the regular way, then do it again, and again, and stitch them together at the end.
The challenge is getting the math right, since there are overlapping frames to continue the motion and keep consistency. Those are then chopped off when the previous video is stitched with the next one.
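To make the frame math above concrete, here is a rough Python sketch of where each part's pose video would start. It assumes every part after the first re-generates the last `overlap` frames of the previous part, and that those frames are trimmed when stitching. The function name and parameters are made up for illustration and are not taken from the workflow, so check them against how your loop actually counts frames.

```python
def pose_start_frames(total_frames: int, segment_len: int, overlap: int) -> list[int]:
    """Return the pose-video frame index where each generated part should start.

    Assumption: each part is `segment_len` frames long and shares `overlap`
    frames with the part before it.
    """
    if total_frames <= 0:
        raise ValueError("total_frames must be positive")
    if not 0 <= overlap < segment_len:
        raise ValueError("overlap must be >= 0 and smaller than segment_len")

    stride = segment_len - overlap  # new frames contributed by each later part
    starts = []
    start = 0
    while True:
        starts.append(start)
        # Stop once this part reaches (or passes) the end of the pose video.
        if start + segment_len >= total_frames:
            break
        start += stride
    return starts


# Example: 250 pose frames, 97-frame parts, 16 frames of overlap.
print(pose_start_frames(250, 97, 16))  # [0, 81, 162]
```

With these example numbers the second part starts at pose frame 81, not 97, because its first 16 frames repeat the tail of part one.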
What problems are you running into though? ComfyUI errors? or the output video not living up to expectations?
LTX is very sensitive to the pose height and width being correct (but x32 is right). With regular frames it auto-adjusts internally should the width or height be incompatible, but with a pose input it just crashes if they are wrong.
Been meaning to take a new look at those long workflows, they are a bit more messy and complicated than they need to be.
Learned a few things since, that could be applied to make them easier.
Will do as soon as i have a chance ;-)
The output video has the skeleton itself rather than controlling the character. I have probably connected something wrong.
Whole video ? or just parts of it ?
If the skeleton appears throughout the whole output video, either something is wired wrong (hard to guess where), OR the lora is not applied to the model that is fed to the sampler.
The model picker inside the subgraph (probably a GET node connected to the subgraph) should be the model + lora.
And if it's just partly skeleton mixed with regular video output, then an LTX crop node is missing, the one that crops off the guides at the end of the wf.
That being said, i had my fair share of struggles with the long video wf making it.
And an equally fair share of struggles with the IC-Union control wf. Doing both combined is a challenge, I bet.
It shows most of the first video (outside the iteration) then the skeleton, then back to the character but not really controlled by the skeleton.
Little hard to say, but that sounds a bit like a combo of both
- the model link used is probably the one without the lora connected (the lora completely removes the skeleton in the sampler preview; you should not see it there if the model with the lora is connected)
- the 2nd pass does not have an LTX crop guides node to crop off the skeleton before the upscale node (and if it's a single pass workflow, the LTX crop guides node goes before the vae decode)
Edit: looking at your screenshot, it does look like you have the crop guides there. But make sure the positive and negative noodles pass from one node to the next. This can be a bit tricky.
I might also try to make one when giving the long workflows some love soon.
Thank you for your response.
I couldn't attach the json workflow, so I uploaded the video, because it should contain my version of the workflow. I will take a look to see if the workflow is in the metadata.
I found the problem.
I didn't connect the ltx_latent_df to the latent_downscale_factor input of "🅛🅣🅧 Add Video IC-LoRA Guide". After connecting it I got an error about the latent size not being divisible by 2. I changed the video resolution to 768x448 so that both sides are divisible by 64, and it worked!
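For anyone hitting the same error, a small Python helper like the one below can snap a resolution to the nearest lower multiple of 64 before it goes into the workflow. The 64 comes from the fix reported above. Reading it as 32 pixels per latent cell times a downscale factor of 2 is an assumption drawn from this thread, not something confirmed by the node's documentation. The helper name is made up for illustration.

```python
def snap_down(value: int, multiple: int = 64) -> int:
    """Round `value` down to the nearest multiple of `multiple` (at least one multiple)."""
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    return max(multiple, (value // multiple) * multiple)


# 768x448 is already compatible and stays unchanged.
print(snap_down(768), snap_down(448))  # 768 448
# 720x480 is not; it would be snapped to 704x448.
print(snap_down(720), snap_down(480))  # 704 448
```

Rounding down rather than to the nearest multiple keeps the result inside the original frame, so the input can be cropped instead of padded.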
Ah good good.. I forgot about this one. But you figured it out ;-)

