RuneXX/LTX-2.3-Workflows · Workflow: V2V Just Dub It - lip synced multi-language dubbing with IC-Lora-LipDub

RuneXX

Owner May 11

edited May 12

Italian

Swedish

German

Spanish

V2V Just Dub It - lip synced multi-language dubbing with IC-Lora-LipDub

Translate any video with the official LTX LipDub LoRA, based on the JustDubIt paper.
LoRA available here: https://huggingface.co/Lightricks/LTX-2.3-22b-IC-LoRA-LipDub

Workflow to try here: https://huggingface.co/RuneXX/LTX-2.3-Workflows/tree/main/Video-2-Video

You need the very latest version of ComfyUI-LTXVideo (update it in ComfyUI Manager, or install it if you don't have it already).

Portland01

May 12

missing node - LTXVSetAudioRefTokens
When I click "Install Missing Nodes" in ComfyUI Manager, nothing comes up as missing. Any clue which GitHub repo this is on, so I can install it through Python? Thanks. Also, how is this different from your recut workflow? It sounds like it does exactly the same thing except for the language change.

RuneXX

Owner May 12

edited May 12

missing node - LTXVSetAudioRefTokens

Try updating ComfyUI-LTXVideo.
They made some custom tokens for this feature.

Just search for LTX in ComfyUI Manager and update, or install.
I'll add a note in the original post; I forgot about that one.
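
For anyone hitting the same "missing node" message: a ComfyUI workflow saved from the UI is plain JSON, and as far as I know each entry in its top-level `nodes` list carries the node class name in a `type` field. The sketch below lists those class names so you can see which ones your install lacks. The trimmed sample workflow and the `installed` set are made up for illustration, and nodes nested inside subgraphs or group nodes are not handled.

```python
import json
from collections import Counter


def node_types(workflow: dict) -> Counter:
    """Count node class names in a UI-format ComfyUI workflow (assumed layout)."""
    return Counter(node.get("type", "<unknown>") for node in workflow.get("nodes", []))


# Hypothetical, heavily trimmed workflow. A real one would be loaded with
# json.load() from the downloaded workflow file.
sample = json.loads("""
{"nodes": [
  {"id": 1, "type": "LTXVSetAudioRefTokens"},
  {"id": 2, "type": "BasicScheduler"},
  {"id": 3, "type": "BasicScheduler"}
]}
""")

# Hypothetical set of node classes that the local ComfyUI install provides.
installed = {"BasicScheduler"}

used = node_types(sample)
missing = sorted(set(used) - installed)
print(missing)  # ['LTXVSetAudioRefTokens']
```

A class name that shows up as missing even though the manager reports nothing usually points to an outdated node pack rather than an absent one, which matches the fix above.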

PeZiK

May 12

Hi RuneXX,

Is it possible to have custom audio with this new dub ic-lora?

ykk648

May 13

I don't see any node in this workflow that loads the IC-LoRA model.

RuneXX

Owner May 13

I don't see any node in this workflow that loads the IC-LoRA model.

Look at the top left: the Power Lora Loader.

Portland01

May 13

Thanks RuneXX. Updating ComfyUI-LTXVideo through the manager fixed it. In your opinion, if not using the language change, is there any benefit/difference in using this over your recut workflow which doesn't need a lora?

RuneXX

Owner May 13

edited May 13

Both ways work; the LTX model seems to have a built-in dubbing feature. As you said, it works even without the LoRA.
But with the LoRA the result seemed a bit more natural, more faithful to the original, and perhaps more polished. And you don't need any masking or anything like that with the LoRA.
I only did a few test runs, though.

Portland01

May 13

edited May 13

Thanks for the quick response. It does a great job with the lip sync but I find the voice completely changes from the original. Maybe I'm doing something wrong. Your retake nails it or at least stays very close to the original voice.

RuneXX

Owner May 13

edited May 13

Yes, agreed, it changes a bit. Maybe I could add a voice clone at the very end; will try that later ;-)
The wf as is, is how LTX made it, but we can always add some extras to see if it works even better.

That being said, I see they updated the LoRA. Not sure if the updated LoRA works better (but you might already have the latest, depending on when you downloaded it).
It also says version 0.9, so it might get further improvements too.

Portland01

May 13

I have the latest version and tried many different settings. Unfortunately it fails at cloning the voice. It is great for the language change.

Got another question for you, Rune. You obviously do way more testing than most when it comes to generating videos. I noticed that you tend to set all your schedulers to linear_quadratic now. Do you find this gives cleaner audio compared to the others? I also noticed you set your random noise to fixed. Is there a reason for this? I always found random for both passes to be best. Just curious about your thoughts on these two settings.

RuneXX

Owner May 13

edited May 13

tend to put all your schedulers as linear_quadratic now

I use that a lot myself, but it's usually a "hidden option" under the regular 8-step manual sigmas. I put it below the manual sigma node as an option that you can easily connect to the sampler.
The main benefit of the Basic Scheduler (with linear_quadratic or another scheduler) is that you can add more steps easily.

8 steps can be a bit optimistic, perhaps... I can't even make a decent single image in some image models with that few steps ;-)
So for higher-action scenes, or just scenes more complex than close-up portraits, LTX can benefit from a few more steps.
So that's the only reason it's there ;-)

And some workflows can really benefit from more steps almost all of the time, like the masking workflows, re-take, etc. That's probably why I left it there as the "default" (in almost all other workflows I just use the 8-step manual sigmas, the default LTX way, and leave the Basic Scheduler as an option).

Madein72

May 13

I noticed that in the official LTX workflow for the IC-LoRA LipDub, they use these manual sigmas for the 2nd pass: 0.909375, 0.725, 0.421875, 0.0 and not the "usual" ones: 0.85, 0.7250, 0.4219, 0.0

I assume this is due to extra denoising for lipdub perhaps?

RuneXX

Owner May 13

I didn't notice that part; I assumed it was the standard sigmas. Nice catch ;-) Since they changed it, it's probably for a reason. Will add it to the wf.
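
One way to look at those numbers is to compute a linear-quadratic sigma schedule by hand. The sketch below is a plain-Python version of such a schedule. The formula and its defaults (`threshold_noise=0.025`, half of the steps linear) are an assumption about how the linear_quadratic scheduler behaves, not something confirmed in this thread. With 8 steps, its last four values come out as 0.909375, 0.725, 0.421875, 0.0, the same as the official second-pass sigmas quoted above, and asking for more steps simply yields a longer list.

```python
def linear_quadratic_sigmas(steps, threshold_noise=0.025, linear_steps=None):
    """Return steps + 1 sigmas from 1.0 down to 0.0 (assumed formula, see text)."""
    if steps == 1:
        return [1.0, 0.0]
    if linear_steps is None:
        linear_steps = steps // 2
    # Linear ramp of the "noise removed so far" up to threshold_noise.
    linear = [i * threshold_noise / linear_steps for i in range(linear_steps)]
    # Quadratic segment that continues from threshold_noise and reaches 1.0.
    diff = linear_steps - threshold_noise * steps
    quad_steps = steps - linear_steps
    quad_coef = diff / (linear_steps * quad_steps ** 2)
    lin_coef = threshold_noise / linear_steps - 2 * diff / (quad_steps ** 2)
    const = quad_coef * linear_steps ** 2
    quadratic = [quad_coef * i ** 2 + lin_coef * i + const
                 for i in range(linear_steps, steps)]
    removed = linear + quadratic + [1.0]
    # Sigma is the noise still remaining at each step.
    return [1.0 - x for x in removed]


eight = linear_quadratic_sigmas(8)
print([round(s, 6) for s in eight[-4:]])  # [0.909375, 0.725, 0.421875, 0.0]

twelve = linear_quadratic_sigmas(12)
print(len(eight), len(twelve))  # 9 13
```

If that assumption holds, 0.909375 is just the tail of the plain 8-step schedule, and the "usual" 0.85 start is the hand-adjusted value.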

Xilsac

May 13

RuneXX, is it possible to use audio files instead of a prompt?

RuneXX

Owner May 14

edited May 14

RuneXX, is it possible to use audio files instead of a prompt?

It kind of does already. It uses the audio from the video input. The prompting part is just transcribing the audio in the video into another language (dubbing).
Do you mean a silent video input with sound added?

Xilsac

May 14

RuneXX, is it possible to use audio files instead of a prompt?

It kind of does already. It uses the audio from the video input. The prompting part is just transcribing the audio in the video into another language (dubbing).
Do you mean a silent video input with sound added?

Yes, a silent video input with sound added. Just like in the workflow LTX-2.3_-_V2V_Just_Talk_custom_audio_lip-synced_to_any_video.json, but with the LipDub LoRA and the corresponding LipDub workflow.

RuneXX

Owner May 15

Yes, a silent video input with sound added. Just like in the workflow LTX-2.3_-_V2V_Just_Talk_custom_audio_lip-synced_to_any_video.json, but with the LipDub LoRA and the corresponding LipDub workflow.

Will give it a try. The reference audio would then have to be masked in (as with any custom audio workflow).
I suspect the LipDub LoRA will then not "hear" the audio, but it could be that it works... will try ;-)

Xilsac

May 15

Yes, a silent video input with sound added. Just like in the workflow LTX-2.3_-_V2V_Just_Talk_custom_audio_lip-synced_to_any_video.json, but with the LipDub LoRA and the corresponding LipDub workflow.

Will give it a try. The reference audio would then have to be masked in (as with any custom audio workflow).
I suspect the LipDub LoRA will then not "hear" the audio, but it could be that it works... will try ;-)

Thank you for your feedback!

PeZiK

May 15

Hi, I don't really know what I am doing but I got it to work with custom audio. I just piped the custom audio in as the original audio and did some other spaghetti work - see the attached pic for reference.

PeZiK

May 15

I'm trying to figure out how to amplify the 'Lipsync Dub Lora 0.9' because it's too weak with my current LoRA setup. My characters barely open their mouths. I tried adding a 'Latent Multiply' node after 'Set Latent Noise Mask', but it introduces artifacts.

Since the Lipsync LoRA's strength maxes out at 1.0 - does anyone have ideas on how to boost the lipsync effect?

RuneXX

Owner May 15

edited May 15

Hi, I don't really know what I am doing but I got it to work with custom audio.

That looks all good from a quick look... ;-) I guess it works then. I haven't had a chance to test it yet.

I'm trying to figure out how to amplify the 'Lipsync Dub Lora 0.9' because it's too weak with my current LoRA setup. My characters barely open their mouths.

I don't think you can "amplify" the latent.
What you can try is increasing the steps. Under the manual sigma node I usually "hide" the Basic Scheduler node, so you can connect that as the sigmas to the sampler instead.
And then easily adjust the steps. Try something like 10-15 steps. That should improve it... hopefully.

The prompt also matters, so in the prompt write something like: And then the man talks, and he says: "...... transcribe the words spoken to the language of choice.."

That should hopefully fix it. And if not, you can also adjust the CFG; try setting it to between 1.5 and 3.

abhirajtulsyan

28 days ago

In this workflow, if I add text, will the guy say it rather than dubbing it? And will it work if there are two people?

RuneXX

Owner 28 days ago

In this workflow, if I add text, will the guy say it rather than dubbing it? And will it work if there are two people?

Haven't tried it myself, but I'm pretty sure that will work fine.
Just reference each person in the prompt: The woman to the left says: "como estas". Then the man to the right says: "estoy bien"... etc.
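
The per-speaker prompt pattern described above can be assembled with a few lines of Python. This is only a string-formatting sketch: the function name and the (speaker, line) tuple format are made up here, and the wording follows the example in this reply rather than any documented prompt format.

```python
def build_dub_prompt(lines):
    """Join (speaker description, spoken line) pairs into one prompt string."""
    parts = []
    for index, (speaker, text) in enumerate(lines):
        if index == 0:
            # First speaker opens the prompt, so capitalize the description.
            subject = speaker[0].upper() + speaker[1:]
        else:
            # Later speakers are chained in order with "Then".
            subject = "Then " + speaker
        parts.append(f'{subject} says: "{text}".')
    return " ".join(parts)


prompt = build_dub_prompt([
    ("the woman to the left", "como estas"),
    ("the man to the right", "estoy bien"),
])
print(prompt)
# The woman to the left says: "como estas". Then the man to the right says: "estoy bien".
```

Describing each speaker by position or appearance, as in the example, is what tells the model who should be talking at each point.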

abhirajtulsyan

25 days ago

edited 25 days ago

Okay, will check it out.

abhirajtulsyan

25 days ago

The voice changes a lot; it shifts the tone and voice completely, and the lip movement also isn't coming out properly. I am using the default workflow you provided, with no changes.

RuneXX

Owner 25 days ago

The voice changes a lot; it shifts the tone and voice completely, and the lip movement also isn't coming out properly. I am using the default workflow you provided, with no changes.

Yes, there is a bit of a change in the voice. The lip sync should be OK though.
In the "old" wf, before this LoRA, I added a voice clone at the very end that took the LTX-dubbed audio and ran it through a voice clone with a reference, and the output became quite a bit closer to the original voice.
Will see if I can add that to this wf as well, as an option.

For lip sync, perhaps try more steps if it struggles with some generations (I'll double-check whether there is any error, like an FPS mismatch, just in case...).
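
To see why an FPS mismatch would hurt lip sync: the audio keeps its real duration, while the video's duration is the frame count divided by whatever frame rate the workflow assumes. A small arithmetic sketch with made-up numbers (150 frames, a 30 fps source, 25 fps assumed by the workflow):

```python
def sync_drift_seconds(num_frames: int, source_fps: float, assumed_fps: float) -> float:
    """Seconds by which the video runs long (+) or short (-) versus its audio at the last frame."""
    audio_duration = num_frames / source_fps   # audio follows the source timing
    video_duration = num_frames / assumed_fps  # video as re-timed by the assumed fps
    return video_duration - audio_duration


drift = sync_drift_seconds(num_frames=150, source_fps=30.0, assumed_fps=25.0)
print(f"{drift:.2f} s")  # 1.00 s
```

A full second of drift over a five-second clip is far more than lip sync can tolerate, so checking that the loader, the workflow, and the video output all use the same frame rate is a cheap first test.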

abhirajtulsyan

24 days ago

Okay let me try.

abhirajtulsyan

24 days ago

In the "old" wf, before this LoRA, I added a voice clone at the very end that took the LTX-dubbed audio and ran it through a voice clone with a reference, and the output became quite a bit closer to the original voice.
Will see if I can add that to this wf as well, as an option.

Can you add it to this one too? Because right now the voice is just very, very different.

abhirajtulsyan

24 days ago

Also, for me the lip movement is just not right. I am using the exact workflow, with a video in English, and I'm trying to make it say a different dialogue, so I'm typing it out.
But the lip movements are really very inaccurate.

RuneXX

Owner 23 days ago

Can you add it to this one too? Because right now the voice is just very, very different.

Yes, will add. It can be optional, just a simple toggle to turn it on or off.
It does help quite a bit, especially with voices you already know well.

Also, for me the lip movement is just not right.

In fairness, it's an early version of the LoRA, a beta of sorts... v0.9.
But it seemed to be OK. I'll do a few test runs here too... and at least rule out any FPS mismatch.