Workflow: V2V Head Swap (experimental)
Mostly meant for a bit of fun ;-) and results can vary a lot depending on the reference video and reference image
Based around the BFS loras from @Alissonerdx, and he has some nice workflows too (hopefully he won't mind that I re-did his a little, since his workflows seem based on mine ;-) It's a bit experimental at this stage, so I might tweak it a bit later)
The workflow has a lot of nodes and can be a bit heavy, so it might not work for those with low RAM.
LTX models based on Kijai's extracted models: https://huggingface.co/Kijai/LTXV2_comfy
Extra models needed (see the download sketch after this list):
Flux Klein (used to create a first-frame reference image with the head swap): https://docs.comfy.org/tutorials/flux/flux-2-klein
Needed loras:
Flux Klein Head Swap lora (for the first frame): https://huggingface.co/Alissonerdx/BFS-Best-Face-Swap
LTX-2 Head Swap lora: https://huggingface.co/Alissonerdx/BFS-Best-Face-Swap-Video
Alternative workflows: https://huggingface.co/Alissonerdx/BFS-Best-Face-Swap-Video/tree/main/workflows
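For convenience, here's a minimal download sketch using huggingface_hub. The repo ids are the ones linked above, but the filenames are placeholders — check each repo's file list and adjust them (and the ComfyUI paths) before running.

```python
# Sketch: fetch the checkpoints into a ComfyUI models folder.
# Repo ids come from the links above; filenames are PLACEHOLDERS --
# look up the real names in each repo before running.
from huggingface_hub import hf_hub_download

COMFY_MODELS = "ComfyUI/models"  # adjust to your install path

# LTX-2 checkpoint (Kijai's extracted models)
hf_hub_download(
    repo_id="Kijai/LTXV2_comfy",
    filename="ltxv2_model.safetensors",        # placeholder filename
    local_dir=f"{COMFY_MODELS}/diffusion_models",
)

# BFS head-swap loras
hf_hub_download(
    repo_id="Alissonerdx/BFS-Best-Face-Swap",
    filename="bfs_flux_klein.safetensors",     # placeholder filename
    local_dir=f"{COMFY_MODELS}/loras",
)
hf_hub_download(
    repo_id="Alissonerdx/BFS-Best-Face-Swap-Video",
    filename="bfs_ltx2_video.safetensors",     # placeholder filename
    local_dir=f"{COMFY_MODELS}/loras",
)
```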
And use responsibly ;-) don't go making bad things
An alternative for those feeling adventurous and wanting to try new things: you could probably also get some good results by face-swapping just the first-frame image, and then using the canny or pose workflow with the input video as a control video ;-)
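A rough sketch of that control-video idea, assuming OpenCV is installed: run Canny over every frame of the input video and save the result as the control video. The thresholds (100/200) are guesses to tune per clip.

```python
# Turn the input video into a Canny edge-map control video.
import cv2

cap = cv2.VideoCapture("input.mp4")
fps = cap.get(cv2.CAP_PROP_FPS)
w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
out = cv2.VideoWriter("canny_control.mp4", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 100, 200)                   # per-frame edge map
    out.write(cv2.cvtColor(edges, cv2.COLOR_GRAY2BGR))  # writer expects 3 channels

cap.release()
out.release()
```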
If you do the extra step of running the audio through a voice changer (for example the Chatterbox ComfyUI nodes), you can swap the voice to match the new face as well.
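A minimal sketch of the remux step, assuming ffmpeg is on PATH: take the video track from the workflow output and the converted voice track, and mux them together. The filenames are placeholders.

```python
# Mux a converted voice track back onto the head-swapped video with ffmpeg.
import subprocess

subprocess.run(
    [
        "ffmpeg", "-y",
        "-i", "headswap_output.mp4",   # video from the workflow (placeholder)
        "-i", "converted_voice.wav",   # voice-changed audio (placeholder)
        "-map", "0:v", "-map", "1:a",  # video from input 0, audio from input 1
        "-c:v", "copy",                # don't re-encode the video
        "-shortest",
        "muxed_output.mp4",
    ],
    check=True,
)
```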
Reply from @Alissonerdx:
The results would likely be better if I used Qwen Image Edit 2511 to perform the head swap on the first frame. I kept Flux Klein because it’s more widely supported by most workflows, but ideally the first-frame head swap should be done with QIE.
Another point is that my training dataset contains many more landscape videos than portrait ones, even though most of my examples are portrait.
Yes, I’m using your workflow — thank you very much, by the way 😄
At the beginning, I wasn’t doing a second pass, and I wasn’t even downscaling. Without the second pass, the results actually felt more consistent. However, the downside was that I couldn’t scale the resolution properly, so I eventually had to add it. After introducing the second pass, the results started to depend a bit more on luck.
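A small helper for that downscale step, assuming LTX-2 keeps LTX-Video's documented constraints (width/height divisible by 32, frame counts of 8n+1) — verify against the model card.

```python
# Snap the first-pass resolution and frame count to values the model accepts.
# ASSUMPTION: LTX-2 uses LTX-Video's constraints (dims % 32 == 0, frames = 8n+1).
def snap_resolution(width: int, height: int, num_frames: int, scale: float = 0.5):
    w = max(32, round(width * scale / 32) * 32)
    h = max(32, round(height * scale / 32) * 32)
    f = max(9, (num_frames - 1) // 8 * 8 + 1)  # nearest valid 8n+1 at or below
    return w, h, f

# e.g. a 1080x1920 portrait clip, 121 frames, first pass at half resolution
print(snap_resolution(1080, 1920, 121))  # -> (544, 960, 121)
```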
Finally, there is a real issue where details from the original person sometimes try to reappear in the video. This happens because the only anchor point for the new head is the first frame. All the other frames still contain the original face. This is currently my biggest challenge for this first version.
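One untested idea for weakening that signal (not part of the workflow above): blur the detected face region in every input frame before the V2V pass, so the swapped first-frame image stays the only clean identity anchor. A sketch with OpenCV's bundled Haar cascade; a stronger detector would track more reliably.

```python
# Untested mitigation sketch: suppress the original face in every input frame.
import cv2

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)
cap = cv2.VideoCapture("input.mp4")
fps = cap.get(cv2.CAP_PROP_FPS)
w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
out = cv2.VideoWriter("face_suppressed.mp4", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, fw, fh) in detector.detectMultiScale(gray, 1.1, 5):
        roi = frame[y:y + fh, x:x + fw]
        frame[y:y + fh, x:x + fw] = cv2.GaussianBlur(roi, (51, 51), 0)
    out.write(frame)

cap.release()
out.release()
```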
At this stage, I really need people who are strong with LTX workflows to help explore what is possible with V1. In parallel, I’ve already started training V2.