Alissonerdx committed on
Commit
9f6b761
·
0 Parent(s):

Duplicate from Alissonerdx/BFS-Best-Face-Swap-Video


Co-authored-by: Alisson Pereira Anjos <Alissonerdx@users.noreply.huggingface.co>

.gitattributes ADDED
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
ltx-2/ filter=lfs diff=lfs merge=lfs -text
examples/ filter=lfs diff=lfs merge=lfs -text
workflows/workflow_head_swap_drag_and_drop.png filter=lfs diff=lfs merge=lfs -text
README.md ADDED
---
license: other
license_name: ltx-2-community-license-agreement
tags:
- ltx-2
- ic-lora
- head-swap
- video-to-video
- image-to-video
- bfs
- lora
base_model:
- Lightricks/LTX-2
library_name: diffusers
pipeline_tag: image-to-video
---

## ⚠️ Ethical Use & Disclaimer

This model is a technical tool designed for **Digital Identity Research, Professional VFX Workflows, and Cinematic Prototyping.**

By downloading or using this LoRA, you acknowledge and agree to the following:

* **Intended Use:** Designed for filmmakers, VFX artists, and researchers exploring high-fidelity video identity transformation.
* **Consent & Rights:** You must possess explicit legal consent and all necessary rights from any individual whose likeness is being processed.
* **Legal Compliance:** You are fully responsible for complying with all local and international laws regarding synthetic media.
* **Liability Waiver:** This model is provided *“as is.”* **As the creator (Alissonerdx), I assume no responsibility for misuse.** Any legal, ethical, or social consequences are solely the responsibility of the end user.

---

# 📺 Video Examples (V1)

Generated using the **Frame 0 Anchoring Technique**.
All examples follow the guide video's motion while preserving the identity provided in the first frame.

| Example 1 | Example 2 |
| --- | --- |
| <video src="https://huggingface.co/Alissonerdx/BFS-Best-Face-Swap-Video/resolve/main/examples/1.mp4" controls autoplay loop muted></video> | <video src="https://huggingface.co/Alissonerdx/BFS-Best-Face-Swap-Video/resolve/main/examples/2.mp4" controls autoplay loop muted></video> |

| Example 3 | Example 4 |
| --- | --- |
| <video src="https://huggingface.co/Alissonerdx/BFS-Best-Face-Swap-Video/resolve/main/examples/3.mp4" controls autoplay loop muted></video> | <video src="https://huggingface.co/Alissonerdx/BFS-Best-Face-Swap-Video/resolve/main/examples/4.mp4" controls autoplay loop muted></video> |

| Example 5 |
| --- |
| <video src="https://huggingface.co/Alissonerdx/BFS-Best-Face-Swap-Video/resolve/main/examples/5.mp4" controls autoplay loop muted></video> |

---

# 🛠 Technical Background (V1)

To achieve this level of identity transfer, I **heavily modified the official LTX-2 training scripts**.

### Key Improvements

* **Novel Conditioning Injection:** Custom latent-injection methods for reference-identity stabilization.
* **Noise Distribution Overhaul:** Implemented a **custom high-noise power-law timestep distribution**, forcing the model to prioritize target-identity reconstruction over guide-video context.
* **Training Compute:** 60+ hours of training on **NVIDIA RTX PRO 6000 Blackwell GPUs**, iterating through 300 GB+ of experimental checkpoints.
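
To make the high-noise bias concrete, here is a minimal sketch of a power-law timestep sampler concentrated near maximum noise. The exponent `alpha` and the exact parameterization are illustrative assumptions; the distribution actually used for this LoRA's training is not published.

```python
import numpy as np

def sample_timesteps(batch, alpha=3.0, rng=None):
    """Draw timesteps t in [0, 1] with density concentrated near t = 1
    (high noise) when alpha > 1. `alpha` is an illustrative placeholder,
    not the value used to train this LoRA."""
    rng = rng if rng is not None else np.random.default_rng()
    u = rng.random(batch)    # uniform samples in [0, 1)
    return 1.0 - u ** alpha  # u**alpha piles up near 0, so t piles up near 1
```

Biasing training toward high-noise steps pushes the model to commit to the target identity early in denoising rather than copying structure from the guide video.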

---

# 📊 Dataset Specifications

## V1 Dataset

* **300 high-quality head-swap video pairs**
* Trained on **512x512 buckets**
* Primarily **landscape format**
* Optimized for **close-up framing**

Wide shots may reduce identity fidelity.

---

# 💡 Inference Guide (V1)

## 🔴 CRITICAL — Frame 0 Requirement

This version was trained to use **Frame 0 as the identity anchor**.

You MUST prepare the first frame correctly.

### Recommended Workflow

1. Perform a high-quality head swap on Frame 0.
2. Use that processed frame as the conditioning input.
3. Run the full video generation.

For best results, prepare Frame 0 using my previous **BFS Image Models**.
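
The three steps above can be sketched as a simple pipeline. All function names below are hypothetical stand-ins (the real head-swap model and LTX-2 pipeline are not shown); only the data flow matters.

```python
import numpy as np

# Hypothetical stand-ins for the real components; names are illustrative.
def head_swap_image(frame, identity):
    """Step 1 placeholder: swap the head in `frame` using the `identity` photo."""
    return frame.copy()  # a real implementation composites the new head here

def generate_video(first_frame, guide_frames, prompt):
    """Steps 2-3 placeholder: video generation conditioned on the anchor frame."""
    return [first_frame] + [f.copy() for f in guide_frames[1:]]

guide = [np.zeros((512, 512, 3), np.uint8) for _ in range(8)]  # guide video frames
identity = np.zeros((512, 512, 3), np.uint8)                   # new-face reference

anchor = head_swap_image(guide[0], identity)         # 1. swap Frame 0
frames = generate_video(anchor, guide, "head swap")  # 2-3. condition + generate
```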
92
+
93
+ ---
94
+
95
+ ## Optimization
96
+
97
+ ### LoRA Strength
98
+
99
+ * **1.0** → Best motion fidelity
100
+ * **>1.0** → Stronger identity & hair capture but may distort original motion
101
+
102
+ ### Multi-Pass Workflows
103
+
104
+ You can experiment with multiple passes using different strengths.
105
+
106
+ ### Prompting
107
+
108
+ Detailed prompts currently have **no effect**.
109
+
110
+ Trigger remains:
111
+
112
+ ```
113
+ head swap
114
+ ```
115
+
116
+ ---

# ⚠️ Known Issues (V1 – Alpha)

* **Identity Leakage:** Hair from the guide video may reappear.
* **Hard Cuts:** Jump cuts can reset the identity.
* **Portrait Format:** Performance is significantly better in landscape.

---

# 🚀 Version 2 – Major Update

V2 introduces a **complete redesign of the conditioning strategy and masking logic**, significantly improving identity robustness and reducing leakage.

---

## 🔹 Multiple Conditioning Modes (Using First Frame)

V2 supports multiple identity-injection approaches:

### 1️⃣ Direct Photo Conditioning

Use a clean photo of the new face as the reference input.

This method works and can produce strong results. However, because the model must internally reconcile lighting, perspective, depth, and occlusion differences, it may need to "fight" to correctly integrate the new identity into the guide video. In some cases, this can reduce stability or identity consistency.

### 2️⃣ First-Frame Head Swap (Recommended)

Applying a proper head swap on Frame 0 still produces **extremely strong and reliable results**.

Because the first frame is already structurally correct (pose, lighting, depth, occlusions), the model has significantly less work to do. Instead of forcing alignment from a static photo, it simply propagates and stabilizes the identity through time.

This approach generally:

* Produces higher identity fidelity
* Reduces deformation
* Minimizes integration artifacts
* Improves overall temporal stability

### 3️⃣ Automatic Magazine-Style Overlay

The new face is automatically cut out and positioned over the guide face using mask alignment. This simulates a "magazine cutout" overlay, performed automatically based on mask positioning.

### 4️⃣ Manual Overlay

Advanced users may manually composite the new face over Frame 0 before running inference.

---

## 🔹 Facial Motion Behavior (Important Change)

Unlike V1, **V2 does NOT follow the original guide face's facial micro-movements.**

The guide face is fully masked to prevent identity leakage. This makes masking quality critical.

### Mask Requirements

* The guide face MUST be completely covered.
* The mask color must be a **magenta tone**.
* Any visible guide identity may leak into the final output.
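
As a minimal illustration of these requirements, the sketch below paints an opaque magenta block over the guide face's bounding box with NumPy. The exact magenta value and the rectangular shape are assumptions for illustration; what matters is fully opaque, hole-free coverage of the guide face.

```python
import numpy as np

MAGENTA = np.array([255, 0, 255], dtype=np.uint8)  # assumed RGB magenta tone

def mask_guide_face(frame: np.ndarray, box: tuple) -> np.ndarray:
    """Cover the (x0, y0, x1, y1) face region with opaque magenta.
    Full, hole-free coverage is what prevents identity leakage."""
    x0, y0, x1, y1 = box
    out = frame.copy()
    out[y0:y1, x0:x1] = MAGENTA  # no gaps, no transparency
    return out

frame = np.full((512, 512, 3), 128, dtype=np.uint8)  # stand-in guide frame
masked = mask_guide_face(frame, (128, 64, 384, 320))
```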

---

## 🔹 Mask Types

Users may alternate between:

### ▪ Square Masks

* More stable identity
* Better consistency
* Often produce stronger overall results
* May generate slightly oversized heads due to spatial padding

In most scenarios, square masks tend to perform better because they provide additional spatial context for the model to reconstruct structure and hair.

### ▪ Tight / Adjusted Masks

* More natural head proportions
* May deform if the guide head shape differs significantly
* Sensitive to long-hair mismatches

If the original guide has long hair and the new identity does not, the risk of deformation increases.
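
To illustrate the spatial-padding trade-off, here is a small helper that expands a tight face box into a padded square clamped to the frame bounds. The padding amount is an arbitrary illustrative choice, not a documented parameter of this workflow.

```python
def square_mask_box(box, pad, width, height):
    """Expand a tight (x0, y0, x1, y1) face box into a padded square,
    clamped to the frame. The extra context helps the model rebuild
    structure and hair, at the cost of possibly oversized heads."""
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) // 2, (y0 + y1) // 2
    half = max(x1 - x0, y1 - y0) // 2 + pad  # square side from the longer edge
    return (max(cx - half, 0), max(cy - half, 0),
            min(cx + half, width), min(cy + half, height))
```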

---

## 🔹 Dataset & Training Improvements (V2)

* **800+ video pairs**
* Trained at **768 resolution**
* 768 is the recommended inference resolution
* Improved hair stability
* Reduced identity leakage compared to V1
* More robust identity transfer under motion

---

## 🔹 First Pass vs Second Pass

You may:

* Run a single pass at 768 (recommended)
* Or run a downscaled first pass followed by a second upscale pass

⚠️ Important: a second pass may alter the identity from the first pass and reduce consistency in some cases.
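
If you do use two passes, the resolution bookkeeping can be sketched as below: scale the short side for each pass while preserving aspect ratio. Rounding to multiples of 32 is an assumed latent-size constraint, not a documented requirement; verify against your actual pipeline.

```python
def pass_resolutions(width, height, first_short=512, final_short=768):
    """Return (w, h) for the first and second pass, scaling the short side
    and keeping the aspect ratio. The multiple-of-32 snap is an assumption."""
    def fit(short_side):
        scale = short_side / min(width, height)
        snap = lambda v: int(round(v * scale / 32)) * 32
        return snap(width), snap(height)
    return fit(first_short), fit(final_short)
```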

---

## 🔹 Trigger

The trigger remains:

```
head swap
```

---

# 🎬 Upcoming Demonstration Video

A full workflow breakdown will be shared soon, covering:

* Mask preparation best practices
* A comparison of the conditioning variations
* First-pass vs second-pass differences
* Failure cases and correction strategies

---

# 🔴 Critical Success Factor (V2)

In this new version, the **single most important factor is mask quality**. Everything depends on the mask.

* Absolutely no detail from the original guide face can leak.
* There must be no visible facial fragments.
* Avoid small holes, gaps, or transparency artifacts.
* Ensure full coverage of skin, facial hair, eyebrows, and hairline where necessary.

If any portion of the original identity remains visible, the model may reintroduce it during generation.

Mask precision directly determines:

* Identity stability
* Leakage prevention
* Deformation resistance
* Overall realism

Take time to refine your mask. A high-quality mask will produce dramatically better results than increasing LoRA strength.

---

## 🔧 Advanced Technique: Combine with LTX-2 Inpainting

Advanced users can experiment with combining this LoRA with the native **LTX-2 inpainting workflow**.

This can help:

* Refine problematic areas
* Correct small deformation zones
* Improve edge blending
* Recover detail in hair or jaw regions

When properly combined, inpainting can significantly enhance final output quality, especially in challenging frames.

---

# 💙 Support

Maintaining R&D and renting Blackwell GPUs is expensive. If this project helps you, consider supporting the development of:

* V3 improvements
* Advanced conditioning pipelines
* SAM 3 integration
* Full reference-photo-only workflows

Support here: [https://buymeacoffee.com/nrdx](https://buymeacoffee.com/nrdx)
examples/1.mp4 ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:bb5008a5750697b2ba6a105813455861c8f84949762a9a919c4f423b60bfc124
size 30056875
examples/2.mp4 ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:fd30caefa2bbefe1bdc8927855bf472dce67bdb83e25eaa46255f073f4f79885
size 31230709
examples/3.mp4 ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:545e6962243768f4a448f92e8ae72037d3f36a5d26eb2684595e79b045dbe500
size 66240119
examples/4.mp4 ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:1166856da2aee972247473feca89a3d2f31a90d8bbeeb882de3c5664f07fa1e2
size 38867419
examples/5.mp4 ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:7437dfd7d03ec41af05d1fa71b2d3b1fe107f4fa87b8e38ebc66b573ec16c66d
size 38173397
ltx-2/download-models-head-swap-ltx2-windows.ps1 ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:40f0dded61a782ccdc19c30d9ead9bef69a5a53a05a6fc596ee7e02efedbde97
size 4146
ltx-2/download-models-head-swap-ltx2.sh ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:6427e10f9a4d9141c728ebef8e9c0908700bd94726be53effc85ecb99a78423b
size 5119
ltx-2/head_swap_v1_13500_first_frame.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:056373cf73418dac449fecf34a5b749deeb802a1a4a4a9fc1677cd46c2d48864
size 1308756368
ltx-2/head_swap_v1_8750_first_and_last_frame.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:48d16179c82629385fb5a812ed33182952e7755a60464ffe645d3417f5d48a71
size 1308756368
ltx-2/head_swap_v2_multimodes.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:f459e03568447dcc5d6ea7c02466fece5eee3409bb23f07dfc2ecab24ac7a2fa
size 1316096704
workflows/workflow_ltx2_head_swap_drag_and_drop.json ADDED
The diff for this file is too large to render. See raw diff
 
workflows/workflow_ltx2_head_swap_drag_and_drop_v1.1.json ADDED
The diff for this file is too large to render. See raw diff
 
workflows/workflow_ltx2_head_swap_drag_and_drop_v2.0.json ADDED
The diff for this file is too large to render. See raw diff