How to use from the Diffusers library
# Install dependencies first (shell command): pip install -U diffusers transformers accelerate
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import load_image, export_to_video

# switch to "mps" for apple devices
pipe = DiffusionPipeline.from_pretrained("Wan-AI/Wan2.1-FLF2V-14B-720P", dtype=torch.bfloat16, device_map="cuda")
pipe.load_lora_weights("maxwelljones14/refVFX-lora")

prompt = "A man with short gray hair plays a red electric guitar."
input_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/guitar-man.png")

# .frames[0] holds the frames of the first generated video
output = pipe(image=input_image, prompt=prompt).frames[0]
export_to_video(output, "output.mp4")

refVFX LoRA

LoRA adapter for Wan-AI/Wan2.1-FLF2V-14B-720P trained to transfer a temporal visual effect from a reference video onto a separate input image or video.

Files

File                   Description
epoch-0.safetensors    LoRA model.

Training

  • Base model: Wan-AI/Wan2.1-FLF2V-14B-720P
  • LoRA rank: 1024
  • Target modules: q, k, v, o, ffn.0, ffn.2 (applied to the DiT)
  • Learning rate: 4e-5, 200-step linear warmup
  • Frames per clip: 33
  • Max pixels: 399,360
  • Optimizer parallelism: DeepSpeed ZeRO-1, 8 ranks
  • CFG dropout: p_drop_ref = 0.05, p_drop_control_video = 0.05
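
To make a few of the hyperparameters above concrete, here is a minimal sketch of how they might be applied in a training loop. It is not taken from the refVFX trainer. The function names, the constant learning rate after warmup, the rounding of dimensions down to a multiple of 16, and the independent per-sample dropout of each condition are all assumptions made for illustration.

```python
import math
import random

PEAK_LR = 4e-5                # from the list above
WARMUP_STEPS = 200            # from the list above
MAX_PIXELS = 399_360          # from the list above (equals 832 * 480)
P_DROP_REF = 0.05             # from the list above
P_DROP_CONTROL_VIDEO = 0.05   # from the list above


def lr_at_step(step: int) -> float:
    """Linear warmup from 0 to PEAK_LR; constant afterwards (assumed)."""
    if step >= WARMUP_STEPS:
        return PEAK_LR
    return PEAK_LR * step / WARMUP_STEPS


def fit_to_max_pixels(width: int, height: int, multiple: int = 16):
    """Downscale (never upscale) so that width * height <= MAX_PIXELS.

    Rounding each side down to a multiple of `multiple` is an assumption.
    """
    scale = min(1.0, math.sqrt(MAX_PIXELS / (width * height)))
    new_w = max(multiple, int(width * scale) // multiple * multiple)
    new_h = max(multiple, int(height * scale) // multiple * multiple)
    return new_w, new_h


def drop_conditions(rng: random.Random):
    """Return (keep_ref, keep_control_video), each dropped independently (assumed)."""
    keep_ref = rng.random() >= P_DROP_REF
    keep_control_video = rng.random() >= P_DROP_CONTROL_VIDEO
    return keep_ref, keep_control_video
```

Under these assumptions, fit_to_max_pixels(1920, 1080) returns (832, 464), and a clip that is already 832x480 is left unchanged. Dropping each condition with a small probability during training is what allows classifier-free guidance over that condition at inference time.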

Trained on the maxwelljones14/refVFX_dataset dataset, which combines code-based edits, neural V2V edits, and I2V LoRA effects, sampled as triplets.

Usage

Load the weights into a Wan2.1-FLF2V pipeline and inject them as a LoRA on the DiT (target modules above, remove_prefix_in_ckpt="pipe.dit."). See infer_refvfx.py in the refVFX trainer repo for a reference implementation.
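
The remove_prefix_in_ckpt="pipe.dit." setting suggests that the keys stored in epoch-0.safetensors carry a pipe.dit. prefix that must be removed before they line up with the DiT's module names. The sketch below shows that key remapping in plain Python, using a dictionary in place of the loaded state dict. The helper name strip_prefix and the example keys are hypothetical, and the trainer's own loader may behave differently.

```python
def strip_prefix(state_dict: dict, prefix: str = "pipe.dit.") -> dict:
    """Return a copy of state_dict with `prefix` removed from keys that start with it."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


# Hypothetical keys, for illustration only; real checkpoint keys may differ.
example = {
    "pipe.dit.blocks.0.self_attn.q.lora_A.weight": "tensor-A",
    "pipe.dit.blocks.0.self_attn.q.lora_B.weight": "tensor-B",
}
cleaned = strip_prefix(example)
print(sorted(cleaned))
```

With a real checkpoint, the dictionary would come from a loader such as safetensors.torch.load_file("epoch-0.safetensors"). For the full injection logic, infer_refvfx.py remains the reference.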

License

Inherits the base-model license from Wan-AI/Wan2.1-FLF2V-14B-720P. Use is subject to its terms.
