	---
	license: other
	license_name: wan-community
	base_model: Wan-AI/Wan2.1-FLF2V-14B-720P
	library_name: diffusers
	tags:
	- lora
	- video
	- image-to-video
	- wan
	- refvfx
	---

	# refVFX LoRA

	LoRA adapter for [`Wan-AI/Wan2.1-FLF2V-14B-720P`](https://huggingface.co/Wan-AI/Wan2.1-FLF2V-14B-720P) trained to transfer a temporal visual effect from a reference video onto a separate input image or video.

	## Files

	| File | Description |
	| --- | --- |
	| `epoch-0.safetensors` | LoRA model. |

	## Training

	- Base model: `Wan-AI/Wan2.1-FLF2V-14B-720P`
	- LoRA rank: 1024
	- Target modules: `q, k, v, o, ffn.0, ffn.2` (applied to the DiT)
	- Learning rate: 4e-5, 200-step linear warmup
	- Frames per clip: 33
	- Max pixels: 399,360 (equal to 480 × 832)
	- Optimizer parallelism: DeepSpeed ZeRO-1, 8 ranks
	- CFG dropout: `p_drop_ref = 0.05`, `p_drop_control_video = 0.05`
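
The learning-rate and dropout settings above can be sketched as follows. This is a minimal illustration, not the training code: the function names `warmup_lr` and `drop_conditions` are hypothetical, the warmup is assumed to ramp from 0 at step 0 to the peak at step 200 and hold constant afterwards (the card does not state the post-warmup schedule), and the two dropout draws are assumed to be independent per sample.

```python
import random

# Values taken from the training list above.
PEAK_LR = 4e-5
WARMUP_STEPS = 200
P_DROP_REF = 0.05
P_DROP_CONTROL_VIDEO = 0.05


def warmup_lr(step: int, peak_lr: float = PEAK_LR, warmup_steps: int = WARMUP_STEPS) -> float:
    """Linear warmup from 0 to peak_lr over warmup_steps, then constant (assumed)."""
    if step >= warmup_steps:
        return peak_lr
    return peak_lr * step / warmup_steps


def drop_conditions(rng: random.Random) -> dict:
    """Decide, for one training sample, which conditions to drop.

    Each condition is dropped independently with its own probability
    (independence is an assumption, not stated in the card).
    """
    return {
        "drop_ref": rng.random() < P_DROP_REF,
        "drop_control_video": rng.random() < P_DROP_CONTROL_VIDEO,
    }


if __name__ == "__main__":
    rng = random.Random(0)
    for step in (0, 50, 100, 200, 1000):
        print(step, warmup_lr(step))
    print(drop_conditions(rng))
```

Dropping a condition for a small fraction of training samples is what lets classifier-free guidance be applied to that condition at inference time.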

	Trained on [`maxwelljones14/refVFX_dataset`](https://huggingface.co/datasets/maxwelljones14/refVFX_dataset), which combines code-based edits, neural video-to-video (V2V) edits, and image-to-video (I2V) LoRA effects, sampled as triplets.

	## Usage

	Load the weights into a Wan2.1-FLF2V pipeline and inject them as a LoRA on the DiT, using the target modules listed under Training and `remove_prefix_in_ckpt="pipe.dit."`. See `infer_refvfx.py` in the [refVFX trainer repo](https://github.com/) for a reference implementation.
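
A minimal sketch of the key-renaming step, assuming that `remove_prefix_in_ckpt="pipe.dit."` means this prefix is stripped from checkpoint key names so that they line up with the DiT's own parameter names. The helper `strip_prefix` and the example key names are hypothetical; the actual loading and injection logic is in `infer_refvfx.py`. Plain strings stand in for tensors here; in practice the dict could come from `safetensors.torch.load_file("epoch-0.safetensors")`.

```python
def strip_prefix(state_dict: dict, prefix: str = "pipe.dit.") -> dict:
    """Return a copy of state_dict with `prefix` removed from keys that start with it.

    Keys without the prefix are kept unchanged.
    """
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


# Hypothetical key names, for illustration only.
example = {
    "pipe.dit.blocks.0.self_attn.q.lora_A.weight": "tensor-A",
    "pipe.dit.blocks.0.self_attn.q.lora_B.weight": "tensor-B",
}
renamed = strip_prefix(example)
print(sorted(renamed))
```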

	## License

	Inherits the base-model license from `Wan-AI/Wan2.1-FLF2V-14B-720P`. Use is subject to its terms.