
	---
	license: other
	tags:
	- image-to-video
	---

	HunyuanVideo-1.5 uses attention masks with variable-length sequences. For best performance, we recommend an attention backend that handles padding efficiently.

	We recommend installing [kernels](https://github.com/huggingface/kernels) (`pip install kernels`) to access prebuilt attention kernels.

	You can check our [documentation](https://huggingface.co/docs/diffusers/main/en/optimization/attention_backends) to learn more about all the different attention backends we support.


	```py
	import torch

	dtype = torch.bfloat16
	device = "cuda:0"
	from diffusers import HunyuanVideo15ImageToVideoPipeline, attention_backend
	from diffusers.utils import export_to_video, load_image

	pipe = HunyuanVideo15ImageToVideoPipeline.from_pretrained("hunyuanvideo-community/HunyuanVideo-1.5-Diffusers-480p_i2v", torch_dtype=dtype)
	pipe.enable_model_cpu_offload()
	pipe.vae.enable_tiling()

	generator = torch.Generator(device=device).manual_seed(1)
	image = load_image("https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/wan_i2v_input.JPG")
	prompt="Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
	# Run the pipeline under the selected attention backend.
	with attention_backend("_flash_3_hub"):  # or `"flash_hub"` if you are not using H100/H800
	    video = pipe(
	        prompt=prompt,
	        image=image,
	        generator=generator,
	        num_frames=121,
	        num_inference_steps=50,
	    ).frames[0]
	export_to_video(video, "output.mp4", fps=24)
	```
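
The comment in the example above says to use `"_flash_3_hub"` on H100/H800 and `"flash_hub"` otherwise. As a minimal sketch, that choice could be automated from the GPU's compute capability. The helper names `pick_attention_backend` and `detect_backend` are hypothetical (not part of diffusers), and the mapping assumes that H100/H800 report a CUDA compute capability major version of 9.

```python
def pick_attention_backend(compute_capability_major: int) -> str:
    """Map a CUDA compute capability major version to a hub backend name."""
    # Assumption: H100/H800 (Hopper) report major version 9.
    # Every other GPU falls back to "flash_hub", as the example's comment suggests.
    if compute_capability_major == 9:
        return "_flash_3_hub"
    return "flash_hub"


def detect_backend(device_index: int = 0) -> str:
    """Query the local GPU through PyTorch and return a backend name."""
    # Imported lazily so pick_attention_backend stays usable without PyTorch.
    import torch

    major, _minor = torch.cuda.get_device_capability(device_index)
    return pick_attention_backend(major)
```

With these helpers, the context manager in the example could be written as `with attention_backend(detect_backend()):`.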

	To use the default attention backend instead:

	```py
	import torch

	dtype = torch.bfloat16
	device = "cuda:0"
	from diffusers import HunyuanVideo15ImageToVideoPipeline
	from diffusers.utils import export_to_video, load_image

	pipe = HunyuanVideo15ImageToVideoPipeline.from_pretrained("hunyuanvideo-community/HunyuanVideo-1.5-Diffusers-480p_i2v", torch_dtype=dtype)
	pipe.enable_model_cpu_offload()
	pipe.vae.enable_tiling()

	generator = torch.Generator(device=device).manual_seed(1)
	image = load_image("https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/wan_i2v_input.JPG")
	prompt="Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."

	video = pipe(
	    prompt=prompt,
	    image=image,
	    generator=generator,
	    num_frames=121,
	    num_inference_steps=50,
	).frames[0]
	export_to_video(video, "output.mp4", fps=24)
	```