Upload folder using huggingface_hub

ac2243f verified about 1 month ago

5.02 kB

	<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
	#
	# Licensed under the Apache License, Version 2.0 (the "License");
	# you may not use this file except in compliance with the License.
	# You may obtain a copy of the License at
	#
	# http://www.apache.org/licenses/LICENSE-2.0
	#
	# Unless required by applicable law or agreed to in writing, software
	# distributed under the License is distributed on an "AS IS" BASIS,
	# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
	# See the License for the specific language governing permissions and
	# limitations under the License. -->

	# HunyuanImage2.1


	HunyuanImage-2.1 is a 17B text-to-image model that is capable of generating 2K (2048 x 2048) resolution images

	HunyuanImage-2.1 comes in the following variants:

	\| model type \| model id \|
	\|:----------:\|:--------:\|
	\| HunyuanImage-2.1 \| [hunyuanvideo-community/HunyuanImage-2.1-Diffusers](https://huggingface.co/hunyuanvideo-community/HunyuanImage-2.1-Diffusers) \|
	\| HunyuanImage-2.1-Distilled \| [hunyuanvideo-community/HunyuanImage-2.1-Distilled-Diffusers](https://huggingface.co/hunyuanvideo-community/HunyuanImage-2.1-Distilled-Diffusers) \|
	\| HunyuanImage-2.1-Refiner \| [hunyuanvideo-community/HunyuanImage-2.1-Refiner-Diffusers](https://huggingface.co/hunyuanvideo-community/HunyuanImage-2.1-Refiner-Diffusers) \|

	> [!TIP]
	> [Caching](../../optimization/cache) may also speed up inference by storing and reusing intermediate outputs.

	## HunyuanImage-2.1

	HunyuanImage-2.1 applies [Adaptive Projected Guidance (APG)](https://huggingface.co/papers/2410.02416) combined with Classifier-Free Guidance (CFG) in the denoising loop. `HunyuanImagePipeline` has a `guider` component (read more about [Guider](../modular_diffusers/guiders.md)) and does not take a `guidance_scale` parameter at runtime. To change guider-related parameters, e.g., `guidance_scale`, you can update the `guider` configuration instead.

	```python
	import torch
	from diffusers import HunyuanImagePipeline

	pipe = HunyuanImagePipeline.from_pretrained(
	"hunyuanvideo-community/HunyuanImage-2.1-Diffusers",
	torch_dtype=torch.bfloat16
	)
	pipe = pipe.to("cuda")
	```

	You can inspect the `guider` object:

	```py
	>>> pipe.guider
	AdaptiveProjectedMixGuidance {
	"_class_name": "AdaptiveProjectedMixGuidance",
	"_diffusers_version": "0.36.0.dev0",
	"adaptive_projected_guidance_momentum": -0.5,
	"adaptive_projected_guidance_rescale": 10.0,
	"adaptive_projected_guidance_scale": 10.0,
	"adaptive_projected_guidance_start_step": 5,
	"enabled": true,
	"eta": 0.0,
	"guidance_rescale": 0.0,
	"guidance_scale": 3.5,
	"start": 0.0,
	"stop": 1.0,
	"use_original_formulation": false
	}

	State:
	step: None
	num_inference_steps: None
	timestep: None
	count_prepared: 0
	enabled: True
	num_conditions: 2
	momentum_buffer: None
	is_apg_enabled: False
	is_cfg_enabled: True
	```

	To update the guider with a different configuration, use the `new()` method. For example, to generate an image with `guidance_scale=5.0` while keeping all other default guidance parameters:

	```py
	import torch
	from diffusers import HunyuanImagePipeline

	pipe = HunyuanImagePipeline.from_pretrained(
	"hunyuanvideo-community/HunyuanImage-2.1-Diffusers",
	torch_dtype=torch.bfloat16
	)
	pipe = pipe.to("cuda")

	# Update the guider configuration
	pipe.guider = pipe.guider.new(guidance_scale=5.0)

	prompt = (
	"A cute, cartoon-style anthropomorphic penguin plush toy with fluffy fur, standing in a painting studio, "
	"wearing a red knitted scarf and a red beret with the word 'Tencent' on it, holding a paintbrush with a "
	"focused expression as it paints an oil painting of the Mona Lisa, rendered in a photorealistic photographic style."
	)

	image = pipe(
	prompt=prompt,
	num_inference_steps=50,
	height=2048,
	width=2048,
	).images[0]
	image.save("image.png")
	```


	## HunyuanImage-2.1-Distilled

	use `distilled_guidance_scale` with the guidance-distilled checkpoint,

	```py
	import torch
	from diffusers import HunyuanImagePipeline
	pipe = HunyuanImagePipeline.from_pretrained("hunyuanvideo-community/HunyuanImage-2.1-Distilled-Diffusers", torch_dtype=torch.bfloat16)
	pipe = pipe.to("cuda")

	prompt = (
	"A cute, cartoon-style anthropomorphic penguin plush toy with fluffy fur, standing in a painting studio, "
	"wearing a red knitted scarf and a red beret with the word 'Tencent' on it, holding a paintbrush with a "
	"focused expression as it paints an oil painting of the Mona Lisa, rendered in a photorealistic photographic style."
	)

	out = pipe(
	prompt,
	num_inference_steps=8,
	distilled_guidance_scale=3.25,
	height=2048,
	width=2048,
	generator=generator,
	).images[0]

	```


	## HunyuanImagePipeline

	[[autodoc]] HunyuanImagePipeline
	- all
	- __call__

	## HunyuanImageRefinerPipeline

	[[autodoc]] HunyuanImageRefinerPipeline
	- all
	- __call__


	## HunyuanImagePipelineOutput

	[[autodoc]] pipelines.hunyuan_image.pipeline_output.HunyuanImagePipelineOutput