ZImageTransformer2DModel

A Transformer model for image-like data from Z-Image.

ZImageTransformer2DModel[[diffusers.ZImageTransformer2DModel]]

diffusers.ZImageTransformer2DModel[[diffusers.ZImageTransformer2DModel]]

forward[[diffusers.ZImageTransformer2DModel.forward]]

Source: https://github.com/huggingface/diffusers/blob/vr_11739/src/diffusers/models/transformers/transformer_z_image.py#L888

Parameters:

- x (`Union[List[torch.Tensor], List[List[torch.Tensor]]]`)
- t (no type annotation)
- cap_feats (`Union[List[torch.Tensor], List[List[torch.Tensor]]]`)
- return_dict (`bool`, defaults to `True`)
- controlnet_block_samples (`Optional[Dict[int, torch.Tensor]]`, defaults to `None`)
- siglip_feats (`Optional[List[List[torch.Tensor]]]`, defaults to `None`)
- image_noise_mask (`Optional[List[List[int]]]`, defaults to `None`)
- patch_size (`int`, defaults to `2`)
- f_patch_size (`int`, defaults to `1`)

Flow: patchify -> t_embed -> x_embed -> x_refine -> cap_embed -> cap_refine -> [siglip_embed -> siglip_refine] -> build_unified -> main_layers -> final_layer -> unpatchify
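
The stage order above can be read as a linear pipeline in which the bracketed SigLIP stages are optional. The sketch below models only that ordering. It assumes the brackets mean "run only when `siglip_feats` is provided", which the flow description does not state explicitly, and `forward_stages` is a hypothetical helper, not part of diffusers.

```python
from typing import List, Optional, Sequence

# Stage names copied from the flow description; the grouping is an assumption.
BASE_STAGES = ["patchify", "t_embed", "x_embed", "x_refine", "cap_embed", "cap_refine"]
SIGLIP_STAGES = ["siglip_embed", "siglip_refine"]  # bracketed (optional) in the flow
TAIL_STAGES = ["build_unified", "main_layers", "final_layer", "unpatchify"]


def forward_stages(siglip_feats: Optional[Sequence] = None) -> List[str]:
    """Hypothetical helper: list the stages that would run, in order."""
    stages = list(BASE_STAGES)
    if siglip_feats is not None:  # assumed trigger for the optional stages
        stages += SIGLIP_STAGES
    return stages + TAIL_STAGES


print(" -> ".join(forward_stages()))
```

Calling `forward_stages()` with no argument lists the ten unbracketed stages; passing any non-`None` value inserts the two SigLIP stages between `cap_refine` and `build_unified`.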

patchify_and_embed[[diffusers.ZImageTransformer2DModel.patchify_and_embed]]


Patchify for basic mode: single image per batch item.
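
To make the effect of `patch_size` and `f_patch_size` concrete, here is a minimal shape calculation. It assumes a ViT-style patchify over a latent laid out as (channels, frames, height, width), which this page does not confirm; `patch_token_shape` is a hypothetical helper, and only the default values `patch_size=2` and `f_patch_size=1` come from the `forward` signature.

```python
from typing import Tuple


def patch_token_shape(channels: int, frames: int, height: int, width: int,
                      patch_size: int = 2, f_patch_size: int = 1) -> Tuple[int, int]:
    """Return (num_tokens, token_dim) for one (C, F, H, W) latent.

    Assumed layout: each token flattens an f_patch_size x patch_size x patch_size
    block across all channels.
    """
    if frames % f_patch_size or height % patch_size or width % patch_size:
        raise ValueError("latent dimensions must be divisible by the patch sizes")
    num_tokens = (frames // f_patch_size) * (height // patch_size) * (width // patch_size)
    token_dim = f_patch_size * patch_size * patch_size * channels
    return num_tokens, token_dim


print(patch_token_shape(16, 1, 64, 64))  # (1024, 64)
```

Under these assumptions, a 16-channel 64x64 single-frame latent becomes 1024 tokens of 64 values each.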

patchify_and_embed_omni[[diffusers.ZImageTransformer2DModel.patchify_and_embed_omni]]


Patchify for omni mode: multiple images per batch item with noise masks.
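
One way to picture the per-image noise masks is as flags that get repeated for every token of the corresponding image. The sketch below is illustrative only: it assumes `1` marks an image that carries noise and `0` marks a clean reference image, and that the flag is expanded per token. Neither assumption is stated on this page, and `token_noise_mask` is a hypothetical helper.

```python
from typing import List


def token_noise_mask(image_noise_mask: List[int], tokens_per_image: List[int]) -> List[int]:
    """Repeat each image's flag once per token of that image (assumed behavior)."""
    if len(image_noise_mask) != len(tokens_per_image):
        raise ValueError("need one flag and one token count per image")
    mask: List[int] = []
    for flag, n_tokens in zip(image_noise_mask, tokens_per_image):
        mask.extend([flag] * n_tokens)
    return mask


# One batch item holding a 2-token reference image and a 3-token target image.
print(token_noise_mask([0, 1], [2, 3]))  # [0, 0, 1, 1, 1]
```

The inner list of `image_noise_mask` (`List[List[int]]` in the `forward` signature) would correspond to the first argument here, one such list per batch item.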
