# BriaTransformer2DModel

A modified Flux Transformer model from [Bria](https://huggingface.co/briaai/BRIA-3.2).

## BriaTransformer2DModel

#### diffusers.BriaTransformer2DModel
[Source](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/transformers/transformer_bria.py#L506)

The Transformer model introduced in Flux. Based on FluxPipeline, with several changes:

- no pooled embeddings
- zero padding is used for prompts
- no guidance embedding, since this is not a distilled version

Reference: https://blackforestlabs.ai/announcing-black-forest-labs/

**Parameters:**

- **patch_size** (`int`) -- Patch size to turn the input data into small patches.
- **in_channels** (`int`, *optional*, defaults to 16) -- The number of channels in the input.
- **num_layers** (`int`, *optional*, defaults to 18) -- The number of layers of MMDiT blocks to use.
- **num_single_layers** (`int`, *optional*, defaults to 18) -- The number of layers of single DiT blocks to use.
- **attention_head_dim** (`int`, *optional*, defaults to 64) -- The number of channels in each attention head.
- **num_attention_heads** (`int`, *optional*, defaults to 18) -- The number of heads to use for multi-head attention.
- **joint_attention_dim** (`int`, *optional*) -- The number of `encoder_hidden_states` dimensions to use.
- **pooled_projection_dim** (`int`) -- The number of dimensions to use when projecting the `pooled_projections`.
- **guidance_embeds** (`bool`, defaults to `False`) -- Whether to use guidance embeddings.
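A minimal loading sketch (not part of the original reference): assuming the BRIA-3.2 repository follows the usual diffusers layout with the transformer weights in a `transformer` subfolder, the model can be loaded on its own with `from_pretrained`. The subfolder name and the `bfloat16` dtype are assumptions, not taken from this page.

```python
import torch
from diffusers import BriaTransformer2DModel

# Load only the transformer from the BRIA-3.2 checkpoint.
# Assumptions: the weights live in a "transformer" subfolder (standard diffusers
# repository layout) and bfloat16 is a suitable dtype for your hardware.
transformer = BriaTransformer2DModel.from_pretrained(
    "briaai/BRIA-3.2",
    subfolder="transformer",
    torch_dtype=torch.bfloat16,
)

# The constructor arguments documented above are recorded in the model config.
print(transformer.config.num_layers, transformer.config.num_attention_heads)
print(f"{transformer.num_parameters() / 1e9:.2f}B parameters")
```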
#### forward

[Source](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/transformers/transformer_bria.py#L584)

The [BriaTransformer2DModel](/docs/diffusers/main/en/api/models/bria_transformer#diffusers.BriaTransformer2DModel) forward method.

`forward(hidden_states: Tensor, encoder_hidden_states: Tensor = None, pooled_projections: Tensor = None, timestep: LongTensor = None, img_ids: Tensor = None, txt_ids: Tensor = None, guidance: Tensor = None, attention_kwargs: dict[str, typing.Any] | None = None, return_dict: bool = True, controlnet_block_samples = None, controlnet_single_block_samples = None)`

**Parameters:**

- **hidden_states** (`torch.FloatTensor` of shape `(batch_size, channel, height, width)`) -- Input `hidden_states`.
- **encoder_hidden_states** (`torch.FloatTensor` of shape `(batch_size, sequence_len, embed_dims)`) -- Conditional embeddings (embeddings computed from the input conditions such as prompts) to use.
- **pooled_projections** (`torch.FloatTensor` of shape `(batch_size, projection_dim)`) -- Embeddings projected from the embeddings of the input conditions.
- **timestep** (`torch.LongTensor`) -- Used to indicate the denoising step.
- **block_controlnet_hidden_states** (`list` of `torch.Tensor`) -- A list of tensors that, if specified, are added to the residuals of the transformer blocks.
- **attention_kwargs** (`dict`, *optional*) -- A kwargs dictionary that, if specified, is passed along to the `AttentionProcessor` as defined under `self.processor` in [diffusers.models.attention_processor](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/attention_processor.py).
- **return_dict** (`bool`, *optional*, defaults to `True`) -- Whether or not to return a `~models.transformer_2d.Transformer2DModelOutput` instead of a plain tuple.

**Returns:**

If `return_dict` is `True`, a `~models.transformer_2d.Transformer2DModelOutput` is returned, otherwise a `tuple` where the first element is the sample tensor.
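In practice the forward pass is driven by the accompanying text-to-image pipeline, which packs the latents, builds `img_ids`/`txt_ids`, and schedules the timesteps before calling the transformer. A hedged end-to-end sketch follows; the `BriaPipeline` class name, the prompt, and the generation settings are illustrative assumptions rather than content from this page, so check the Bria pipeline documentation for the exact API.

```python
import torch
from diffusers import BriaPipeline, BriaTransformer2DModel

# Hedged sketch: load the transformer explicitly, then hand it to the pipeline.
# "BriaPipeline" and the settings below are assumptions; the pipeline is what
# prepares hidden_states, img_ids, and txt_ids and calls the forward() above.
transformer = BriaTransformer2DModel.from_pretrained(
    "briaai/BRIA-3.2", subfolder="transformer", torch_dtype=torch.bfloat16
)
pipe = BriaPipeline.from_pretrained(
    "briaai/BRIA-3.2", transformer=transformer, torch_dtype=torch.bfloat16
).to("cuda")

image = pipe(
    "a photo of a red bicycle leaning against a brick wall",
    num_inference_steps=30,
    generator=torch.Generator("cuda").manual_seed(0),
).images[0]
image.save("bria_sample.png")
```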