# BriaTransformer2DModel

A modified Flux Transformer model from [Bria](https://huggingface.co/briaai/BRIA-3.2).

## BriaTransformer2DModel

#### diffusers.BriaTransformer2DModel
[Source](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/transformers/transformer_bria.py#L506)

The Transformer model introduced in Flux. Based on FluxPipeline, with several changes:

- no pooled embeddings
- zero padding is used for prompts
- no guidance embedding, since this is not a distilled version

Reference: https://blackforestlabs.ai/announcing-black-forest-labs/

**Parameters:**

- **patch_size** (`int`) -- Patch size to turn the input data into small patches.
- **in_channels** (`int`, *optional*, defaults to 16) -- The number of channels in the input.
- **num_layers** (`int`, *optional*, defaults to 18) -- The number of layers of MMDiT blocks to use.
- **num_single_layers** (`int`, *optional*, defaults to 18) -- The number of layers of single DiT blocks to use.
- **attention_head_dim** (`int`, *optional*, defaults to 64) -- The number of channels in each attention head.
- **num_attention_heads** (`int`, *optional*, defaults to 18) -- The number of heads to use for multi-head attention.
- **joint_attention_dim** (`int`, *optional*) -- The number of `encoder_hidden_states` dimensions to use.
- **pooled_projection_dim** (`int`) -- The number of dimensions to use when projecting the `pooled_projections`.
- **guidance_embeds** (`bool`, defaults to `False`) -- Whether to use guidance embeddings.
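A minimal loading sketch (not part of the original reference): assuming the BRIA-3.2 repository follows the usual diffusers layout with the transformer weights in a `transformer` subfolder, the model can be loaded on its own with `from_pretrained`. The subfolder name and the `bfloat16` dtype are assumptions, not taken from this page.

```python
import torch
from diffusers import BriaTransformer2DModel

# Load only the transformer from the BRIA-3.2 checkpoint.
# Assumptions: the weights live in a "transformer" subfolder (standard diffusers
# repository layout) and bfloat16 is a suitable dtype for your hardware.
transformer = BriaTransformer2DModel.from_pretrained(
    "briaai/BRIA-3.2",
    subfolder="transformer",
    torch_dtype=torch.bfloat16,
)

# The constructor arguments documented above are recorded in the model config.
print(transformer.config.num_layers, transformer.config.num_attention_heads)
print(f"{transformer.num_parameters() / 1e9:.2f}B parameters")
```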
#### forward

[Source](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/transformers/transformer_bria.py#L584)

The [BriaTransformer2DModel](/docs/diffusers/main/en/api/models/bria_transformer#diffusers.BriaTransformer2DModel) forward method.

`forward(hidden_states: Tensor, encoder_hidden_states: Tensor = None, pooled_projections: Tensor = None, timestep: LongTensor = None, img_ids: Tensor = None, txt_ids: Tensor = None, guidance: Tensor = None, attention_kwargs: dict[str, typing.Any] | None = None, return_dict: bool = True, controlnet_block_samples = None, controlnet_single_block_samples = None)`

**Parameters:**

- **hidden_states** (`torch.FloatTensor` of shape `(batch_size, channel, height, width)`) -- Input `hidden_states`.
- **encoder_hidden_states** (`torch.FloatTensor` of shape `(batch_size, sequence_len, embed_dims)`) -- Conditional embeddings (embeddings computed from the input conditions such as prompts) to use.
- **pooled_projections** (`torch.FloatTensor` of shape `(batch_size, projection_dim)`) -- Embeddings projected from the embeddings of the input conditions.
- **timestep** (`torch.LongTensor`) -- Used to indicate the denoising step.
- **block_controlnet_hidden_states** (`list` of `torch.Tensor`) -- A list of tensors that, if specified, are added to the residuals of the transformer blocks.
- **attention_kwargs** (`dict`, *optional*) -- A kwargs dictionary that, if specified, is passed along to the `AttentionProcessor` as defined under `self.processor` in [diffusers.models.attention_processor](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/attention_processor.py).
- **return_dict** (`bool`, *optional*, defaults to `True`) -- Whether or not to return a `~models.transformer_2d.Transformer2DModelOutput` instead of a plain tuple.

**Returns:**

If `return_dict` is `True`, a `~models.transformer_2d.Transformer2DModelOutput` is returned, otherwise a `tuple` where the first element is the sample tensor.
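In practice the forward pass is driven by the accompanying text-to-image pipeline, which packs the latents, builds `img_ids`/`txt_ids`, and schedules the timesteps before calling the transformer. A hedged end-to-end sketch follows; the `BriaPipeline` class name, the prompt, and the generation settings are illustrative assumptions rather than content from this page, so check the Bria pipeline documentation for the exact API.

```python
import torch
from diffusers import BriaPipeline, BriaTransformer2DModel

# Hedged sketch: load the transformer explicitly, then hand it to the pipeline.
# "BriaPipeline" and the settings below are assumptions; the pipeline is what
# prepares hidden_states, img_ids, and txt_ids and calls the forward() above.
transformer = BriaTransformer2DModel.from_pretrained(
    "briaai/BRIA-3.2", subfolder="transformer", torch_dtype=torch.bfloat16
)
pipe = BriaPipeline.from_pretrained(
    "briaai/BRIA-3.2", transformer=transformer, torch_dtype=torch.bfloat16
).to("cuda")

image = pipe(
    "a photo of a red bicycle leaning against a brick wall",
    num_inference_steps=30,
    generator=torch.Generator("cuda").manual_seed(0),
).images[0]
image.save("bria_sample.png")
```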