# AutoencoderKLMagvit

The 3D variational autoencoder (VAE) model with KL loss used in [EasyAnimate](https://github.com/aigc-apps/EasyAnimate) was introduced by Alibaba PAI.

The model can be loaded with the following code snippet.
```python
import torch

from diffusers import AutoencoderKLMagvit

vae = AutoencoderKLMagvit.from_pretrained("alibaba-pai/EasyAnimateV5.1-12b-zh", subfolder="vae", torch_dtype=torch.float16).to("cuda")
```
## AutoencoderKLMagvit[[diffusers.AutoencoderKLMagvit]]

#### diffusers.AutoencoderKLMagvit[[diffusers.AutoencoderKLMagvit]]

[Source](https://github.com/huggingface/diffusers/blob/vr_12652/src/diffusers/models/autoencoders/autoencoder_kl_magvit.py#L665)

A VAE model with KL loss for encoding images into latents and decoding latent representations into images. This model is used in [EasyAnimate](https://huggingface.co/papers/2405.18991).
This model inherits from [ModelMixin](/docs/diffusers/pr_12652/en/api/models/overview#diffusers.ModelMixin). Check the superclass documentation for its generic methods implemented for all models (such as downloading or saving).
#### decode[[diffusers.AutoencoderKLMagvit.decode]]

[Source](https://github.com/huggingface/diffusers/blob/vr_12652/src/diffusers/utils/accelerate_utils.py#L43)

#### encode[[diffusers.AutoencoderKLMagvit.encode]]

[Source](https://github.com/huggingface/diffusers/blob/vr_12652/src/diffusers/utils/accelerate_utils.py#L43)
#### enable_tiling[[diffusers.AutoencoderKLMagvit.enable_tiling]]

[Source](https://github.com/huggingface/diffusers/blob/vr_12652/src/diffusers/models/autoencoders/autoencoder_kl_magvit.py#L771)

Enable tiled VAE decoding. When this option is enabled, the VAE will split the input tensor into tiles to compute decoding and encoding in several steps. This is useful for saving a large amount of memory and for processing larger images.

**Parameters:**

- tile_sample_min_height (`int`, *optional*) : The minimum height required for a sample to be separated into tiles across the height dimension.
- tile_sample_min_width (`int`, *optional*) : The minimum width required for a sample to be separated into tiles across the width dimension.
- tile_sample_stride_height (`int`, *optional*) : The minimum amount of overlap between two consecutive vertical tiles. This is to ensure that there are no tiling artifacts produced across the height dimension.
- tile_sample_stride_width (`int`, *optional*) : The stride between two consecutive horizontal tiles. This is to ensure that there are no tiling artifacts produced across the width dimension.
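To make the overlap/stride relationship concrete, here is a minimal, purely illustrative sketch of the idea behind tiled decoding. It is not diffusers' internal implementation; `split_into_tiles` is a hypothetical helper that only demonstrates how overlapping tiles are cut from a latent grid.

```python
import torch

def split_into_tiles(x, tile_size, stride):
    """Split a (B, C, H, W) tensor into overlapping spatial tiles.

    Consecutive tiles overlap by (tile_size - stride) pixels; that overlap
    is what lets tiled VAE decoding blend tile borders and avoid seams,
    while each tile is decoded separately to keep peak memory low.
    """
    _, _, height, width = x.shape
    tiles = []
    for i in range(0, height, stride):
        for j in range(0, width, stride):
            tiles.append(x[:, :, i:i + tile_size, j:j + tile_size])
    return tiles

latents = torch.randn(1, 4, 8, 8)
tiles = split_into_tiles(latents, tile_size=4, stride=2)
print(len(tiles))      # 16 tiles for an 8x8 grid with stride 2
print(tiles[0].shape)  # torch.Size([1, 4, 4, 4])
```

In the real model you would simply call `vae.enable_tiling()` before decoding and let the VAE handle the splitting and blending internally.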
#### forward[[diffusers.AutoencoderKLMagvit.forward]]

[Source](https://github.com/huggingface/diffusers/blob/vr_12652/src/diffusers/models/autoencoders/autoencoder_kl_magvit.py#L1046)

**Parameters:**

- sample (`torch.Tensor`) : Input sample.
- sample_posterior (`bool`, *optional*, defaults to `False`) : Whether to sample from the posterior.
- return_dict (`bool`, *optional*, defaults to `True`) : Whether or not to return a `DecoderOutput` instead of a plain tuple.
## AutoencoderKLOutput[[diffusers.models.modeling_outputs.AutoencoderKLOutput]]

#### diffusers.models.modeling_outputs.AutoencoderKLOutput[[diffusers.models.modeling_outputs.AutoencoderKLOutput]]

[Source](https://github.com/huggingface/diffusers/blob/vr_12652/src/diffusers/models/modeling_outputs.py#L7)

Output of AutoencoderKL encoding method.

**Parameters:**

- latent_dist (`DiagonalGaussianDistribution`) : Encoded outputs of `Encoder` represented as the mean and logvar of `DiagonalGaussianDistribution`. `DiagonalGaussianDistribution` allows for sampling latents from the distribution.
## DecoderOutput[[diffusers.models.autoencoders.vae.DecoderOutput]]

#### diffusers.models.autoencoders.vae.DecoderOutput[[diffusers.models.autoencoders.vae.DecoderOutput]]

[Source](https://github.com/huggingface/diffusers/blob/vr_12652/src/diffusers/models/autoencoders/vae.py#L46)

Output of decoding method.

**Parameters:**

- sample (`torch.Tensor` of shape `(batch_size, num_channels, height, width)`) : The decoded output sample from the last layer of the model.