---
library_name: diffusers
license: apache-2.0
datasets:
- laion/relaion400m
base_model:
- black-forest-labs/FLUX.2-dev
tags:
- tae
- taef2
---

# About

Tiny AutoEncoder trained on the latent space of [black-forest-labs/FLUX.2-dev](https://huggingface.co/black-forest-labs/FLUX.2-dev)'s autoencoder. It converts between latent and image space up to 20x faster and with 28x fewer parameters than the full autoencoder, at the cost of a small amount of quality.

Code for this model is available [here](https://huggingface.co/fal/FLUX.2-Tiny-AutoEncoder/blob/main/flux2_tiny_autoencoder.py).

# Round-Trip Comparisons

| Source | Image |
| ------ | ----- |
| https://www.pexels.com/photo/mirror-lying-on-open-book-11495792/ |  |
| https://www.pexels.com/photo/brown-hummingbird-selective-focus-photography-1133957/ |  |
| https://www.pexels.com/photo/person-with-body-painting-1209843/ |  |

# Example Usage

```py
import torch
import torchvision.transforms.functional as F

from PIL import Image
from flux2_tiny_autoencoder import Flux2TinyAutoEncoder

device = torch.device("cuda")
tiny_vae = Flux2TinyAutoEncoder.from_pretrained(
    "fal/FLUX.2-Tiny-AutoEncoder",
).to(device=device, dtype=torch.bfloat16)

# Load an image and map it from [0, 1] to the [-1, 1] range the model expects.
pil_image = Image.open("/path/to/image.png")
image_tensor = F.to_tensor(pil_image)
image_tensor = image_tensor.unsqueeze(0) * 2.0 - 1.0
image_tensor = image_tensor.to(device, dtype=tiny_vae.dtype)

with torch.inference_mode():
    latents = tiny_vae.encode(image_tensor, return_dict=False)
    recon = tiny_vae.decode(latents, return_dict=False)
    # Map the reconstruction back from [-1, 1] to [0, 1].
    recon = recon.squeeze(0).clamp(-1, 1) / 2.0 + 0.5
    recon = recon.float().detach().cpu()

recon_image = F.to_pil_image(recon)
recon_image.save("reconstituted.png")
```
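
The pixel-range conversions above are easy to get backwards, so here is a minimal standalone sketch (plain Python, no model or GPU needed) of the [0, 1] to [-1, 1] mapping applied before encoding and its inverse applied after decoding:

```python
# Forward mapping applied before encoding: [0, 1] -> [-1, 1].
def to_model_range(x: float) -> float:
    return x * 2.0 - 1.0

# Inverse mapping applied after decoding: [-1, 1] -> [0, 1].
# (The full example also clamps to [-1, 1] first to guard against overshoot.)
def to_image_range(x: float) -> float:
    return x / 2.0 + 0.5

pixels = [0.0, 0.25, 0.5, 1.0]
round_trip = [to_image_range(to_model_range(v)) for v in pixels]
print(round_trip)  # [0.0, 0.25, 0.5, 1.0]: values recovered exactly
```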

## Use with Diffusers 🧨

```py
import torch
from diffusers import AutoModel, Flux2Pipeline

device = torch.device("cuda")
tiny_vae = AutoModel.from_pretrained(
    "fal/FLUX.2-Tiny-AutoEncoder", trust_remote_code=True, torch_dtype=torch.bfloat16
).to(device)

pipe = Flux2Pipeline.from_pretrained(
    "black-forest-labs/FLUX.2-dev", vae=tiny_vae, torch_dtype=torch.bfloat16
).to(device)

# Generate as usual; decoding now runs through the tiny autoencoder.
image = pipe(prompt="a photo of a hummingbird in flight").images[0]
image.save("hummingbird.png")
```