Link to a paper/documentation requested
#5
by sahdeV-Vedhas - opened
Is there a paper I can look at that describes in detail the exact architecture and training procedure summarized for this particular model?
Specifically, this repo states that "Calling this model a 'VAE' is sort of a misnomer - it's an encoder with some very slight KL regularization, and a conditional GAN decoder". Is this true?
Kindly point us to documentation of the architecture and training process. A link would greatly help us learn the details, and would also let us cite the source properly when we use this information in a publication, blog post, video tutorial, etc. :) Thank you!