Reload the model each time?
#20
by Steph83 - opened
Hello! I use ComfyUI. I have Total VRAM 16376 MB, total RAM 32581 MB
pytorch version: 2.3.0+cu121
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 4070 Ti SUPER : cudaMallocAsync
Using pytorch cross attention
Even with this much memory, I lose time on every Prompt Queue, because each time I get:
“Requested to load AutoencoderKL
Loading 1 new model”
Any help?