How to use black-forest-labs/FLUX.1-Kontext-dev with Diffusers:
pip install -U diffusers transformers accelerate
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import load_image

# switch to "mps" for apple devices
pipe = DiffusionPipeline.from_pretrained("black-forest-labs/FLUX.1-Kontext-dev", dtype=torch.bfloat16, device_map="cuda")

prompt = "Turn this cat into a dog"
input_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png")

image = pipe(image=input_image, prompt=prompt).images[0]
It wouldn't fit on my 4090, so I made it use a 4-bit quant:
https://gist.github.com/fancellu/2ad4038eb8e871a70ff70f59b7bb4567
It uses this nice quant: https://huggingface.co/HighCWu/FLUX.1-Kontext-dev-bnb-hqq-4bit
Takes about 40 seconds.
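For context on why a 4-bit quant helps on a 24 GB card, here is a rough back-of-the-envelope estimate of weight memory. The parameter count (about 12 billion for the transformer) is an approximate assumption that is not stated in this thread, and the estimate ignores quantization metadata, the text encoders, the VAE, and activations, so treat it as a sketch rather than a measurement.

```python
GIB = 1024 ** 3  # bytes per GiB


def weight_gib(num_params: float, bits_per_param: float) -> float:
    """Approximate size of the raw weights in GiB (no scales, zero-points or activations)."""
    return num_params * bits_per_param / 8 / GIB


# Assumption (not from the thread): the transformer has roughly 12 billion parameters.
TRANSFORMER_PARAMS = 12e9

bf16_gib = weight_gib(TRANSFORMER_PARAMS, 16)  # bfloat16 stores 16 bits per weight
int4_gib = weight_gib(TRANSFORMER_PARAMS, 4)   # a 4-bit quant stores about 4 bits per weight

print(f"bf16 transformer weights:  ~{bf16_gib:.1f} GiB")
print(f"4-bit transformer weights: ~{int4_gib:.1f} GiB")
```

Under that assumption the bf16 transformer alone already takes most of a 24 GB card before anything else is loaded, while the 4-bit version leaves room for the rest of the pipeline.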
Would you like to contribute HQQ to Diffusers? <3
What do you mean? This is HQQ; it's not mine.
https://github.com/mobiusml/hqq/
All I did was plop in a quantized model so that it could run on my 4090 (takes about 40 seconds).
I am aware of HQQ. I was asking if you'd be interested in contributing this as a quantization backend to Diffusers. We have a few available:
https://huggingface.co/docs/diffusers/main/en/quantization/overview
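As an illustration of what one of the existing backends looks like in use, here is a minimal sketch that loads the transformer in 4-bit through the bitsandbytes backend. This is not HQQ and not the approach from the gist above. It assumes a recent Diffusers release that ships FluxKontextPipeline, the bitsandbytes package, and a CUDA GPU. The helper name load_kontext_4bit is hypothetical.

```python
def load_kontext_4bit(model_id: str = "black-forest-labs/FLUX.1-Kontext-dev"):
    """Sketch: build a Kontext pipeline whose transformer is quantized to 4-bit NF4.

    Imports are deferred so that defining this helper does not require
    torch, diffusers or bitsandbytes to be installed.
    """
    import torch
    from diffusers import BitsAndBytesConfig, FluxKontextPipeline, FluxTransformer2DModel

    # bitsandbytes 4-bit config; an HQQ backend would need its own config class.
    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    # Quantize only the transformer, which holds most of the weights.
    transformer = FluxTransformer2DModel.from_pretrained(
        model_id,
        subfolder="transformer",
        quantization_config=quant_config,
        torch_dtype=torch.bfloat16,
    )

    pipe = FluxKontextPipeline.from_pretrained(
        model_id,
        transformer=transformer,
        torch_dtype=torch.bfloat16,
    )
    # Keep idle components on the CPU to reduce peak VRAM use.
    pipe.enable_model_cpu_offload()
    return pipe
```

Calling load_kontext_4bit() downloads the model, so it needs access to the gated repository and enough disk space. The returned pipeline is then called with image= and prompt= as in the standard Diffusers snippet.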
You can use HQQ with Diffusers through Pruna OSS!
If you think this would be interesting to add to the Diffusers docs alongside the other Pruna page, we can do it!
I got an error when I ran it on a P100 (16 GB) on Kaggle. Please help me fix it.
The P100 is old, old, old: it dates from 2016, unfortunately. Its memory management is simply not up to HQQ's needs. Even if it could run, it would run sooooo slowly.
Pascal architecture limitations:
- No Tensor Cores: critical for AI inference acceleration
- Compute Capability 6.0: limited optimization support
- Older memory interface: HBM2 at lower speeds
- Limited quantization support: reduced compatibility with modern methods
NVIDIA doesn't support it on that GPU model, either:
https://docs.nvidia.com/nim/visual-genai/1.1.1/support-matrix.html
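The Pascal limitations listed above can be turned into a quick check based on CUDA compute capability. This is a minimal sketch: the helper name gpu_features is hypothetical, and the thresholds encode two generally documented facts (Tensor Cores arrived with Volta at compute capability 7.0, native bfloat16 with Ampere at 8.0). It says nothing about whether a particular HQQ kernel will actually run.

```python
def gpu_features(capability: tuple[int, int]) -> dict[str, bool]:
    """Map a CUDA compute capability (major, minor) to coarse hardware features."""
    return {
        "tensor_cores": capability >= (7, 0),  # Volta and newer
        "native_bf16": capability >= (8, 0),   # Ampere and newer
    }


P100 = (6, 0)      # Pascal, as mentioned in the thread
RTX_4090 = (8, 9)  # Ada Lovelace

for name, cap in [("P100", P100), ("RTX 4090", RTX_4090)]:
    print(name, cap, gpu_features(cap))
```

On a real machine, the capability tuple can be read with torch.cuda.get_device_capability() instead of being hard-coded.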