This repository provides quantized weights of the FLUX.1 [dev] model, converted to NF4 format using BitsAndBytes. This enables GPU inference with reduced VRAM requirements, making the model usable even on the Google Colab free tier or on GPUs with 8 GB of VRAM.
The FLUX.1 [dev] model consists of three main components:
- Text Encoders: CLIP and T5
- Flux Transformer
- VAE
In this repository, only the T5 encoder and the Flux Transformer are quantized. The CLIP encoder and VAE remain in their original precision but are included to ensure a fully functional inference pipeline.
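To get a rough feel for why quantizing only these two components matters, the sketch below estimates weight memory at 16-bit versus 4-bit precision. The parameter counts (about 12 billion for the Flux Transformer and about 4.7 billion for the T5 encoder) are approximate figures assumed for illustration, and `weight_gib` is a hypothetical helper. The estimate ignores NF4 block-scale overhead, activations, and the unquantized CLIP encoder and VAE, so it will not reproduce the measured VRAM figures quoted in this README.

```python
# Approximate parameter counts: assumptions for illustration, not measured here.
PARAMS = {
    "flux_transformer": 12.0e9,
    "t5_encoder": 4.7e9,
}


def weight_gib(num_params: float, bits_per_param: float) -> float:
    """Return the storage needed for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3


for name, num_params in PARAMS.items():
    fp16 = weight_gib(num_params, 16)  # 16 bits per weight
    nf4 = weight_gib(num_params, 4)    # 4 bits per weight, scales ignored
    print(f"{name}: 16-bit ~ {fp16:.1f} GiB, 4-bit ~ {nf4:.1f} GiB")
```

Going from 16 bits to 4 bits per weight cuts weight storage to roughly a quarter, which is where most of the VRAM saving comes from.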
Usage
pip install bitsandbytes==0.48.1 diffusers==0.35.1 peft==0.17.1 protobuf==5.29.5 sentencepiece==0.2.1 transformers==4.56.1
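Since the versions above are pinned, it can be useful to confirm that the active environment actually matches them before loading the model. The following is a minimal sketch using only the standard library; `find_mismatches` is a hypothetical helper, and the `PINS` dictionary simply mirrors the install command above.

```python
from importlib.metadata import PackageNotFoundError, version

# Pinned versions copied from the install command above.
PINS = {
    "bitsandbytes": "0.48.1",
    "diffusers": "0.35.1",
    "peft": "0.17.1",
    "protobuf": "5.29.5",
    "sentencepiece": "0.2.1",
    "transformers": "4.56.1",
}


def find_mismatches(pins, get_version=version):
    """Return {package: installed_version_or_None} for every pin that is not met."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = installed
    return mismatches


if __name__ == "__main__":
    for name, installed in find_mismatches(PINS).items():
        found = installed or "not installed"
        print(f"{name}: expected {PINS[name]}, found {found}")
```

If the script prints nothing, every pinned package is installed at the expected version.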
Full Pipeline Mode (≈ 14.8 GB VRAM)
import torch
from diffusers import FluxPipeline

ckpt_4bit_id = "aniketppanchal/flux.1-dev-nf4-pkg"
prompt = "A cat holding a sign that says hello world"
height = 1024
width = 1024

pipeline = FluxPipeline.from_pretrained(
    ckpt_4bit_id,
    torch_dtype=torch.float16,
    device_map="cuda",
)

image = pipeline(
    prompt=prompt,
    height=height,
    width=width,
    num_inference_steps=28,
    guidance_scale=3.5,
    max_sequence_length=512,
).images[0]
image.save("output.png")
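If you change `height` or `width` from 1024, it may help to snap them to a size the pipeline can use without adjustment. The sketch below rounds down to a multiple of 16, on the assumption that the VAE downsamples by a factor of 8 and that latents are packed in 2x2 patches. `snap_dimension` is a hypothetical helper and not part of diffusers.

```python
def snap_dimension(value: int, multiple: int = 16) -> int:
    """Round value down to the nearest multiple (16 is an assumed default)."""
    if value < multiple:
        raise ValueError(f"value must be at least {multiple}, got {value}")
    return (value // multiple) * multiple


# Example: pick a landscape size close to 1000 x 770.
width = snap_dimension(1000)
height = snap_dimension(770)
print(width, height)
```

The snapped values can then be passed as `width` and `height` in the pipeline call above.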
Split Pipeline Mode (≈ 7.7 GB VRAM)
import gc

import torch
from diffusers import FluxPipeline, FluxTransformer2DModel
from transformers import T5EncoderModel

ckpt_4bit_id = "aniketppanchal/flux.1-dev-nf4-pkg"
prompt = "A cat holding a sign that says hello world"
height = 1024
width = 1024

# ----------Encode Prompt Embeddings----------
text_encoder_2 = T5EncoderModel.from_pretrained(
    ckpt_4bit_id,
    subfolder="text_encoder_2",
    torch_dtype=torch.float16,
    device_map="cuda",
)

pipeline = FluxPipeline.from_pretrained(
    ckpt_4bit_id,
    text_encoder_2=text_encoder_2,
    transformer=None,
    vae=None,
    torch_dtype=torch.float16,
    device_map="cuda",
)

with torch.no_grad():
    prompt_embeds, pooled_prompt_embeds, _ = pipeline.encode_prompt(
        prompt=prompt,
        max_sequence_length=512,
    )

del text_encoder_2, pipeline
gc.collect()
torch.cuda.empty_cache()

# ----------Generate Diffusion Latents----------
transformer = FluxTransformer2DModel.from_pretrained(
    ckpt_4bit_id,
    subfolder="transformer",
    torch_dtype=torch.float16,
    device_map="cuda",
)

pipeline = FluxPipeline.from_pretrained(
    ckpt_4bit_id,
    text_encoder=None,
    text_encoder_2=None,
    tokenizer=None,
    tokenizer_2=None,
    transformer=transformer,
    vae=None,
    torch_dtype=torch.float16,
    device_map="cuda",
)

packed_latents = pipeline(
    height=height,
    width=width,
    num_inference_steps=28,
    guidance_scale=3.5,
    prompt_embeds=prompt_embeds,
    pooled_prompt_embeds=pooled_prompt_embeds,
    output_type="latent",
    max_sequence_length=512,
).images

del prompt_embeds, pooled_prompt_embeds, transformer, pipeline
gc.collect()
torch.cuda.empty_cache()

# ----------Decode Latents to Image----------
pipeline = FluxPipeline.from_pretrained(
    ckpt_4bit_id,
    text_encoder=None,
    text_encoder_2=None,
    tokenizer=None,
    tokenizer_2=None,
    transformer=None,
    torch_dtype=torch.float16,
    device_map="cuda",
)

unpacked_latents = (
    pipeline._unpack_latents(
        packed_latents,
        height=height,
        width=width,
        vae_scale_factor=pipeline.vae_scale_factor,
    )
    / pipeline.vae.config.scaling_factor
    + pipeline.vae.config.shift_factor
)

with torch.no_grad():
    image_tensor = pipeline.vae.decode(unpacked_latents, return_dict=False)[0]

image = pipeline.image_processor.postprocess(image_tensor)[0]
image.save("output.png")

del packed_latents, unpacked_latents, image_tensor, pipeline
gc.collect()
torch.cuda.empty_cache()
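The decode stage has to unpack the latents because the transformer works on a flat sequence of 2x2 latent patches rather than on a spatial grid. The sketch below reproduces only the shape arithmetic, with no tensors involved. It assumes a VAE scale factor of 8 and 16 latent channels, which are typical values for this model family but are not stated in this README; both helper functions are hypothetical.

```python
def unpacked_latent_shape(height, width, vae_scale_factor=8,
                          latent_channels=16, batch_size=1):
    """Shape of the spatial latents the VAE decoder expects (assumed defaults)."""
    latent_h = 2 * (height // (vae_scale_factor * 2))
    latent_w = 2 * (width // (vae_scale_factor * 2))
    return (batch_size, latent_channels, latent_h, latent_w)


def packed_latent_shape(height, width, vae_scale_factor=8,
                        latent_channels=16, batch_size=1):
    """Shape of the packed latents: one token per 2x2 latent patch."""
    _, channels, latent_h, latent_w = unpacked_latent_shape(
        height, width, vae_scale_factor, latent_channels, batch_size
    )
    num_tokens = (latent_h // 2) * (latent_w // 2)
    # Each token concatenates the 4 positions of its 2x2 patch.
    return (batch_size, num_tokens, channels * 4)


print(packed_latent_shape(1024, 1024))
print(unpacked_latent_shape(1024, 1024))
```

Under these assumptions, a 1024 x 1024 image corresponds to 4096 tokens of 64 channels in packed form and to a 16-channel 128 x 128 grid once unpacked.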
License
This repository is released under the FLUX.1 [dev] Non-Commercial License. The included LICENSE.md file is a snapshot of the license in the original repository as of 3rd November 2025. For the latest version, see the FLUX.1 [dev] License.
Base model: black-forest-labs/FLUX.1-dev
![FLUX.1 [dev] Grid](/aniketppanchal/flux.1-dev-nf4-pkg/resolve/main/dev_grid.png)