OpenFLUX.1 GGUF Model Card

Quantized versions of ostris/OpenFLUX.1 in GGUF format for use with stable-diffusion.cpp.

At the time of publishing, no ready-made GGUF weights of OpenFLUX.1 were available for the sd.cpp runtime, so here we are.

Sample generation: "A lovely cat" · seed 440103671 · Q8_0


Available Quantizations

File                        Type   Description
openflux1-v0.1.0-Q8_0.gguf  Q8_0   Great balance of quality and size ✅ recommended
openflux1-v0.1.0-Q4_0.gguf  Q4_0   Smaller size at some cost in quality

Quick Start

1. Download the model

# Recommended — Q8_0
wget https://huggingface.co/kostakoff/OpenFLUX.1-GGUF/resolve/main/openflux1-v0.1.0-Q8_0.gguf

# Other quantizations:
# wget https://huggingface.co/kostakoff/OpenFLUX.1-GGUF/resolve/main/openflux1-v0.1.0-Q4_0.gguf

2. Build stable-diffusion.cpp

Requirements: CUDA-capable GPU, CMake ≥ 3.18, CUDA Toolkit

git clone https://github.com/leejet/stable-diffusion.cpp
cd stable-diffusion.cpp
git submodule init
git submodule update

mkdir build && cd build
cmake .. -DSD_CUDA=ON
cmake --build . --config Release

Version used for conversion and testing:

stable-diffusion.cpp version master-520-d950627, commit d950627

3. Start the server

export CUDA_VISIBLE_DEVICES=0

./stable-diffusion.cpp/build/bin/sd-server \
  -m ./openflux1-v0.1.0-Q8_0.gguf \
  --vae-on-cpu \
  --listen-ip 0.0.0.0 \
  --listen-port 8081 \
  --seed -1

⚠️ The --vae-on-cpu flag is required! The VAE decoder consumes up to 10 GB of VRAM when decoding the latent representation into the output image. Offloading the VAE to the CPU makes it possible to run the model on most consumer GPUs.
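The model takes a while to load, so it can be handy to wait for the server to come up before sending requests. A minimal sketch using bash's /dev/tcp pseudo-device (host and port taken from the command above; the timeout of 60 seconds is an assumption):

```shell
# Hedged sketch: poll until sd-server accepts TCP connections on port 8081.
# Uses bash's /dev/tcp pseudo-device, so no extra tools are required.
HOST=127.0.0.1
PORT=8081
for i in $(seq 1 60); do
  if (exec 3<>"/dev/tcp/$HOST/$PORT") 2>/dev/null; then
    echo "sd-server is up on $HOST:$PORT"
    break
  fi
  sleep 1
done
```

This only checks that the port accepts connections, not that the model has finished loading; adjust the retry count to taste.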

4. Generate an image

curl -s http://127.0.0.1:8081/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "flux",
    "prompt": "A lovely cat<sd_cpp_extra_args>{\"seed\": 440103671}</sd_cpp_extra_args>",
    "n": 1,
    "size": "",
    "response_format": "b64_json"
  }' | jq -r '.data[0].b64_json' | base64 --decode > out.png

Extra parameters are passed via <sd_cpp_extra_args> as a JSON snippet embedded directly in the prompt field.
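Because the extra arguments ride inside the prompt string as embedded JSON, hand-quoting them in a curl one-liner is error-prone. A minimal sketch that assembles the body with jq (already used above) so the escaping is handled automatically; only the "seed" key is documented in this card, so no other keys are assumed:

```shell
# Hedged sketch: build the request body with jq so the extra-args JSON is
# escaped correctly inside the prompt string.
PROMPT='A lovely cat'
EXTRA='{"seed": 440103671}'
BODY=$(jq -cn --arg p "${PROMPT}<sd_cpp_extra_args>${EXTRA}</sd_cpp_extra_args>" \
  '{model: "flux", prompt: $p, n: 1, size: "", response_format: "b64_json"}')
echo "$BODY"
# Then send it exactly as in the curl example above:
# curl -s http://127.0.0.1:8081/v1/images/generations \
#   -H "Content-Type: application/json" -d "$BODY"
```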


How the weights were created

Converted from the original openflux1-v0.1.0-fp8.safetensors weights using the built-in sd-cli conversion tool:

# Q8_0
./stable-diffusion.cpp/build/bin/sd-cli -M convert \
  -m ~/llm/models/openflux/openflux1-v0.1.0-fp8.safetensors \
  -o ./openflux1-v0.1.0-Q8_0.gguf -v --type q8_0

# Q4_0
./stable-diffusion.cpp/build/bin/sd-cli -M convert \
  -m ~/llm/models/openflux/openflux1-v0.1.0-fp8.safetensors \
  -o ./openflux1-v0.1.0-Q4_0.gguf -v --type q4_0
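The two conversions differ only in the type flag and the output name, so they can be sketched as a loop (same paths and flags as above; requires bash for the ${q^^} uppercase expansion):

```shell
# Hedged sketch: run both conversions in one loop. Paths and flags are the
# same as in the commands above; ${q^^} uppercases q8_0 -> Q8_0 etc.
for q in q8_0 q4_0; do
  ./stable-diffusion.cpp/build/bin/sd-cli -M convert \
    -m ~/llm/models/openflux/openflux1-v0.1.0-fp8.safetensors \
    -o "./openflux1-v0.1.0-${q^^}.gguf" -v --type "$q"
done
```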

License

This model inherits the license of the original ostris/OpenFLUX.1: Apache 2.0.
