Use with the Diffusers library
pip install -U diffusers transformers accelerate
import torch
from diffusers import DiffusionPipeline

# switch device_map to "mps" on Apple devices
pipe = DiffusionPipeline.from_pretrained("drbaph/HiDream-O1-Image-FP8", dtype=torch.bfloat16, device_map="cuda")

prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image = pipe(prompt).images[0]
image.save("astronaut.png")

HiDream-O1-Image β€” FP8 Mixed (ComfyUI)

This is the FP8 mixed-precision quantization of HiDream-O1-Image for use with ComfyUI. By quantizing to 8-bit floats, the model fits comfortably within ~10 GB of VRAM β€” making it accessible on 12 GB GPUs (RTX 3080/4070/4080, etc.) with minimal quality trade-off.


Custom ComfyUI Node: Saganaki22/HiDream_O1-ComfyUI



VRAM Requirements

| Precision | Approximate VRAM |
| --- | --- |
| BF16 | 17–20 GB |
| FP16 | 17–20 GB |
| FP8 Mixed (this repo) | ~10 GB |

This is the recommended variant for GPUs with less than 16 GB VRAM. Tested on 12 GB cards at 2048 Γ— 2048 resolution.

What is FP8 Mixed? Weights are stored in float8_e4m3fn format. Sensitive layers (norms, embeddings, output heads) retain higher precision to preserve stability, hence "mixed." On CUDA-capable GPUs with Hopper or Ada Lovelace architecture (RTX 40xx, H100), FP8 compute is hardware-accelerated. On older GPUs, weights are dequantized on-the-fly β€” still saving VRAM, with a small speed penalty.


Quick Start β€” ComfyUI

1. Install the Custom Node

cd ComfyUI/custom_nodes
git clone https://github.com/Saganaki22/HiDream_O1-ComfyUI
pip install -r HiDream_O1-ComfyUI/requirements.txt

Or install via ComfyUI Manager by searching for HiDream O1.

2. Download the Weights

huggingface-cli download drbaph/HiDream-O1-Image-FP8 \
    --local-dir ComfyUI/models/diffusion_models/HiDream-O1-Image-fp8

3. Load in ComfyUI

Open ComfyUI and use the workflow provided in the custom node repository. Point the model loader to HiDream-O1-Image-fp8.


About HiDream-O1-Image

HiDream-O1-Image is a natively unified image generative foundation model built on a Pixel-level Unified Transformer (UiT) β€” no external VAEs, no disjoint text encoders. It encodes raw pixels, text, and task-specific conditions in a single shared token space, supporting:

  • Text-to-image generation up to 2,048 Γ— 2,048
  • Instruction-based image editing
  • Subject-driven personalization (multi-reference IP)
  • Long-text and multilingual text rendering

At only 9B parameters it matches or exceeds much larger open-source DiTs and leading closed-source models. It debuted at #8 in the Artificial Analysis Text to Image Arena (2026-05-05).


Key Features

  • 🧬 Pixel-Level Unified Transformer β€” end-to-end on raw pixels, no VAE, no disjoint text encoder
  • 🎨 One Model, Many Tasks β€” T2I, editing, personalization, storyboard generation
  • 🧠 Reasoning-Driven Prompt Agent β€” built-in "thinking" agent that resolves layout and rendering before generation
  • πŸ–ΌοΈ Native High Resolution β€” direct synthesis up to 2,048 Γ— 2,048
  • ⚑ 9B Parameters β€” performance parity with models many times larger
  • πŸ’Ύ FP8 Quantized β€” ~half the VRAM of full-precision variants, minimal quality loss
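The "~half the VRAM" figure follows directly from parameter count times bytes per weight (a back-of-the-envelope sketch; real usage adds activations, attention workspace, and framework overhead, which is why the tables quote higher numbers):

```python
# Back-of-the-envelope weight memory for a 9B-parameter model.
PARAMS = 9e9

def weight_gb(bytes_per_param: float) -> float:
    """Weight storage in GiB at the given precision."""
    return PARAMS * bytes_per_param / 1024**3

bf16 = weight_gb(2)  # 16-bit floats: 2 bytes per parameter
fp8 = weight_gb(1)   # 8-bit floats: 1 byte per parameter

print(f"BF16 weights: ~{bf16:.1f} GB")  # ~16.8 GB
print(f"FP8 weights:  ~{fp8:.1f} GB")   # ~8.4 GB
```

With runtime overhead on top, these estimates line up with the 17–20 GB and ~10 GB figures quoted above.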

Model Variants

| Repo | Precision | VRAM | Inference Steps |
| --- | --- | --- | --- |
| drbaph/HiDream-O1-Image-BF16 | BF16 | 17–20 GB | 50 |
| drbaph/HiDream-O1-Image-FP16 | FP16 | 17–20 GB | 50 |
| drbaph/HiDream-O1-Image-FP8 (this repo) | FP8 Mixed | ~10 GB | 50 |
| HiDream-ai/HiDream-O1-Image | Original | — | 50 |
| HiDream-ai/HiDream-O1-Image-Dev | Original Dev | — | 28 |

Benchmark Results (from original model)

GenEval (compositional generation) β€” HiDream-O1-Image scores 0.90 overall at 9B params, second only to the 200B+ Pro variant and ahead of GPT Image 2 (0.89).

DPG-Bench (dense prompt alignment) β€” Overall score 89.83, ranking second behind the Pro variant.

HPSv3 (human preference) β€” Overall score 10.37, outperforming GPT Image 2 (10.21) and Nano Banana 2.0 (10.01).


License

The original HiDream-O1-Image model and code are released under the MIT License. This FP8 quantization inherits the same license.

