HiDream-O1-Image-Dev — BF16 (ComfyUI)

This is the BF16 conversion of HiDream-ai/HiDream-O1-Image-Dev — the distilled variant of HiDream-O1-Image — for use with ComfyUI. The Dev model runs in just 28 steps (roughly half the 50 steps of the full model), making it significantly faster while retaining strong output quality.


Custom ComfyUI Node: Saganaki22/HiDream_O1-ComfyUI



Dev vs Full — Key Differences

| | Full Model | Dev Model (this repo) |
|---|---|---|
| Inference Steps | 50 | 28 |
| Guidance Scale (CFG) | 5.0 | 0.0 (disabled) |
| Shift | 3.0 | 1.0 |
| Scheduler | FlowUniPCMultistepScheduler | FlashFlowMatchEulerDiscreteScheduler |
| Speed | Slower, more detail | ~2× faster |

The Dev model uses a custom Euler scheduler with built-in noise scaling tuned for fewer steps. CFG is disabled — negative prompts have no effect in Dev mode.
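In diffusers-style flow-matching schedulers, the shift parameter typically remaps each noise level as σ′ = s·σ / (1 + (s − 1)·σ). Assuming that convention also applies here (HiDream's custom scheduler is not documented, so this is an illustrative assumption), a shift of 1.0 in Dev mode is the identity, i.e. the 28-step schedule is used as-is, while the full model's shift of 3.0 pushes mid-schedule sigmas toward the noisy end:

```python
def shift_sigma(sigma: float, shift: float) -> float:
    """Time-shift remapping commonly used by flow-matching Euler schedulers."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)

# shift = 1.0 (Dev): identity, schedule unchanged
assert shift_sigma(0.5, 1.0) == 0.5

# shift = 3.0 (Full): mid-schedule sigma 0.5 is remapped to 0.75
print(shift_sigma(0.5, 3.0))  # 0.75
```

The endpoints σ = 0 and σ = 1 are fixed points for any shift value; only the interior of the schedule is redistributed.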


VRAM Requirements

| Precision | Approximate VRAM |
|---|---|
| BF16 (this repo) | 17–20 GB |
| FP16 | 17–20 GB |
| FP8 Mixed | ~10 GB |
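These figures line up with simple arithmetic: 9B parameters at 2 bytes each (BF16/FP16) is about 16.8 GiB of weights alone, before activations and runtime overhead, and halving the bytes per parameter (FP8) roughly halves it. A quick back-of-the-envelope check:

```python
def weight_gib(n_params: float, bytes_per_param: float) -> float:
    """Raw weight memory in GiB, ignoring activations and overhead."""
    return n_params * bytes_per_param / 2**30

print(round(weight_gib(9e9, 2), 1))  # BF16/FP16: 16.8 GiB
print(round(weight_gib(9e9, 1), 1))  # FP8: 8.4 GiB
```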

Quick Start — ComfyUI

1. Install the Custom Node

```shell
cd ComfyUI/custom_nodes
git clone https://github.com/Saganaki22/HiDream_O1-ComfyUI.git
cd HiDream_O1-ComfyUI
python -m pip install -r requirements.txt
```

Or search for HiDream O1 in ComfyUI Manager.

Suggested transformers version: 4.57.1 – 5.3 (releases outside this range may break compatibility).
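If you want to sanity-check an installed version against that pin, a minimal sketch (these helpers are illustrative, not part of the node; for real code prefer `packaging.version`, which handles pre-releases correctly):

```python
def parse(v: str) -> tuple:
    """Naive dotted-version parse; assumes purely numeric components."""
    return tuple(int(p) for p in v.split("."))

def in_suggested_range(version: str) -> bool:
    # Pin from the install notes: 4.57.1 <= version <= 5.3
    return parse("4.57.1") <= parse(version) <= parse("5.3")

print(in_suggested_range("4.57.1"))  # True
print(in_suggested_range("5.4.0"))   # False
```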

2. Download the Weights

Download the entire model folder (all files, not just the safetensors) and place it in ComfyUI/models/diffusion_models/:

```shell
huggingface-cli download drbaph/HiDream-O1-Image-Dev-BF16 \
    --local-dir ComfyUI/models/diffusion_models/HiDream-O1-Image-Dev-bf16
```

The folder must contain the full Hugging Face support files alongside the weights: `config.json`, `chat_template.json`, `generation_config.json`, `preprocessor_config.json`, `tokenizer.json`, `tokenizer_config.json`, `vocab.json`, `merges.txt`, and `model.safetensors`.
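A quick way to verify the download is complete is to check the folder for those filenames. `missing_support_files` is an illustrative helper (not part of the node):

```python
import os
import tempfile

REQUIRED = [
    "config.json", "chat_template.json", "generation_config.json",
    "preprocessor_config.json", "tokenizer.json", "tokenizer_config.json",
    "vocab.json", "merges.txt", "model.safetensors",
]

def missing_support_files(folder: str) -> list:
    """Return the required files not present in the model folder."""
    present = set(os.listdir(folder))
    return [f for f in REQUIRED if f not in present]

# Demo on a throwaway directory containing only config.json:
with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "config.json"), "w").close()
    print(missing_support_files(d))  # lists the 8 files still missing
```

Point it at `ComfyUI/models/diffusion_models/HiDream-O1-Image-Dev-bf16` after downloading; an empty list means the folder is complete.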

3. Load in ComfyUI

Use the workflow provided in the custom node repository. The loader will detect dev in the folder name and automatically apply Dev settings (28 steps, no CFG, Euler scheduler). Point the model loader to HiDream-O1-Image-Dev-bf16.
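The name-based detection described above can be sketched as follows; the node's actual logic may differ, and `DEV_DEFAULTS`/`FULL_DEFAULTS` simply mirror the comparison table earlier in this card:

```python
def is_dev_checkpoint(folder_name: str) -> bool:
    # Hypothetical sketch of the loader's heuristic: "dev" anywhere in the name
    return "dev" in folder_name.lower()

DEV_DEFAULTS = {"steps": 28, "cfg": 0.0, "shift": 1.0}
FULL_DEFAULTS = {"steps": 50, "cfg": 5.0, "shift": 3.0}

def defaults_for(folder_name: str) -> dict:
    return DEV_DEFAULTS if is_dev_checkpoint(folder_name) else FULL_DEFAULTS

print(defaults_for("HiDream-O1-Image-Dev-bf16"))
# {'steps': 28, 'cfg': 0.0, 'shift': 1.0}
```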


About HiDream-O1-Image

HiDream-O1-Image is a natively unified image generative foundation model built on a Pixel-level Unified Transformer (UiT) — no external VAEs, no disjoint text encoders. It encodes raw pixels, text, and task-specific conditions in a single shared token space, supporting:

  • Text-to-image generation up to 2,048 × 2,048
  • Instruction-based image editing
  • Subject-driven personalization (multi-reference IP)
  • Long-text and multilingual text rendering

At only 9B parameters, it matches or exceeds much larger open-source DiTs and leading closed-source models. It debuted at #8 in the Artificial Analysis Text to Image Arena (2026-05-05).


Key Features

  • 🧬 Pixel-Level Unified Transformer — end-to-end on raw pixels, no VAE, no disjoint text encoder
  • 🎨 One Model, Many Tasks — T2I, editing, personalization, storyboard generation
  • 28-Step Distilled Dev — ~2× faster than the full model with minimal quality trade-off
  • 🖼️ Native High Resolution — direct synthesis up to 2,048 × 2,048

All Model Variants

Full Model

| Repo | Precision | VRAM | Steps |
|---|---|---|---|
| drbaph/HiDream-O1-Image-BF16 | BF16 | 17–20 GB | 50 |
| drbaph/HiDream-O1-Image-FP16 | FP16 | 17–20 GB | 50 |
| drbaph/HiDream-O1-Image-FP8 | FP8 Mixed | ~10 GB | 50 |

Dev Model (distilled, faster)

| Repo | Precision | VRAM | Steps |
|---|---|---|---|
| drbaph/HiDream-O1-Image-Dev-BF16 (this repo) | BF16 | 17–20 GB | 28 |
| drbaph/HiDream-O1-Image-Dev-FP16 | FP16 | 17–20 GB | 28 |
| drbaph/HiDream-O1-Image-Dev-FP8 | FP8 Mixed | ~10 GB | 28 |

License

The original HiDream-O1-Image model and code are released under the MIT License. This BF16 conversion inherits the same license.

