# HiDream-O1-Image-Dev – FP8 Mixed (ComfyUI)
This is the FP8 mixed-precision quantization of HiDream-ai/HiDream-O1-Image-Dev (the distilled variant of HiDream-O1-Image) for use with ComfyUI. It is the most accessible variant: only ~10 GB VRAM and just 28 steps, making it the fastest way to run HiDream O1 locally.
Custom ComfyUI Node: Saganaki22/HiDream_O1-ComfyUI
## Dev vs Full – Key Differences
| | Full Model | Dev Model (this repo) |
|---|---|---|
| Inference Steps | 50 | 28 |
| Guidance Scale (CFG) | 5.0 | 0.0 (disabled) |
| Shift | 3.0 | 1.0 |
| Scheduler | FlowUniPCMultistepScheduler | FlashFlowMatchEulerDiscreteScheduler |
| Speed | Slower, more detail | ~2× faster |
The Dev model uses a custom Euler scheduler with built-in noise scaling tuned for fewer steps. CFG is disabled, so negative prompts have no effect in Dev mode.
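To make the Shift row concrete, here is a minimal sketch of the SD3/Flux-style time shift commonly used by flow-matching schedulers, assuming the same formula applies here (the node's actual schedule may differ):

```python
import torch

def shifted_sigmas(num_steps: int, shift: float) -> torch.Tensor:
    # Linear flow-matching sigmas from 1.0 down to 1/num_steps, then the
    # common time shift: sigma' = shift * sigma / (1 + (shift - 1) * sigma).
    sigmas = torch.linspace(1.0, 1.0 / num_steps, num_steps)
    return shift * sigmas / (1 + (shift - 1) * sigmas)

print(shifted_sigmas(28, 1.0)[:3])  # shift=1.0 (Dev): schedule unchanged
print(shifted_sigmas(28, 3.0)[:3])  # shift=3.0 (Full): steps pushed toward high noise
```

With shift = 1.0 the transform is the identity, which is why the Dev schedule pairs naturally with the distilled 28-step model.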
## VRAM Requirements
| Precision | Approximate VRAM |
|---|---|
| BF16 | 17–20 GB |
| FP16 | 17–20 GB |
| FP8 Mixed (this repo) | ~10 GB |
This is the recommended variant for GPUs with less than 16 GB VRAM. Combined with the Dev model's 28-step schedule, it is the lowest-cost way to run HiDream O1: roughly 2× faster and half the VRAM of the full BF16 model.
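As a back-of-the-envelope check on the table, weight storage alone for the 9B parameters (see the model description below) accounts for most of the gap:

```python
# Rough VRAM estimate from parameter count alone (activations,
# tokenizer buffers, and framework overhead come on top).
params = 9e9  # 9B parameters, per the model description below

for name, bytes_per_param in [("BF16/FP16", 2), ("FP8", 1)]:
    gib = params * bytes_per_param / 2**30
    print(f"{name}: ~{gib:.1f} GiB of weights")
# BF16/FP16: ~16.8 GiB of weights -> the 17-20 GB rows above
# FP8:       ~8.4 GiB of weights  -> ~10 GB once overhead is added
```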
**What is FP8 Mixed?** Weights are stored in `float8_e4m3fn`. Sensitive layers (norms, embeddings, output heads) retain higher precision for stability. On RTX 40xx / H100 (Hopper/Ada), FP8 compute is hardware-accelerated; on older GPUs, weights dequantize on the fly, still saving VRAM at a small speed penalty. Do not set the `config.json` dtype to `float8_e4m3fn`; keep it as `bfloat16`. The node detects FP8 from the safetensors tensors directly.
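For intuition, a minimal sketch of the mixed-precision idea in plain PyTorch, with hypothetical name patterns for the sensitive layers (the actual layer selection in this repo may differ):

```python
import torch

SENSITIVE = ("norm", "embed", "head")  # hypothetical name patterns

def to_fp8_mixed(state_dict: dict) -> dict:
    # Cast large matmul weights to float8_e4m3fn; keep 1-D tensors and
    # sensitive layers (norms, embeddings, output heads) in bfloat16.
    out = {}
    for name, w in state_dict.items():
        keep_high_precision = w.ndim < 2 or any(s in name for s in SENSITIVE)
        out[name] = w.to(torch.bfloat16 if keep_high_precision else torch.float8_e4m3fn)
    return out

# On GPUs without FP8 compute, dequantize on the fly before each matmul:
w8 = torch.randn(4096, 4096).to(torch.float8_e4m3fn)  # stored at 1 byte/param
x = torch.randn(1, 4096, dtype=torch.bfloat16)
y = x @ w8.to(torch.bfloat16).T  # upcast per use: saves VRAM, costs a little speed
```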
## Quick Start – ComfyUI
### 1. Install the Custom Node
```bash
cd ComfyUI/custom_nodes
git clone https://github.com/Saganaki22/HiDream_O1-ComfyUI.git
cd HiDream_O1-ComfyUI
python -m pip install -r requirements.txt
```
Or search for HiDream O1 in ComfyUI Manager.
Suggested `transformers` version: 4.57.1–5.3 (newer versions may break compatibility).
### 2. Download the Weights
Download the entire model folder (all files, not just the safetensors) and place it in ComfyUI/models/diffusion_models/:
```bash
huggingface-cli download drbaph/HiDream-O1-Image-Dev-FP8 \
  --local-dir ComfyUI/models/diffusion_models/HiDream-O1-Image-Dev-fp8
```
The folder must contain the full Hugging Face support files alongside the weights:
`config.json`, `chat_template.json`, `generation_config.json`, `preprocessor_config.json`, `tokenizer.json`, `tokenizer_config.json`, `vocab.json`, `merges.txt`, `model.safetensors`
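If you want to verify the download before launching ComfyUI, here is a quick sanity check (the path assumes the `--local-dir` used above):

```python
from pathlib import Path

REQUIRED = [
    "config.json", "chat_template.json", "generation_config.json",
    "preprocessor_config.json", "tokenizer.json", "tokenizer_config.json",
    "vocab.json", "merges.txt", "model.safetensors",
]
model_dir = Path("ComfyUI/models/diffusion_models/HiDream-O1-Image-Dev-fp8")
missing = [name for name in REQUIRED if not (model_dir / name).exists()]
print("missing files:", missing or "none")
```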
### 3. Load in ComfyUI
Use the workflow provided in the custom node repository. The loader will detect `dev` in the folder name and automatically apply the Dev settings (28 steps, no CFG, Euler scheduler). Point the model loader to `HiDream-O1-Image-Dev-fp8`.
For the fastest inference on supported hardware, set precision to `fp8_e4m3fn_fast` in the model loader node.
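Because FP8 is detected from the tensors themselves rather than from `config.json`, a loader only has to peek at the safetensors header. Below is a standalone sketch of that check, based on the documented safetensors on-disk layout (an 8-byte little-endian header length followed by a JSON index); it is not the node's actual code:

```python
import json
import struct

def has_fp8_tensors(path: str) -> bool:
    # safetensors layout: u64 LE header size, then a JSON map of
    # tensor name -> {"dtype": ..., "shape": ..., "data_offsets": ...}.
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    return any(
        entry.get("dtype") == "F8_E4M3"
        for key, entry in header.items()
        if key != "__metadata__"
    )

print(has_fp8_tensors(
    "ComfyUI/models/diffusion_models/HiDream-O1-Image-Dev-fp8/model.safetensors"
))
```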
## About HiDream-O1-Image
HiDream-O1-Image is a natively unified image generative foundation model built on a Pixel-level Unified Transformer (UiT): no external VAEs, no disjoint text encoders. It encodes raw pixels, text, and task-specific conditions in a single shared token space, supporting:
- Text-to-image generation up to 2,048 × 2,048
- Instruction-based image editing
- Subject-driven personalization (multi-reference IP)
- Long-text and multilingual text rendering
At only 9B parameters, it matches or exceeds much larger open-source DiTs and leading closed-source models. It debuted at #8 in the Artificial Analysis Text to Image Arena (2026-05-05).
## Key Features
- 🧬 Pixel-Level Unified Transformer – end-to-end on raw pixels, no VAE, no disjoint text encoder
- 🎨 One Model, Many Tasks – T2I, editing, personalization, storyboard generation
- ⚡ 28-Step Distilled Dev – ~2× faster than the full model with minimal quality trade-off
- 💾 FP8 Quantized – ~half the VRAM of full-precision variants
- 🖼️ Native High Resolution – direct synthesis up to 2,048 × 2,048
## All Model Variants
### Full Model
| Repo | Precision | VRAM | Steps |
|---|---|---|---|
| drbaph/HiDream-O1-Image-BF16 | BF16 | 17–20 GB | 50 |
| drbaph/HiDream-O1-Image-FP16 | FP16 | 17–20 GB | 50 |
| drbaph/HiDream-O1-Image-FP8 | FP8 Mixed | ~10 GB | 50 |
### Dev Model (distilled, faster)
| Repo | Precision | VRAM | Steps |
|---|---|---|---|
| drbaph/HiDream-O1-Image-Dev-BF16 | BF16 | 17–20 GB | 28 |
| drbaph/HiDream-O1-Image-Dev-FP16 | FP16 | 17–20 GB | 28 |
| drbaph/HiDream-O1-Image-Dev-FP8 (this repo) | FP8 Mixed | ~10 GB | 28 |
## License
The original HiDream-O1-Image model and code are released under the MIT License. This FP8 quantization inherits the same license.
## Links
- 🔗 Original Dev model: HiDream-ai/HiDream-O1-Image-Dev
- 🔗 Original Full model: HiDream-ai/HiDream-O1-Image
- 🔧 ComfyUI node: Saganaki22/HiDream_O1-ComfyUI
- 📄 Technical report: HiDream-O1-Image.pdf
- 🤗 Online demo: HiDream-O1-Image-Dev Space