Comfy Use
Can this be used with Comfy to maximize its potential, or is support still under development? I understand some third-party nodes are available, but the quality isn't the best.
It is now available in the ComfyUI nightly version. There's no official example workflow yet, but I have a simple test workflow available in the PR: https://github.com/Comfy-Org/ComfyUI/pull/13817
The biggest quality issues are in the model itself, though. We have some workarounds, such as the seam smoothing, and with the native implementation you have access to all the different samplers etc., so I'm sure we can find better ways to use the model. Still, it's going to be limited when it comes to final quality, at least without further training.
Personally, the most interesting use so far has been the reference-based image generation.
It's faster, but I don't think it's better than HiDream-I1, except for the editing stuff, which has come a long way. But what use is that if it's blurry and stitched? Good that it's being worked on, but the competition works just fine.
@kijai Just curious: are you part of the Comfy team, or a very strong, talented supporter? Your name comes up a lot.
It's really good with text... and composition etc.
It leaves a little to be desired when it comes to overall quality perhaps (faces, hands, details etc.).
But these were just a first few attempts ;-) It might be possible to tweak it a bit: prompt, sampler, steps etc., or even a refiner second pass (with the same model or another model).
(Haven't tried the reference image part much yet, but that looks really good as well.)
(And the spelling in the title is all my fault... ComfyUI, not Comfy-UI... prompted a bit too fast... haha.)
Everything photo-like is blurry and low on detail. Reference images seem to work fine for composition and editing, but that doesn't help a weak result. And it doesn't work better than Flux.2 or Qwen Edit.
Also, fairly same-y images from the same prompt, even with the full model. Edit: that's down to the workflow not really having random seeds, only adding noise in the sampler.
Anatomy seems to be less messy than in the Flux.2 models, but that's a low bar that only Ernie could beat (with a shovel)... Tbh I keep coming back to Qwen Edit as a reliable workhorse; in everything but resolution it's nearly as good, and it's just less fiddly than the others. All the same prompts after the first one. HiDream O1 full:
It's not as shiny as usual; it's kinda fuzzy even when prompted for a high quality photo...
Same prompt for all of the following:
HiDream O1, full, mxfp8 (the workflow says it's higher quality). If you make it small it looks OK; get a tiny bit closer and everything is fuzzy (faces). Also kinda boring: the guys look almost cloned (that's the reason I changed the prompt to "different guys"):
Flux.2 Klein 9B. Amazing visuals, but it can't count arms (giving the guy on the right the benefit of the doubt that he's jumping):
Klein 4B is, for a change, the better one:
HiDream-I1 Full (the old HiDream). I wouldn't say it's better than O1, but it's less fuzzy:
And for some comedy, Ernie (how does this cf of a model have so many likes?). Not just the phantoms, the arms almost always look weird:
Some of my outputs I've liked:
Some observations (mostly using reference images):
- The base model is way better, but requires the seam fix workaround for the tiling issue
- You can use higher res than the default
- deis or res_multistep with beta has worked nicely for me, but there are too many options here to pick a best one
Also got some good results with res_multistep. Maybe a good candidate ;-)
Deis works really well; I even got a bit of skin blemishes and detail (that was in the prompt).
You can also try adding some of the dev distill as a LoRA (not too much, or it will burn it): https://huggingface.co/Kijai/hidream-O1-image_comfy/blob/main/loras/hidream_o1_dev_lora_rank_64_bf16_pruned_v1.safetensors
Yes, that helped a bit as well.
To be clear, the images as such are OK, without the fuzziness (though my three guys were still clones), if you stay far enough out (or small enough), like the sizes we see here, and apparently also beyond photorealism.
But it's meant to do 2048x2048. It's promising but not great. We'll see what people do with it, and thanks for working on it. Now that Qwen Image seems to be going closed (and small), alternatives are good. Except Ernie...
Also found that using Gemma 4 text generation in Comfy, and feeding it your prompt together with the instruct from HiDream, vastly improved the output.
I used the prompt instruction here: https://github.com/HiDream-ai/HiDream-O1-Image/blob/main/prompt_agent.py (but I translated it to English).
It produces a JSON prompt that the model seems to like a lot ;-)
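If you want to do the same thing outside a node graph, the idea is roughly this (a minimal sketch; the instruction string is a placeholder for the translated prompt_agent.py instruct, and `generate_text` stands in for whatever LLM call you have, e.g. the Gemma model):

```python
import json

# Placeholder: paste the (translated) instruction text from HiDream's prompt_agent.py here.
PROMPT_INSTRUCT = "Rewrite the user's idea as a detailed JSON image prompt ..."

def refine_prompt(user_prompt: str, generate_text) -> str:
    """Turn a short idea into the JSON-style prompt the model seems to prefer.
    `generate_text` is whatever text-generation callable you have available."""
    raw = generate_text(f"{PROMPT_INSTRUCT}\n\nUser prompt: {user_prompt}")
    # The LLM sometimes wraps the JSON in extra chatter, so grab the outermost braces.
    start, end = raw.find("{"), raw.rfind("}") + 1
    refined = json.loads(raw[start:end])  # validates that it's actually JSON
    return json.dumps(refined, ensure_ascii=False, indent=2)
```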
I opened the lady two posts up at full size. The skin is pure blur. For the black and white old guy further up, even at small size, the hair looks OK but the skin is completely blurry.
Are they just overselling the resolutions it can do? Maybe someone will make an anti-blur or skin LoRA. That seems to be by far the biggest problem, at least in photorealistic images, and it's a technical problem of the model. You can prompt as much as you like about no blur, sharpness, skin details or no DoF, or change samplers, and it still does it.
The black and white guy is a good example of prompting the hell out of it: tons of prompted skin, hair and face details to try to fix the problem, and you end up with these typical 100-year-olds, even if you prompt someone aged 40. It's the only way the poor model can cram all these prompted details into a face, and the skin is STILL blurry.
Enhanced prompts like this are a double-edged sword anyway, by the way. I'm not buying this new meta of short-story-sized prompts that started with Z-Image, because it's very hard to stop the model from changing too much, and even Z-Image does just fine with a simpler prompt; it just always produces the same thing with it. If you want a randomized, fancied-up version of a core idea, a long flowery prompt is great, but for something precise it's often more annoying.
And if a model has fundamental problems, like Flux.2 or Ernie with anatomy, or HiDream O1 with blur, even a long prompt doesn't change the flaw. Hence the b/w 100-year-old guy.
This model has its strengths and weaknesses... like any other model, I guess.
But it's open source, so the community can build on it if they want: fine-tuned models, LoRAs and whatnot ;-)
So the most important part is that it's open source.
@RuneXX I noticed you had a Shift adjustment in one of your workflows, and realized I had a mistake in the initial ModelNoiseScale node that caused two buggy behaviours with the shift adjustments:
- If the Shift node was after the ModelNoiseScale node, it reset the noise scale to the model default (8.0), making the node adjustment do nothing
- If the ModelNoiseScale node was after the Shift node, it reset the shift to the model default
The PR that fixes this has been merged now, so it should work both ways.
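For anyone curious, the failure pattern is basically this (a minimal illustration in plain Python, not the actual ComfyUI node code; the default values are just illustrative): if each patch node rebuilds the sampling parameters from the model defaults instead of starting from whatever the previous node already set, whichever node runs last silently undoes the other one.

```python
from dataclasses import dataclass, replace

@dataclass
class SamplingParams:
    shift: float = 3.0        # illustrative model default
    noise_scale: float = 8.0  # illustrative model default

# Buggy pattern: each node reconstructs the params from scratch, so whichever
# node runs last resets the other node's adjustment back to the default.
def buggy_set_noise_scale(_params, value):
    return SamplingParams(noise_scale=value)   # shift silently falls back to 3.0

def buggy_set_shift(_params, value):
    return SamplingParams(shift=value)          # noise_scale silently falls back to 8.0

# Fixed pattern: only touch your own field and keep everything else as-is.
def set_noise_scale(params, value):
    return replace(params, noise_scale=value)

def set_shift(params, value):
    return replace(params, shift=value)

p = SamplingParams()
print(buggy_set_shift(buggy_set_noise_scale(p, 4.0), 6.0))  # noise_scale is back to 8.0
print(set_shift(set_noise_scale(p, 4.0), 6.0))              # both adjustments survive
```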
Yes, I was just experimenting, trying the shift to see whether things improved or got worse ;-)
Will try again.
The model has some serious strengths (composition, text, and more... it looks really "artistic" sometimes).
It does lack a bit in the finer-grained details, skin etc., but that might come with community iterations and improvements.
I don't know if it's just me, but I really like some of the outputs; they remind me of the days I did black and white photography. When you do close-up photos, not everything is in focus.
That makes it look more real to me... But I do see why some say the skin is plastic etc. (though that has been said about AI images since SDXL, Flux, etc.).
To me it looks a bit more like something you'd find in a photography art gallery, while Z-Image looks more like a magazine photo... or something like that ;-)
(The images below are the stock ComfyUI workflow with the fp16 full model and a small dash of Kijai's LoRA at 0.3, with the res_multistep sampler... if I remember correctly.)
Did they already release an updated model btw?
https://huggingface.co/HiDream-ai/HiDream-O1-Image-Dev-2604
From the "sales pitch" it sounds like it depends on the prompt refiner, but I guess that's also true of the previous ones.
The strength of the model is really the reference image mode; as text-to-image it's just too lacking as it is.
The new model is aimed at improving pose following when using something like an OpenPose rig as one of the references. Otherwise my initial impression is that it just seems... worse in details, even blurrier etc... and it's dev-only. I could still be doing something wrong, I haven't done any extensive tests yet. It definitely does follow the pose more.
Hey kijai,
I’ve been trying to get a "Detail Daemon" effect (per-step sigma modulation) working with the HiDream-01 dev model.
Since my nodes for the model rely on a vendored pipeline.py with custom flow-matching schedulers (FlashFlowMatch / UniPC) rather than ComfyUI's native KSampler infrastructure, standard Detail Daemon hooks completely miss it. We've tried directly modifying the denoising loop and monkey-patching SIGMA_SCHEDULE_MAP to warp the schedule, but it consistently causes stability issues and tensor blowouts.
Is it possible to natively implement support for this kind of sigma modulation directly within your custom denoising loop? Alternatively, is there a recommended, safe way to hook into the pipeline to modulate sigmas per-step without breaking the flow-matching shift math?
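Conceptually what I'm after is something like this (a generic standalone sketch of the Detail Daemon idea, not my actual pipeline code; the model call and schedule are placeholders): perturb only the sigma the model is conditioned on, while stepping along the original, unmodified schedule.

```python
import math

def denoise_with_detail_boost(model, latents, sigmas, detail_amount=0.10):
    """Euler-style flow-matching loop. Only the sigma *fed to the model* is
    perturbed; the step sizes still come from the original schedule, so the
    scheduler's shift math is left untouched."""
    for i in range(len(sigmas) - 1):
        sigma, sigma_next = sigmas[i], sigmas[i + 1]
        # Bell-shaped adjustment: strongest mid-schedule, zero at both ends.
        t = i / max(len(sigmas) - 2, 1)
        adjust = 1.0 - detail_amount * math.sin(math.pi * t)
        velocity = model(latents, sigma * adjust)             # model "sees" less noise -> more detail
        latents = latents + (sigma_next - sigma) * velocity   # step on the real schedule
    return latents
```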
I feel this is something that the community could benefit from, and it would revitalize the model entirely if it can be executed properly!
Here are my nodes if you want to take a look 🤷♂️ Claude just isn't getting it done for me and I keep hitting limits lol. I removed the detail injector (essentially a custom-mapped Detail Daemon) because it was giving grey outputs, and I feel any pipeline changes just ruin the flow entirely. But I have the code if you want to look at that as well. I tried implementing it in the sampler node AND attempted a separate node entirely, with the same greyed results.
https://github.com/RealRebelAI/Rebels_HiDream-01_Image_Dev_NODES/tree/main
And for some comedy, Ernie (how does this cf of a model have so many likes?). Not just the phantoms, the arms almost always look weird:
Ernie is really good with prompted skin detail, but yes, the ghost limbs are really bad. I initially thought it was resolution-dependent, which is not the case; they just break from time to time. Let's also not talk about the training data bias... But Ernie is also mostly uncensored, or can at least display normal nudity (no hardcore stuff), whereas HiDream O1 has never seen a nipple... Might not be important for a lot of people, but for creating character images it is nice if the base model can do stuff like that...
Here's the new dev checkpoint as a LoRA to experiment with. It's slightly weaker, but honestly that's just better... Reducing the strength also helps it not destroy the background:
https://huggingface.co/Kijai/hidream-O1-image_comfy/blob/main/loras/hidream_o1_image_dev_2604_lora_avg_rankg_224_bf16.safetensors
Detail Daemon already works with the base model in ComfyUI with the native implementation though? Just tested it and it's fine. It doesn't really work with the dev model though as that model just smooths everything out so aggressively.
I understand, but I was attempting to address the dev model specifically for that purpose, as the model does wash everything out pretty badly. I was trying to figure out a different way to achieve the detail injection and reject some of the aggressive smoothing, without causing hallucinations or forcing the smoothing regardless. It seems it doesn't work all that well as it is.
I have a set of detailing prompts and ran them through with the full model. It gives some variation, but there is still a bit of smoothing happening in the last few steps of the generation, and it also needs a bit more variety in its results, though we can prompt for that as well for the moment. So far, each model I've tested has had its go-to face, and it always helped to prompt in some ethnicity and more detail. Unlike Chroma or Flux (as well as older models), which are limited to a certain prompt length, newer models can be given a lot of detail in the prompt.
Some more examples:



Detail Daemon already works with the base model in ComfyUI with the native implementation though? Just tested it and it's fine. It doesn't really work with the dev model though as that model just smooths everything out so aggressively.
I understand, but I was attempting to address the dev model specifically for that purpose, as the model does wash everything out pretty badly. I was trying to figure out a different way to achieve the detail injection and reject some of the aggressive smoothing, without causing hallucinations or forcing the smoothing regardless. It seems it doesn't work all that well as it is.
It smooths everything out on the last (low) sigmas. If you end the schedule early there's a bit more detail, but also the same patch grid artifacts as with the base model. It looks to me like the dev model has been trained (either on purpose or as a side effect) to smooth out the grid artifacts, which ends up also losing a ton of normal detail. Just a theory, I don't know anything for sure. I have tried various methods to get more quality out of it, and really the only worthwhile approach seems to be some sort of hybrid using the base model with the dev as a LoRA at lower strength.
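To make the "end the schedule early" part concrete, here's a rough sketch (plain Python over a sigma list, values illustrative; in Comfy the sigma-splitting nodes in the custom sampling set should do the equivalent):

```python
def drop_tail(sigmas, skip_last=3):
    """Stop the schedule a few steps before the lowest sigmas, where the dev
    model does most of its smoothing. You keep more fine detail, but the
    patch-grid artifacts come back with it and the result is a bit grainier."""
    assert 0 < skip_last < len(sigmas) - 1
    return sigmas[: len(sigmas) - skip_last]

# e.g. with a 20-step schedule: sigmas_early = drop_tail(full_sigmas, skip_last=3)
```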
Can this be used with Comfy to maximize its potential, or is support still under development? I understand some third-party nodes are available, but the quality isn't the best.
Works great with WAN2GP. They tend to have day 0 support for weird stuff more often than ComfyUI these days.
It's more memory-efficient than Comfy too, so you don't have to make quality tradeoffs.
Here's the new dev checkpoint as a LoRA to experiment with. It's slightly weaker, but honestly that's just better... Reducing the strength also helps it not destroy the background:
https://huggingface.co/Kijai/hidream-O1-image_comfy/blob/main/loras/hidream_o1_image_dev_2604_lora_avg_rankg_224_bf16.safetensors
Works nicely with that LoRA ;-)