Comfy Use

#5
by Ripcurlsurf - opened

Can this be used or is it under development with Comfy to maximize its potential? I understand some third party nodes are available but quality isn't the best.

Comfy Org org

It is now available in the ComfyUI nightly version. No official example workflow yet, but I have a simple test workflow available in the PR: https://github.com/Comfy-Org/ComfyUI/pull/13817

The biggest quality issues are in the model itself, though. We have some workarounds such as the seam smoothing, and with the native implementation you have access to all the different samplers etc., so I'm sure we can find better ways to use the model. Still, it's going to be limited when it comes to final quality, at least without further training.
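Conceptually, the seam smoothing mentioned here is a crossfade across the tile boundary. A minimal sketch of that idea, assuming a simple linear blend (`blend_seam` is a made-up helper, not the actual ComfyUI implementation):

```python
import numpy as np

def blend_seam(left: np.ndarray, right: np.ndarray, overlap: int) -> np.ndarray:
    """Crossfade two tiles across an overlapping band instead of butting
    them together, so the seam fades linearly from one tile to the other."""
    w = np.linspace(0.0, 1.0, overlap)               # 0 -> all left, 1 -> all right
    blended = left[:, -overlap:] * (1 - w) + right[:, :overlap] * w
    return np.concatenate([left[:, :-overlap], blended, right[:, overlap:]], axis=1)

# Two flat tiles: the blended band ramps smoothly from 0 to 1.
out = blend_seam(np.zeros((4, 8)), np.ones((4, 8)), overlap=4)
assert out.shape == (4, 12)
```

A smoother window (e.g. cosine) would work the same way; the point is just that the overlap region mixes both tiles instead of switching abruptly.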

Personally, the most interesting use so far has been reference-based image generation.

It's faster, but I don't think it's better than HiDream-I1, except for the editing stuff, which has come a long way. But what use is that if it's blurry and stitched? Good that it's being worked on, but the competition works just fine.

@kijai Just curious: are you part of the Comfy team or a very strong, talented supporter? Your name comes up a lot.

Comfy Org org

@kijai Just curious: are you part of the Comfy team or a very strong, talented supporter? Your name comes up a lot.

Started as just a custom node dev, but I've been a full-time member of the official backend team since January.

ComfyUI_temp_izzoj_00008_

It's really good with text... and composition, etc.
A little to be desired when it comes to overall quality, perhaps (faces, hands, details, etc.).
But these were just a few first attempts ;-) It might be possible to tweak it a bit: prompt, sampler, steps, etc., or even a second refiner pass (with the same model or another).

(Haven't tried the reference image part much yet, but that looks really good as well.)

(And the spelling in the title is all my fault: ComfyUI, not Comfy-UI. Prompted a bit too fast, haha.)

Everything photographic is blurry and low-detail. Reference images seem to work fine for composition and editing, but that doesn't help a weak result, and it doesn't work better than Flux.2 or Qwen Edit.
Also fairly same-y images from the same prompt, even with the full model. Edit: that's down to the workflow not really having random seeds, only adding noise in the sampler.

Anatomy seems to be less messy than in the Flux.2 models, but that's a low bar that only Ernie could beat (with a shovel)... TBH I come back to Qwen Edit as a reliable workhorse; in all but resolution it's nearly as good, and just less fiddly than the others. All the same prompts after the first one. HiDream O1 full: hidro1-_00009_ It's not shiny as usual; it's kind of fuzzy even when prompted for a high-quality photo...
same prompt for all following:
HiDream O1, full, mxfp8 (the workflow says it's higher quality). If you make it small it looks OK; get a tiny bit closer and everything is fuzzy (faces). Also kind of boring: the guys look almost cloned (that's the reason I changed the prompt to "different guys"):
hidro1-_00003_

Flux.2 Klein 9B. Amazing visuals, but you can't count the arms (give the right guy the benefit of the doubt that he's jumping):
flux2-_03062_
Klein 4B is, for a change, the better one:
flux2-_03061_

HiDream-I1 Full (the old HiDream). I wouldn't say it's better than O1, but it's less fuzzy: ComfyUI_16755_(1)
And for some comedy, Ernie (how does this cf of a model have so many likes?). Not just the phantoms; the arms almost always look weird:
ComfyUI_16753_

ComfyUI_temp_eqpcy_00024_

ComfyUI_temp_eqpcy_00030_

ComfyUI_temp_eqpcy_00042_

Everything photographic is blurry and low-detail.

Seems OK here. Is it the best model ever? Probably not, but it's really good with text and decent at composition.
The reference image feature is the great one, though.

Been trying different samplers; undecided on what works best.

Comfy Org org

Some of my outputs I've liked:

banodoco_panda

hidream_purple_witch

hidream_pixel

ComfyUI_temp_krebt_00020_

hidream_nier

Some observations, (mostly using reference images):

  • The base model is way better, but requires the seam-fix workaround for the tiling issue
  • You can use higher resolutions than the default
  • deis or res_multistep with beta has worked nicely for me, but there are too many options here to pick a best one

Also got some good results with res_multistep. Maybe a good candidate ;-)

ComfyUI_temp_eqpcy_00044_

Deis works really well, even got a bit of skin blemishes and details (that was in the prompt)

Comfy Org org

ComfyUI_temp_eqpcy_00044_

Deis works really well, even got a bit of skin blemishes and details (that was in the prompt)

You can also try adding some of the dev distill as a LoRA, not too much or it will burn it: https://huggingface.co/Kijai/hidream-O1-image_comfy/blob/main/loras/hidream_o1_dev_lora_rank_64_bf16_pruned_v1.safetensors
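Mechanically, "adding some of the dev distill as a LoRA, not too much" means merging the low-rank update into the base weights scaled by a strength factor. A rough sketch of the generic LoRA-merge math (not ComfyUI's actual loader code; names and shapes are illustrative):

```python
import numpy as np

def apply_lora(weight, lora_up, lora_down, strength=0.3, alpha=None):
    """Merge a low-rank update into a base weight matrix.

    weight:    (out, in)  base weight
    lora_up:   (out, r)   "B" matrix
    lora_down: (r, in)    "A" matrix
    alpha:     optional LoRA alpha; scale defaults to alpha / rank
    """
    rank = lora_down.shape[0]
    scale = (alpha / rank) if alpha is not None else 1.0
    return weight + strength * scale * (lora_up @ lora_down)

# strength=0.0 leaves the base weights untouched; higher strength pushes
# the merged weights toward the distilled model ("burning" it if too high).
w = np.eye(4)
assert np.allclose(apply_lora(w, np.ones((4, 2)), np.ones((2, 4)), strength=0.0), w)
```

Because the update is just added, the strength slider interpolates continuously between pure base behaviour and full distill behaviour.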

You can also try adding some of the dev distill as a LoRA, not too much or it will burn

yes that helped a bit as well

To be clear, the images as such are OK (but my three guys were still clones), without the fuzziness, when far enough out (or small enough), like the size we see here, and apparently beyond photorealism.
But it is meant to do 2048x2048. It's promising but not great. We'll see what people do with it. And thanks for working on it. Now that Qwen Image seems to be going closed (and small), alternatives are good. Except Ernie...

Also found that using the Gemma 4 text generation in Comfy, and feeding your prompt with the instruction from HiDream, vastly improved the output.
I used the prompt instruction here https://github.com/HiDream-ai/HiDream-O1-Image/blob/main/prompt_agent.py (but I translated it to English).
It makes a JSON prompt that the model seems to like a lot ;-)
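As a toy illustration of the structured-prompt idea (the field names below are made up; the real schema comes from the prompt_agent.py instructions linked above):

```python
import json

def structured_prompt(subject, style, details):
    """Build a JSON-shaped prompt of the kind a prompt-enhancer LLM might
    emit. Field names are illustrative, not HiDream's official schema."""
    return json.dumps(
        {"subject": subject, "style": style, "details": details},
        indent=2,
    )

s = structured_prompt(
    "elderly man, black and white portrait",
    "film photography",
    ["visible skin texture", "soft window light"],
)
assert json.loads(s)["style"] == "film photography"
```

The point of the JSON shape is that each attribute lands in a named slot, which seems to help the model weight them consistently compared to one long free-form sentence.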

image

I opened the lady two posts up in full size. The skin is pure blur. The black and white old guy further up: even at small size, the hair looks OK but the skin is completely blurry.
Are they just overselling the resolutions it can do? Maybe someone will make an anti-blur or skin LoRA. That seems to be by far the biggest problem, at least in photorealistic images, and it's a technical problem of the model. You can prompt as much as you like about no blur, sharpness, skin details, or no DoF, or change samplers, and it still does it.
The black and white guy is a good example of prompting the hell out of it (tons of prompted skin, hair, and face details to try to fix the problem), and you end up with these typical 100-year-olds, even if you prompt someone aged 40. It's the only way the poor model can cram all those prompted details into a face. But the skin is STILL blurry.

With enhanced prompts like this, by the way, I find it a double-edged sword anyway. Not buying this new meta of short-story-sized prompts that started with Z-Image, because it's very hard to stop it from changing too much. And even Z-Image does just fine with a simpler prompt; it just always does the same thing with it. If you want a randomized, fancied-up version of a core idea, a long, flowery prompt is great, but for something precise it's often more annoying.
And if a model has problems, like Flux.2 and Ernie with anatomy or HiDream O1 with blur, even a long prompt doesn't change the fundamental flaw. Hence the b/w 100-year-old guy.

This model has its strengths and weaknesses... like any other model, I guess.
But it's open source, so the community can build on it if they want: fine-tuned models, LoRAs and whatnot ;-)

So the most important part is that it's open source.

Ah, I see a default workflow has been added inside Comfy now, with a prompt enhancer and more.

Try that perhaps. It gives better results.

image

Comfy Org org

@RuneXX I noticed you had a Shift adjustment in one of your workflows, and realized I had a mistake in the initial ModelNoiseScale node that caused two buggy behaviours with the shift adjustments:

  • If the Shift node was after ModelNoiseScale, it reset the noise scale to the model default (8.0), making the node adjustment do nothing
  • If ModelNoiseScale was after the Shift node, it reset the shift to the model default

A PR has been merged now that fixes this, and it should work both ways.
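The order-dependence boils down to a node rebuilding the whole sampling config from model defaults instead of copying the incoming config. A toy sketch of the pattern (field names hypothetical, not the actual ComfyUI internals):

```python
DEFAULTS = {"shift": 3.0, "noise_scale": 8.0}

def set_shift(cfg, value):
    # Copies the incoming config and changes only its own field.
    return {**cfg, "shift": value}

def buggy_set_noise_scale(cfg, value):
    # Bug: starts from model defaults, silently discarding an earlier
    # Shift adjustment made upstream in the node graph.
    return {**DEFAULTS, "noise_scale": value}

def fixed_set_noise_scale(cfg, value):
    # Fix: same copy-and-update pattern, so node order no longer matters.
    return {**cfg, "noise_scale": value}

cfg = set_shift(dict(DEFAULTS), 6.0)
assert buggy_set_noise_scale(cfg, 2.0)["shift"] == 3.0   # shift reset to default
assert fixed_set_noise_scale(cfg, 2.0)["shift"] == 6.0   # shift preserved
```

With the copy-and-update pattern, each node commutes with the others, which is exactly the "works both ways" behaviour described above.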

Yes, I was just experimenting, trying the shift to see how things improved or got worse ;-)
Will try again.

The model has some serious strengths (composition, text, and more... it looks really "artistic" sometimes).
It does lack a bit in the finer-grained details, skin, etc., but that might come with community iterations and improvements.

I don't know if it's just me, but I really like some of the outputs; they remind me of the days I did black and white photography. When you do close-up photos, not everything is in focus.
That makes it look more real to me. But I do see why some say the skin is plastic, etc. (though that's been said about AI images since SDXL, Flux, etc.).

To me it looks a bit more like something you'd find in a photography art gallery, while Z-Image looks more like a magazine photo... or something like that ;-)

(Images below are from the stock ComfyUI workflow with the fp16 full model and a small dash of Kijai's LoRA (0.3), with the res_multistep sampler... if I remember correctly.)

hidream_o1_00014_

hidream_o1_00009_

Did they already release an updated model, by the way?
https://huggingface.co/HiDream-ai/HiDream-O1-Image-Dev-2604

From the "sales pitch", it sounds like it depends on the prompt refiner, but I guess that's also true for the previous ones.

Comfy Org org

The strength of the model is the reference image mode really, as text to image it's just too lacking as it is.

The new model aims to improve pose following when using something like an OpenPose rig as one of the references; otherwise my initial impression is that it just seems... worse in details, even blurrier, etc. And it's dev-only. I could still be doing something wrong, as I haven't done any extensive tests yet. It definitely does follow the pose more.

Screenshot 2026-05-14 004340

Screenshot 2026-05-14 003836

Screenshot 2026-05-14 011022

ComfyUI_temp_mcrju_00004_

image

(just a low res test run)

Yes, the reference image route is for sure good fun. And if you have a character, it's easy to put them into different scenes, different clothing, etc.

The strength of the model is the reference image mode really, as text to image it's just too lacking as it is.

Hey kijai,
I've been trying to get a "Detail Daemon" effect (per-step sigma modulation) working with the HiDream-O1 dev model.
Since my nodes for the model rely on a vendored pipeline.py with custom flow-matching schedulers (FlashFlowMatch / UniPC) rather than ComfyUI's native KSampler infrastructure, standard Detail Daemon hooks miss it completely. We've tried directly modifying the denoising loop and monkey-patching SIGMA_SCHEDULE_MAP to warp the schedule, but it consistently causes stability issues and tensor blowouts.
Is it possible to natively implement support for this kind of sigma modulation directly within your custom denoising loop? Alternatively, is there a recommended, safe way to hook into the pipeline to modulate sigmas per step without breaking the flow-matching shift math?

I feel this is something the community could benefit from, and it would revitalize the model entirely if it can be executed properly!

Here are my nodes if you want to take a look 🤷‍♂️ Claude just isn't getting it done for me and I keep hitting limits lol. I removed the detail injector (essentially a custom-mapped Detail Daemon) because it was giving grey outputs, and I feel any pipeline changes just ruin the flow entirely. But I have the code if you want to look at that as well. Tried implementing it into the sampler node AND attempted a separate node entirely, with the same greyed results.

https://github.com/RealRebelAI/Rebels_HiDream-01_Image_Dev_NODES/tree/main
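For reference, the core of a Detail Daemon-style effect is a pure per-step transform on the sigma the model sees. A minimal sketch, assuming a sine-shaped ramp (the original node's actual curve and parameters may differ):

```python
import math

def modulate_sigma(sigma, step, total_steps, amount=0.1):
    """Lower the sigma passed to the model mid-schedule so it denoises less
    aggressively there, leaving more high-frequency detail. The sine ramp is
    zero at both ends, so the start and end of the schedule are untouched."""
    t = step / max(total_steps - 1, 1)
    return sigma * (1.0 - amount * math.sin(math.pi * t))

# First step is unchanged; mid-schedule sigmas are reduced.
assert modulate_sigma(1.0, 0, 10) == 1.0
assert modulate_sigma(1.0, 5, 10) < 1.0
```

Keeping the transform a pure function of (sigma, step) like this, applied only to the model input rather than to the integration schedule itself, is what keeps it from disturbing the flow-matching shift math.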

And for some comedy, Ernie (how does this cf of a model have so many likes?). Not just the phantoms; the arms almost always look weird:
ComfyUI_16753_

Ernie is really good with prompted skin detail, but yes, the ghost limbs are really bad. I initially thought they were resolution-dependent, which is not the case; they just break from time to time. Let's also not talk about the training data bias... but Ernie is also mostly uncensored, or can at least display normal nudity (no hardcore stuff), whereas HiDream O1 has never seen a nipple... might not be important for a lot of people, but for creating character images, it is nice if the base model can do stuff like that...

00001 - a Menah is a caucasian human youthful adult

00003 - a Link (Legend of Zelda ) is a genderbend version
just some test images

Comfy Org org

Here's the new dev checkpoint as a LoRA to experiment with. It's slightly weaker, but honestly that's just better... reducing the strength helps it not destroy the background too:

https://huggingface.co/Kijai/hidream-O1-image_comfy/blob/main/loras/hidream_o1_image_dev_2604_lora_avg_rankg_224_bf16.safetensors

Comfy Org org

Hey kijai,
I've been trying to get a "Detail Daemon" effect (per-step sigma modulation) working with the HiDream-O1 dev model. [...] Is it possible to natively implement support for this kind of sigma modulation directly within your custom denoising loop?

Detail Daemon already works with the base model in ComfyUI with the native implementation though? Just tested it and it's fine. It doesn't really work with the dev model though as that model just smooths everything out so aggressively.

Detail Daemon already works with the base model in ComfyUI with the native implementation though? Just tested it and it's fine. It doesn't really work with the dev model though as that model just smooths everything out so aggressively.

I understand, but I was attempting to address the dev model specifically for that purpose, as the model does wash everything out pretty badly. I was trying to figure out a different way to achieve the detail injection and counter some of the aggressive smoothing, without causing hallucinations or letting the smoothing win regardless. It seems it doesn't work as well as-is.

I have a set of detailing prompts and ran it through with the full model. It gives some variations, but there is still a bit of smoothing happening in the last few steps of generation. It also needs a bit more variety in its results, but we can prompt for that for the moment. So far, each model I tested had its go-to face, and it always helped to prompt in some ethnicity and more detail. Unlike Chroma or Flux (as well as older models), which are limited to a certain prompt length, newer models can be told a lot of detail in the prompt.
Some more examples:
00029 - wearing SPORTSWEAR, athletic tank top and running
00030 - wearing SPORTSWEAR, halter-neck sports top and
00031 - wearing SPORTSWEAR, volleyball jersey and spandex
00032 - wearing SPORTSWEAR, compression crop top and biker

Comfy Org org

Detail Daemon already works with the base model in ComfyUI with the native implementation though? Just tested it and it's fine. It doesn't really work with the dev model though as that model just smooths everything out so aggressively.

I understand but i was attempting to address the dev model specifically for that purpose as the model does wash everything out pretty bad. I was trying to figure out a different way to achieve the detail injection and reject some of the aggressive smoothing without causing hallucinations or forcing the smoothing regardless. It seems it doesnt work as well as is

It smooths everything out at the last (low) sigmas; if you end the schedule early there's a bit more detail, but also the same patch-grid artifacts as with the base model. It looks to me like the dev model has been trained (either on purpose or as a side effect) to smooth out the grid artifacts, which ends up also losing a ton of normal detail. Just a theory, I don't know anything for sure. I have tried various methods to get more quality out of it, and really the only worthwhile approach seems to be some sort of hybrid: using the base model with the dev as a LoRA at lower strength.
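"Ending the schedule early" can be done by simply truncating the sigma list before the lowest values. A sketch of that, assuming a hypothetical helper (not an existing ComfyUI node):

```python
def truncate_sigmas(sigmas, keep_fraction=0.8):
    """Drop the tail of a sigma schedule (the lowest sigmas, where the dev
    model smooths detail away) and append a final 0.0 so the sampler still
    terminates cleanly."""
    keep = max(1, int(len(sigmas) * keep_fraction))
    return sigmas[:keep] + [0.0]

sigmas = [10.0, 7.0, 5.0, 3.0, 2.0, 1.0, 0.5, 0.2, 0.1, 0.0]
assert truncate_sigmas(sigmas, 0.8) == [10.0, 7.0, 5.0, 3.0, 2.0, 1.0, 0.5, 0.2, 0.0]
```

The tradeoff described above falls out of this directly: the skipped low-sigma steps are where both the smoothing and the final artifact cleanup happen.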

Can this be used or is it under development with Comfy to maximize its potential? I understand some third party nodes are available but quality isn't the best.

Works great with WAN2GP. They tend to have day-0 support for weird stuff more often than ComfyUI these days.
It's more memory efficient than Comfy too, so you don't have to make quality tradeoffs.

hidream_o1_00058_

Here's the new dev checkpoint as a LoRA to experiment with, it's slightly weaker but honestly that's just better... reducing strength helps it not destroy the background too:
https://huggingface.co/Kijai/hidream-O1-image_comfy/blob/main/loras/hidream_o1_image_dev_2604_lora_avg_rankg_224_bf16.safetensors

Works nicely with that LoRA ;-)
