base model or not? because only 12.3 gb?
???
???
not base... theres turbo (distilled), regular zimage, and base. kind of like flux 2, flux 2 klein distilled, and flux 2 klein base models.
at least thats what ive taken from the repo pages.
The regular Z-Image is what people usually refer to as base model.
The other base model would be the Omni version.
You can refer to their GitHub for the table.
https://github.com/Tongyi-MAI/Z-Image?tab=readme-ov-file#-model-zoo
It is the base model. The Turbo model is not a quantized (smaller) version of base; it just had extra training to get the number of steps down to 8 (from 24-50), plus Reinforcement Learning to push preference-aligned visual quality and good-looking outputs.
The Base model looks quite a bit worse than ZIT right now, but it will be better to train on (LoRAs and full finetunes).
If it looks worse, then how did they do this?
Only tried a few runs, but it looks better to me in some ways (more natural), and there's for sure tons more variation on each seed.
Probably the biggest benefit is having variation on each run. It might take a few attempts at tweaking the prompt to get what you want, and since it's more "natural", not every result looks like a "studio photographer" did the shot ;-)
ZIT is much faster though, with "instant gratification": each run gives good results (but lacks the variation), and they look a bit "studio shot" with "model-like poses".
Both have their strengths
