Model Use #4
opened by dearest
Good job on the model! It's comfortable to use even as a preview.
It has surprisingly strong knowledge of all manner of things, and the prompt comprehension is strong.
I've got some questions, hopefully not too much of a bother.
Image generation has become so popular now that I feel certain capabilities are important to discuss:
- Prompt weighting. This appears to work, but I'd like to check whether it's correct. For example: 1girl, blue eyes, red hair, (running:0.6)
- Style mixing. Individual styles can be prompted, but can multiple styles be mixed? The result tends to lose all referenced styles once 2+ different styles are prompted.
- Token count. Are there any pitfalls or limits to the length of a prompt?
- Character prompting. Right now the model can recall all manner of popular characters, but only if you also pad out the prompt with character details. Where other models tend to recall a character from the name alone, this model sometimes requires a bit more detail. Also, when prompting multiple characters, how feasible is it to associate features with a specific character?
- Placement. When asking for objects/composition, how can the locations be prompted in a reliable way?
- Inpainting. Are there workflows available for inpainting?
- Training. Are there plans for LoRA or other model training in the future?
- Few-step inference. As it stands, 30 steps takes around 6-10 seconds on a decently modern GPU. That's not a problem in and of itself, but faster is better, assuming quality isn't harmed. Are there plans to explore different inference options (quantization/precision, distillation, or anything else)?
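On the prompt-weighting question above: a quick way to check what the syntax is doing is to parse it yourself. This is a minimal sketch assuming the common A1111-style (text:weight) convention; whether this model actually honors that syntax is exactly what's being asked, so treat the parser as illustrative only.

```python
import re

# Matches "(text:weight)" spans, e.g. "(running:0.6)".
WEIGHT_RE = re.compile(r"\(([^():]+):([0-9.]+)\)")

def parse_weights(prompt, default=1.0):
    """Split a prompt into (text, weight) pairs; unweighted text gets the default."""
    parts, pos = [], 0
    for m in WEIGHT_RE.finditer(prompt):
        before = prompt[pos:m.start()].strip(" ,")
        if before:
            parts.append((before, default))
        parts.append((m.group(1), float(m.group(2))))
        pos = m.end()
    tail = prompt[pos:].strip(" ,")
    if tail:
        parts.append((tail, default))
    return parts

print(parse_weights("1girl, blue eyes, red hair, (running:0.6)"))
# [('1girl, blue eyes, red hair', 1.0), ('running', 0.6)]
```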
Thanks for this great addition to the space!
20 steps seems to work great, and going down to 12 steps gives semi-realistic outputs that are unique in their own way. Sage attention works, and the wanvae2.1-upscale2x VAE works too. Something like 1024x1536 generates with small errors.