Model Use #4
opened by dearest
Good job on the model! It's comfortable to use even as a preview.
It has surprisingly strong knowledge of all manner of things, and the prompt comprehension is strong.
I've got some questions, hopefully not too much of a bother.
Image generation has become so popular now that I feel certain capabilities are important to discuss:
- Prompt weighting. This appears to work, but I'd like to check whether it's correct. For example: 1girl, blue eyes, red hair, (running:0.6)
- Style mixing. Individual styles can be prompted, but can multiple styles be mixed? The result tends to lose all referenced styles once 2+ different styles are prompted.
- Token count. Are there any pitfalls or limits to the length of a prompt?
- Character prompting. Right now the model can recall all manner of popular characters, but only if you also pad out the prompt with character details. Where other models tend to recall a character from the name alone, this model sometimes requires a bit more detail. Also, when prompting multiple characters, how feasible is it to associate features with a specific character?
- Placement. When asking for objects/composition, how can the locations be prompted in a reliable way?
- Inpainting. Are there workflows available for inpainting?
- Training. Are there plans for LoRA or other model training in the future?
- Few-step inference. As it stands, 30 steps takes around 6-10 seconds on a decently modern GPU. That's not a problem in and of itself, but faster is better, assuming quality isn't harmed. Are there plans to explore different inference options (quantization/precision, distillation, or anything else)?
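On the prompt-weighting question above: a quick way to check what the syntax is doing is to parse it yourself. This is a minimal sketch assuming the common A1111-style (text:weight) convention; whether this model actually honors that syntax is exactly what's being asked, so treat the parser as illustrative only.

```python
import re

# Matches "(text:weight)" spans, e.g. "(running:0.6)".
WEIGHT_RE = re.compile(r"\(([^():]+):([0-9.]+)\)")

def parse_weights(prompt, default=1.0):
    """Split a prompt into (text, weight) pairs; unweighted text gets the default."""
    parts, pos = [], 0
    for m in WEIGHT_RE.finditer(prompt):
        before = prompt[pos:m.start()].strip(" ,")
        if before:
            parts.append((before, default))
        parts.append((m.group(1), float(m.group(2))))
        pos = m.end()
    tail = prompt[pos:].strip(" ,")
    if tail:
        parts.append((tail, default))
    return parts

print(parse_weights("1girl, blue eyes, red hair, (running:0.6)"))
# [('1girl, blue eyes, red hair', 1.0), ('running', 0.6)]
```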
Thanks for this great addition to the space!
20 steps seems to work great, and going down to 12 steps gives semi-realistic outputs that are unique in their own way. Sage attention works, and the wanvae2.1-upscale2x VAE works too. Something like 1024x1536 generates with small errors.