Training Settings for LoRAs
Would you mind sharing your training settings, or even the JSON that you use for training your Z-Image LoRAs? They work so well!
The OneTrainer settings are embedded in every LoRA I made, but I don't think the training settings matter much; they are pretty simple because of my 8 GB VRAM card. The quality of the dataset and using masks correctly matters much more. Some tips:
- Aim for 25-35 images and around 2500 steps
- Use AI upscalers and refiners like Topaz etc. Scale clean images up to 25 MP and finally back down to 1 MP.
- Use an actual 1 MP target as the scale-to value, not "longest side 1024 px", because the latter results in only 0.7 MP for a 2:3 image. To avoid doing the maths for every image, I use IrfanView to downscale because it accepts MP as a unit (use 1.02, because for IrfanView 1 MP = 1000x1000, not 1024x1024). (MP = megapixel)
- Have at least 10-15 face closeups from every direction, without much makeup, with neutral expression and lighting; cover as much hair as possible and remove excessive jewellery.
- Avoid having more than 2-3 similarly styled images, or they will dominate the output (e.g. the same necklace in 5 images will put a necklace in nearly every generated image).
- Use masks to cover everything that is not relevant, like text overlays, jewellery and other people.
- Mask out eyes, or whole faces in the distance, if the eyes are not clearly visible (we don't need them since we have closeups), but keep as much hair as possible.
- Do not mask out beautiful backgrounds like beaches or nice rooms/locations, as they can give the LoRA some style.
- Have about 3 full-body images with backgrounds the model can use as a reference for body proportions (height): other people (mask them out), cars, buildings etc.
- It's really, really important to use only the highest-quality images. No video screen captures or scans (unless your AI upscaler can fix them). Feed the model only the quality you expect to get back at generation time!
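If you'd rather script the 1 MP downscale than use IrfanView, the megapixel math from the list above fits in a few lines. This is just a sketch (the function name is mine): it targets 1 MP = 1,000,000 pixels while keeping the aspect ratio, and the resulting size can be fed to any resizer.

```python
import math

def size_for_megapixels(width: int, height: int, target_mp: float = 1.0) -> tuple[int, int]:
    """Return (w, h) with the same aspect ratio whose area is ~target_mp megapixels.

    1 MP here means 1,000,000 pixels (as IrfanView counts it), not 1024x1024.
    """
    scale = math.sqrt(target_mp * 1_000_000 / (width * height))
    return max(1, round(width * scale)), max(1, round(height * scale))

# A 2:3 portrait at 2000x3000 becomes ~816x1225 (~1.0 MP), whereas
# "longest side 1024" would give 683x1024, i.e. only ~0.7 MP.
```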
Masks do not mean the model can't see behind them. The model will still see and learn what is behind a mask, but at a much lower weight (10%), so it becomes mostly irrelevant.
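I haven't checked OneTrainer's actual masked-loss code, but "behind the mask at ~10% weight" is typically implemented as a weighted per-pixel loss; a minimal pure-Python sketch over flat pixel lists (function and parameter names are mine):

```python
def masked_mse(pred, target, mask, unmasked_weight=0.1):
    """MSE where pixels outside the mask still contribute, at a reduced weight.

    mask values: 1.0 on the subject (full weight), 0.0 elsewhere
    (those pixels count at unmasked_weight, i.e. ~10% here).
    """
    total = 0.0
    for p, t, m in zip(pred, target, mask):
        w = m + unmasked_weight * (1.0 - m)
        total += w * (p - t) ** 2
    return total / len(pred)
```

With `unmasked_weight=0.1`, masked-out regions are not invisible, just ten times less influential, which matches the behavior described above.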
Use "Dataset-Tools" in OneTrainer and the model "Rembg-Human" to generate the masks, then refine them in the mask editor. (Do not forget to press "Enter" to save after editing.)
The caption is simply always "vrtlFirstnameLastname, person" for character LoRAs.
I hope that helps a bit. Ask if anything is unclear.
As for settings: for native "Base" Z-Image training I currently use Timestep Shift 6, Noise Bias 0.4, Float (W8A8), and a relatively high LR of 0.0018 (depends on your optimizer) with a cosine schedule and an EMA of 0.99.
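To make the cosine schedule and the EMA concrete, here is a generic sketch (not OneTrainer's actual code; the exact formulas it uses may differ) of how an LR of 0.0018 decays to zero over the run and how an EMA of 0.99 blends weights:

```python
import math

def cosine_lr(step: int, total_steps: int, base_lr: float = 0.0018, min_lr: float = 0.0) -> float:
    """Cosine decay from base_lr at step 0 down to min_lr at total_steps."""
    t = min(step, total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * t))

def ema_update(ema_value: float, new_value: float, decay: float = 0.99) -> float:
    """Exponential moving average: keep 99% of the old value, blend in 1% new."""
    return decay * ema_value + (1 - decay) * new_value
```

Over a ~2500-step run the LR sits at 0.0018 at the start, 0.0009 at the halfway point, and approaches 0 at the end; the EMA smooths the trained weights so single noisy steps barely move the saved model.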
An easy but effective way to clean images is to upscale them sky-high (e.g. 8000 px) using bicubic, sharpen a bit, and downscale to the final resolution using bilinear. It's cheap, but better than nothing.
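That cheap cleanup recipe can be scripted with Pillow, assuming it is installed; `cheap_clean` and its parameters are my own names, and the unsharp-mask settings are just placeholder values to tune by eye:

```python
from PIL import Image, ImageFilter  # pip install Pillow

def cheap_clean(img: Image.Image, interim_px: int = 8000, final_mp: float = 1.0) -> Image.Image:
    """Bicubic upscale to interim_px on the longest side, light sharpen,
    then bilinear downscale so the result is ~final_mp megapixels."""
    w, h = img.size
    up = interim_px / max(w, h)
    img = img.resize((round(w * up), round(h * up)), Image.BICUBIC)
    # Placeholder sharpening strength; adjust radius/percent to taste.
    img = img.filter(ImageFilter.UnsharpMask(radius=2, percent=80, threshold=2))
    down = (final_mp * 1_000_000 / (img.width * img.height)) ** 0.5
    return img.resize((round(img.width * down), round(img.height * down)), Image.BILINEAR)
```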
Typical Image and Mask:

