Thanks for writing the article. I'm unclear about the following:
What should be entered in each dataset image's description? A short non-word token that identifies the subject? Is that token unique to each image in the dataset, or shared across all of them? I thought each image was supposed to have its own descriptive paragraph (caption).
If I set Quantization > Transformer & Text Encoder to None, what are the side effects?
My objective is to train a LoRA of a person using ~20 images of the same person. What kind of images should I prepare: full-body shots, just portraits, or a mix of both?