Corrupted data in dataset?
Other people and I have seen the preview2 model generating incomplete images, or images with strange rectangular empty parts. I'm not sure of the reason; maybe images with corrupted data poisoned the result. It would be better to check the dataset for incomplete images before the next round of training, I think.
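Even something as simple as checking for the PNG end-of-stream marker would already catch cut-off downloads. A rough sketch, assuming the dataset files are PNGs (a JPEG check would look for the EOI marker instead):

```python
# Rough sketch: flag PNG byte streams that look truncated.
# A structurally complete PNG starts with the 8-byte magic header and
# ends with the 12-byte IEND chunk (zero length, type "IEND", fixed CRC).
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
PNG_IEND = b"\x00\x00\x00\x00IEND\xaeB`\x82"

def looks_truncated(data: bytes) -> bool:
    """Return True if `data` is not a structurally complete PNG stream."""
    return not (data.startswith(PNG_MAGIC) and data.endswith(PNG_IEND))
```

This only catches files that were cut off mid-download, not images with valid structure but garbage pixels; a full decode pass would be needed for those.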
can he realistically do anything with that little information? shouldn't you give a reproducible prompt at least?
Do you have an example image to show with the metadata? I've generated thousands of images with preview2 and have never seen this.
I'm thinking the OP could be using the "anime screencap" tag with a very short prompt that lacks details (although I tried this myself and haven't been able to replicate what they described)
NovelAI 4.5 has an issue with the "anime screencap" tag, and you get around it by either making the prompt longer - or adjusting the negative prompt to include stuff like "negative space, blank page" (which is included in the default NAI negative prompt).
same thing happened to me but still cool
post the images with the prompt
I agree, I have only seen this with anime screenshots, if that is what was meant by "strange rectangular empty parts". I see it as a sort of augmentation effect of cinematic screenshot data with letterboxed borders at the top and bottom, because I'm forcing screenshot material into a vertical image ratio. But deleting all black letterboxed borders from the dataset would be bad too, because letterboxes are sometimes desired and are a visual feature a model should know (and Anima is especially good at getting the cinematic thickness of those bars right if you generate widescreen images).
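If anyone wants to measure how common these bars actually are in the dataset rather than delete them, something like this would do. A rough sketch over a grayscale pixel grid; the darkness threshold is my guess, not anything from the actual training pipeline:

```python
# Rough sketch: estimate how much of an image's height is covered by
# near-black letterbox bars. `rows` is a 2-D list of grayscale pixel
# values (0-255), one inner list per row of the image.
def letterbox_fraction(rows, dark_thresh=12):
    """Fraction of image height taken by contiguous near-black bars
    at the top and bottom (rows whose mean brightness <= dark_thresh)."""
    def bar_len(seq):
        n = 0
        for row in seq:
            if sum(row) / len(row) > dark_thresh:
                break
            n += 1
        return n
    top = bar_len(rows)
    bottom = bar_len(reversed(rows))
    if top + bottom >= len(rows):  # whole image is dark: a night scene, not letterboxing
        return 0.0
    return (top + bottom) / len(rows)
```

Images with a high fraction could then get an extra tag (e.g. letterboxed) instead of being dropped, so the feature stays promptable.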
Yeah I've gotten quite a few images that have this letterboxed border now. Anime screenshot does it the most, but you can sometimes also get it when prompting for dark/low brightness images. SDXL models did this too but nowhere near as much as Anima in my experience, it's a little strange
Oh yeah, I believe I know what you mean. For me it was in particular the combination of dark scenes with lots of natlang that triggered weird shifts. Especially spatial prompts like "left side of the image" led to the image becoming quite a small rectangle with the content shifted to the left, on a black background covering the rest (80% of the total image). Or the natlang would lead to excessively broken perspective and completely ignore any image-composition booru tags (like close-up, upper body, full body etc.), even though the natlang didn't contradict any of them.
Another persistent issue: if you use natlang to describe a scene with multiple characters doing the same thing and then want to add a contrasting other figure, that figure will always be scaled much smaller than what the model considers the main characters of the scene. Even if you have the tags and everything.
obviously, the training data will incorporate these kinds of images: https://danbooru.donmai.us/posts?tags=border
I suggest you put border, letterboxed, pillarboxed in your negative prompt and see how it goes.
I find that natlang seems to fight with tags in principle. You need to either add weight by also writing natlang for the traits described via tags, or increase the weighting of the entire booru-tag section of your prompt to counterbalance.