Missing artists that should work but don't

#73
by synta - opened

Hey there,

first of all: great model, I love it. I completely switched over to Anima in the last couple of weeks and I really enjoy it. However, I've found an artist that should work but doesn't. He has almost 1k images under his name on Danbooru and obviously works perfectly fine on SDXL: the creator of Ghost in the Shell, https://danbooru.donmai.us/posts?tags=shirow_masamune+&z=5

And I don't mean it works badly or anything. What I mean is that his data doesn't seem to be present in the model at all. I spammed all his most used "franchises", his character cyril brooklyn, and @artistname, and still get random junk that bears no resemblance to his work. And this is an oddity. Why is that? Did he hit some backdated data cutoff threshold that removed his library entirely? Because I think about 80% of his stuff is older than 10 years, and a lot of it is 13-17 year old uploads.

So this is the question: what mechanism (accidentally?) removed this artist entirely from the training dataset? And if you guys have found more artists that got punished by this mechanism, please share. Perhaps this also affects some characters?

Thanks

Just tested, this artist works under @shirou masamune. It seems like Danbooru renamed his tag, and the older one is in the dataset Anima is trained on.

They might have, but shirou masamune is the Gelbooru tag, and it seems like the model might prefer Gelbooru tags over Danbooru ones, for artists at least.
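For anyone hitting the same thing with other artists, here is a minimal sketch of how one might line up tag spellings to try. It is not from the model authors. The helper name `artist_prompt_candidates` is made up for illustration, the alias list is something you supply yourself (e.g. by looking the artist up on Gelbooru), and the "@name with spaces instead of underscores" format is only assumed from the prompt that worked in this thread.

```python
def artist_prompt_candidates(tag, aliases=()):
    """Return '@name' prompt strings to try for one artist, primary tag first.

    Hypothetical helper: the '@' prefix and the underscore-to-space
    conversion are assumptions based on '@shirou masamune' working above.
    """
    candidates = []
    for name in (tag, *aliases):
        prompt = "@" + name.strip().replace("_", " ")
        if prompt not in candidates:  # skip spellings that collapse to the same prompt
            candidates.append(prompt)
    return candidates


# Danbooru tag first, then the Gelbooru spelling mentioned in this thread.
candidates = artist_prompt_candidates("shirow_masamune", aliases=["shirou_masamune"])
print(candidates)
```

Running each candidate with the same seed and settings should make it fairly obvious which spelling, if any, the model actually picked up.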

Holy smokes! Thank you so much for pointing that out. As expected from Anima the style reproduction is spot on. Amazing.

A lot of this person's art is variant sets (sometimes not labelled as such), often poorly tagged and in general very visually similar, so I wonder if it got deduplicated.
Also, I think 1k images is kinda low even if it didn't. The current preview model is of course undertrained, whereas something like Noobai Vpred has had training from the group that did Noob, on top of training from Onoma/Illustrious, on top of (IIRC) training from whoever did KohakuXL. Lots of training.
It struggles with cross-eyed, another tag with about 1k images, and will make literal crosses in the eyes (not too bad, lol). Also, I didn't test much, but it seemed to do black jacket Haruhi worse than most SDXL models and on par with Neta, which I'd expect.

Edit: Oh, nevermind then, I guess.

Also 1k images I think is kinda low even if it didn't

In Anima, at around 100 images for an artist it starts to do something (more than SDXL, because the Qwen VAE is superior); at around 200 you can say "yeah, that's him", in my experience. 1000 images is overkill, even for an undertrained model.

I doubt it's the VAE, as the Qwen VAE is REPA-less. I imagine it should be around Flux 1's VAE in learnability, which is worse than SDXL's... Probably something else. Having trained a few LoRAs now, I've found the model learns all sorts of rarer concepts in what I feed it really well, not just styles, though I'm still figuring some things out.
