---
license: openrail
base_model:
- stabilityai/stable-diffusion-xl-base-1.0
library_name: diffusers
---

An experiment: CLIP L trained on prompts of up to 770 tokens using a ~10k-sample anime dataset, without changing the architecture. Concatenation is used to accumulate features.
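
A rough sketch of what concatenating features can look like, not the exact code used for this model. It assumes the 770-token limit corresponds to 10 windows of CLIP's native 77-token context, that each window is wrapped in BOS/EOS and padded with EOS, and that `encode_chunk` is a hypothetical callable standing in for the text encoder (one feature vector per token position). The names `split_into_chunks` and `encode_long_prompt` are also made up for illustration.

```python
from typing import Callable, List, Sequence

CHUNK = 77                 # CLIP's native context length
MAX_TOKENS = 770           # assumed to mean 10 windows of 77 positions
BOS, EOS = 49406, 49407    # CLIP tokenizer start/end ids (assumption for this sketch)

Features = List[List[float]]  # [sequence position][hidden dim]


def split_into_chunks(token_ids: Sequence[int]) -> List[List[int]]:
    """Split raw token ids (without BOS/EOS) into padded 77-token windows."""
    body = CHUNK - 2                           # room left after BOS and EOS
    max_body = (MAX_TOKENS // CHUNK) * body    # truncate anything beyond 10 windows
    ids = list(token_ids)[:max_body]
    chunks = []
    # max(len(ids), 1) makes sure an empty prompt still yields one window
    for start in range(0, max(len(ids), 1), body):
        piece = ids[start:start + body]
        # BOS, then the tokens, then EOS repeated to fill the window
        chunks.append([BOS] + piece + [EOS] * (CHUNK - 1 - len(piece)))
    return chunks


def encode_long_prompt(
    token_ids: Sequence[int],
    encode_chunk: Callable[[List[int]], Features],
) -> Features:
    """Encode each window separately and concatenate along the sequence axis."""
    features: Features = []
    for chunk in split_into_chunks(token_ids):
        features.extend(encode_chunk(chunk))
    return features
```

With a real encoder, `encode_chunk` would return the hidden states for one 77-token window, so a full-length prompt ends up as a 770-position sequence with the hidden size unchanged, which is why no architecture change is needed.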

![output(4)](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/vOCBqQ4fhIhD157o4QSuO.png)

![output(5)](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/OBNY_waRMkPR-I_29KN_2.png)

Token-adjusted, with images removed if they can't meet the token criteria:

![output(6)](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/fmQfZBgniqdZOTXPynqkO.png)

![output(7)](https://cdn-uploads.huggingface.co/production/uploads/633b43d29fe04b13f46c8988/FnnsQsBGs-LFJFPCebcda.png)