---
license: mit
base_model: runwayml/stable-diffusion-v1-5
tags:
  - stable-diffusion
  - diffusion
  - distillation
  - flow-matching
  - geometric-deep-learning
  - research
library_name: diffusers
pipeline_tag: text-to-image
---
# The finetune process for dreambooth

I'll be gathering SD15 latents en masse, using massive batches of SD15 images for common classes.

This time I'll be sampling directly from LAION flavors. Roughly 500,000 512x512 3-channel images need to exist as latents.
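As a rough sizing sketch for that latent set: SD15's VAE downsamples by 8 and produces 4 latent channels, so each 512x512 image becomes a 4x64x64 latent. The helper names and the fp16/fp32 storage choice below are assumptions for illustration, not the actual pipeline.

```python
# Back-of-the-envelope storage estimate for the latent dataset.
# Assumptions (not from the training code): latents are stored as raw,
# uncompressed fp16 or fp32 values; the helper names are hypothetical.

def latent_shape(height, width, latent_channels=4, vae_downscale=8):
    """Shape of one SD15 VAE latent for an image of the given size."""
    return (latent_channels, height // vae_downscale, width // vae_downscale)


def dataset_bytes(num_images, shape, bytes_per_value):
    """Total raw bytes needed to store num_images latents of this shape."""
    channels, h, w = shape
    return num_images * channels * h * w * bytes_per_value


shape = latent_shape(512, 512)
fp16_bytes = dataset_bytes(500_000, shape, 2)
fp32_bytes = dataset_bytes(500_000, shape, 4)

print(shape)                                 # (4, 64, 64)
print(f"{fp16_bytes / 1e9:.1f} GB in fp16")  # 16.4 GB in fp16
print(f"{fp32_bytes / 1e9:.1f} GB in fp32")  # 32.8 GB in fp32
```

Prompts, seeds, and any container overhead would come on top of these numbers.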

The synthetic caption system is already biased, so it CAN be used, but its biases don't necessarily line up with SD15's, so they must be handled to a lesser extent.
Essentially, only gap fillers will be targeted and synthesized: commonly used tokens that the common LAION flavors aren't covering.
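One way to find those gaps is to count how many captions mention each target token and flag the ones below a threshold. This is a minimal sketch, not the actual balancing code: `find_gap_tokens`, the whitespace splitting, and the `min_count` threshold are all assumptions, and real CLIP tokenization is BPE-based rather than word-based.

```python
from collections import Counter


def find_gap_tokens(captions, target_tokens, min_count):
    """Return {token: deficit} for target tokens seen in fewer than
    min_count captions. Hypothetical helper; splits on whitespace only."""
    target_tokens = set(target_tokens)
    counts = Counter()
    for caption in captions:
        # Count each token once per caption so long captions don't dominate.
        words = set(caption.lower().replace(",", " ").split())
        counts.update(words & target_tokens)
    return {
        token: min_count - counts[token]
        for token in sorted(target_tokens)
        if counts[token] < min_count
    }


captions = [
    "a photo of a cat, studio lighting",
    "a cat on a sofa",
    "a dog in the park",
]
gaps = find_gap_tokens(captions, {"cat", "dog", "horse"}, min_count=2)
print(gaps)  # {'dog': 1, 'horse': 2}
```

The deficits are the number of extra captions that would need to be synthesized per token to reach the threshold.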

Once the balanced dataset is created, then and only then, can we cook.

The dataset will be available to all and the code will be in the repo for replication.

Most likely the dataset will include its prompts. CLIP_L can run at a really high batch size, so I can feed it a ton of prompts with the correct matching seeds and be fine.
This will additionally allow me to shuffle tokens for better SD15 generalization, but I don't know if I'll enable that. I kind of need it to be SD15 before I start jiggering with its insides.
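Keeping each prompt paired with its seed makes both ideas straightforward to sketch. The example below is an assumption-heavy illustration: the record layout, the `batched` and `shuffle_prompt` helpers, and shuffling at the level of comma-separated chunks (rather than individual CLIP tokens) are all hypothetical choices, and the text encoder call itself is left out.

```python
import random


def shuffle_prompt(prompt, seed):
    """Reorder the comma-separated chunks of a prompt, deterministically
    for a given seed. Hypothetical helper; chunk-level, not token-level."""
    parts = [part.strip() for part in prompt.split(",") if part.strip()]
    random.Random(seed).shuffle(parts)
    return ", ".join(parts)


def batched(records, batch_size):
    """Yield consecutive slices of records, each at most batch_size long."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


# Each record keeps the prompt next to the seed used for its latent.
records = [
    ("a red fox, snowy forest, golden hour", 101),
    ("portrait of an old sailor, oil painting", 102),
    ("city skyline, night, rain, neon signs", 103),
]

enable_shuffle = True  # off would keep prompts exactly as written
for batch in batched(records, batch_size=2):
    prompts = [
        shuffle_prompt(prompt, seed) if enable_shuffle else prompt
        for prompt, seed in batch
    ]
    seeds = [seed for _, seed in batch]
    # A real run would pass `prompts` to the text encoder here.
    print(seeds, prompts)
```

Seeding the shuffle from the record's own seed keeps the shuffled prompt reproducible alongside its latent.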

I'll try to get the latent system synthesizing and preparing latents by the end of the day. Hopefully they will be done within the next couple of days, perhaps sooner. In any case, the plan stands.

Teacher's latents go into the student for noise learning. Student's output image is compared to the teacher's. Rinse and repeat. I'll create a new trainer colab without david since he's not going to be required for this one.
Additionally, the subsystems will be based directly on known working systems with solid and concise objectives that make sense to the observer.
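The teacher-to-student loop can be shown in miniature. This is a deliberately framework-free toy, not the trainer: the "teacher" is a fixed affine function, the "student" is two scalars, and a random number stands in for a teacher latent. Every name and hyperparameter here is an assumption; a real run would use the SD15 UNets and an actual optimizer.

```python
import random


def teacher(x):
    """Frozen stand-in for the teacher's output (hypothetical toy function)."""
    return 0.5 * x - 0.25


def train_student(steps=2000, lr=0.1, seed=0):
    """Fit a student (w, b) to match the teacher's output by plain SGD
    on squared error. Toy sketch of output-matching distillation."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0  # student starts untrained
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)  # stand-in for one teacher latent value
        target = teacher(x)         # teacher's output for this input
        pred = w * x + b            # student's output for the same input
        err = pred - target         # compare student to teacher
        # Gradient step on err**2 with respect to w and b.
        w -= lr * 2.0 * err * x
        b -= lr * 2.0 * err
    return w, b


w, b = train_student()
print(f"student: w={w:.3f}, b={b:.3f}")  # approaches the teacher's 0.5 and -0.25
```

The same structure (sample input, query frozen teacher, step the student toward the teacher's output, repeat) is what the full trainer would scale up.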

Let's do this so it works this time.


# I've decided to name this model

* SD15 Lune - the twin sister of SD15 Sol

# Epochs in the late 20s are showing detail

![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/I2_Z5FSdRKKuf4HDQrgDs.png)

# Trainer 2 with correct shift

This is the second training run, currently running alongside the first. It uses the same trainer already established.

# Epoch 17 samples

![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/91YmKqEhot6kS1vhV5VnX.png)

![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/8zVCU0h7Mz1svrNsw7Ppr.png)

![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/9SMD5gWPfxcCFqba5OYWG.png)

This one is starting over with the new trainer. Let's see if it fries early or converges properly.