---
license: mit
base_model: runwayml/stable-diffusion-v1-5
tags:
  - stable-diffusion
  - diffusion
  - distillation
  - flow-matching
  - geometric-deep-learning
  - research
library_name: diffusers
pipeline_tag: text-to-image
---

# Why do I hear boss music?

## 10000 steps

I'm currently retraining the scale. This checkpoint was trained on many raw, unscaled latents, which makes the default output hazy.
![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/6GFXrQy6vm8h2mdkK5mvD.png)
Use this to bring the output to the correct VAE scale.
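
For readers unfamiliar with the term, here is a minimal sketch of what "VAE scale" means for SD1.5 latents. It assumes the standard SD1.5 VAE scaling factor of 0.18215. The helper names are hypothetical and not taken from the trainer, and which direction the correction needs to go for this particular checkpoint is an assumption this sketch does not settle.

```python
# Assumption: the standard SD1.5 VAE scaling factor.
SD15_VAE_SCALE = 0.18215


def to_model_space(raw_latents):
    """Scale raw VAE-encoder latents into the range SD1.5 normally trains on."""
    return [x * SD15_VAE_SCALE for x in raw_latents]


def to_vae_space(model_latents):
    """Undo the scaling so the latents can go back through the VAE decoder."""
    return [x / SD15_VAE_SCALE for x in model_latents]


# Hypothetical latent values, only to show the round trip.
raw = [1.0, -2.5, 4.0]
scaled = to_model_space(raw)
restored = to_vae_space(scaled)
```

Decoding latents at the wrong scale tends to give washed-out or over-contrasted images, which is consistent with the hazy output described above.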

## Shift 2 is the training target
![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/3aUl0td4RiDL9yjMw87KT.png)
Higher or lower shift values may yield different results.
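
As a rough illustration of what a shift of 2 does to the noise schedule, the sketch below uses the timestep-shift formula common in SD3-style flow matching samplers. Whether the trainer in this repo uses exactly this form is an assumption, and `shift_sigma` is a hypothetical name.

```python
def shift_sigma(sigma, shift=2.0):
    """Warp a noise level in [0, 1]; shift > 1 pushes it toward the noisy end."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


# Compare an unshifted schedule with shift=2 on a few evenly spaced levels.
levels = [i / 4 for i in range(5)]  # 0.0, 0.25, 0.5, 0.75, 1.0
shifted = [shift_sigma(s, shift=2.0) for s in levels]

for s, w in zip(levels, shifted):
    print(f"sigma={s:.2f} -> shifted={w:.4f}")
```

The endpoints stay fixed at 0 and 1 while the middle of the schedule moves toward higher noise, so sampling with a shift different from the training target spends its steps at noise levels the model saw less often.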

## Use this
![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/zXNIFANpK7Yqmm4oPUuUR.png)



a castle at sunset
![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/fOeEzWg-VgA7s8ubmKcnv.png)

a mountain view with a beautiful landscape
![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/Tsk2QSKd6cH0eJ-H_iJ_C.png)

a woman sitting on the bus
![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/UIQ29npfiE1KfFLOJbCZv.png)

a carrot on a cake
![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/hWTxprkxdeu8E_E0iqV8J.png)

a refrigerator to the left of a table
![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/_puDnUG_xuazq6soFqfVj.png)

a mad scientist's laboratory with strange gagets and mechanisms
![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/qgZvxpGSwODJ9dxUxi4iA.png)

steampunk goku
![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/IITrYMTxNm3BApR-txYmW.png)


a man standing on top of a table in the middle of a room full of curtains.
![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/P-vleYAQAhHxvXYLLHBjk.png)

## 5000 steps


![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/QEAkOA49IHvHeLTFvhe-O.png)


![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/LfGEMW5AWdDIf3bFFZsOD.png)


![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/tdwAqMrA6b3zy51G6Wu1k.png)

![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/eaoQ3iY_QIEfhwA5SK0zV.png)

a mad scientists laboratory
![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/xqDeCGbxWMhAfD4QV9w2B.png)

## 4000 steps
Trained using this synthesized image set:
https://huggingface.co/datasets/AbstractPhil/sd15-latent-distillation-500k

As of writing, the 500k set hasn't finished synthesizing. It's at around 200k samples, which should be more than enough to get a baseline.


At 4000 steps the new flow matching trainer is already producing results.
![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/_h52WVv4rgvzk2H08Jpmy.png)

![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/n--fn2cNfsYmi7e3SqmXc.png)

![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/XXT9NEEtYtIUrF52hJFWO.png)

![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/lhXF0_fOUyandv_hUC3xN.png)


Within 4000 steps at batch size 16, the pretrained flow matching SD1.5 model is already converging.
That model was sd15-flow-matching-try2, aka the Lune variation, and I can say for certain she is most definitely not burned.

The trainer is in the files.
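
For orientation before reading it, here is a minimal sketch of a flow matching training pair under the common rectified-flow convention (linear interpolation between data and noise, velocity as the target). This is an assumption about the general objective, not a copy of the trainer, which remains the authoritative reference. All names are hypothetical.

```python
import random


def flow_matching_pair(x0, noise, t):
    """Return the interpolated sample x_t and the velocity target (noise - x0)."""
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, noise)]
    target = [b - a for a, b in zip(x0, noise)]
    return x_t, target


def mse(pred, target):
    """Mean squared error between a predicted and a target velocity."""
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(target)


# Toy usage with a stand-in "model" that always predicts zero velocity.
rng = random.Random(0)
x0 = [rng.gauss(0.0, 1.0) for _ in range(4)]      # stand-in for a clean latent
noise = [rng.gauss(0.0, 1.0) for _ in range(4)]   # Gaussian noise sample
t = rng.random()                                   # timestep in [0, 1)
x_t, target = flow_matching_pair(x0, noise, t)
loss = mse([0.0] * 4, target)
```

In a real trainer the zero prediction would be replaced by the UNet's output for `x_t` at timestep `t`, conditioned on the prompt.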