sd15-flow-lune / README.md




	---
	license: mit
	base_model: runwayml/stable-diffusion-v1-5
	tags:
	- stable-diffusion
	- diffusion
	- distillation
	- flow-matching
	- geometric-deep-learning
	- research
	library_name: diffusers
	pipeline_tag: text-to-image
	---

	# Why do I hear boss music?

	## 10000 steps

	The latent scale is currently being retrained. This checkpoint was trained with many raw, unscaled latents, which makes the default output hazy.
	![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/6GFXrQy6vm8h2mdkK5mvD.png)
	Use the setting shown above to bring the output to the correct VAE scale.
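
For context, a minimal sketch of the usual SD1.5 latent scaling convention. The constant `0.18215` is the scaling factor from the stock SD1.5 VAE config; the function names are hypothetical and are not taken from this repo's trainer. Whether this checkpoint needs exactly this correction is an assumption; the screenshot above remains the reference.

```python
# Sketch of the standard SD1.5 latent scaling convention (not this repo's code).
# 0.18215 is the scaling factor in the stock SD1.5 VAE config.
SD15_VAE_SCALING_FACTOR = 0.18215


def scale_for_unet(raw_latent):
    """Raw VAE-encoder latent -> the scaled space the UNet normally trains in."""
    return raw_latent * SD15_VAE_SCALING_FACTOR


def unscale_for_vae(model_latent):
    """Scaled model latent -> the raw space the VAE decoder expects."""
    return model_latent / SD15_VAE_SCALING_FACTOR


# Works on plain floats here; the same arithmetic applies elementwise to tensors.
raw = 3.0
scaled = scale_for_unet(raw)
restored = unscale_for_vae(scaled)
```

A model trained partly on raw latents and partly on scaled ones sees two value ranges for the same images, which is one plausible reading of why the default decode looks hazy.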

	## Shift 2 is the training target
	![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/3aUl0td4RiDL9yjMw87KT.png)
	Higher or lower shift values may yield different results.
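
As an illustration of what a shift of 2 does to a sampling schedule, here is a small sketch using the timestep-shift formula common in flow-matching samplers, `sigma' = shift * sigma / (1 + (shift - 1) * sigma)`. That this repo's trainer uses exactly this formula is an assumption, and `shift_sigma` and `shifted_schedule` are hypothetical names.

```python
# Sketch of the common flow-matching timestep shift (assumed, not confirmed
# to be the exact formula used by this repo's trainer).
def shift_sigma(sigma, shift=2.0):
    """Warp a noise level in [0, 1]; shift > 1 spends more steps at high noise."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


def shifted_schedule(num_steps, shift=2.0):
    """Noise levels from 1.0 (pure noise) down to 0.0 (clean), after shifting."""
    return [shift_sigma(1.0 - i / num_steps, shift) for i in range(num_steps + 1)]


linear = shifted_schedule(4, shift=1.0)   # shift 1 leaves the schedule unchanged
shifted = shifted_schedule(4, shift=2.0)  # shift 2, the stated training target
```

The endpoints stay at 1 and 0 for any shift; only the spacing in between changes, which is why sampling at a different shift than the training target can change the results.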

	## Use this
	![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/zXNIFANpK7Yqmm4oPUuUR.png)



	a castle at sunset
	![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/fOeEzWg-VgA7s8ubmKcnv.png)

	a mountain view with a beautiful landscape
	![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/Tsk2QSKd6cH0eJ-H_iJ_C.png)

	a woman sitting on the bus
	![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/UIQ29npfiE1KfFLOJbCZv.png)

	a carrot on a cake
	![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/hWTxprkxdeu8E_E0iqV8J.png)

	a refrigerator to the left of a table
	![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/_puDnUG_xuazq6soFqfVj.png)

	a mad scientist's laboratory with strange gagets and mechanisms
	![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/qgZvxpGSwODJ9dxUxi4iA.png)

	steampunk goku
	![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/IITrYMTxNm3BApR-txYmW.png)


	a man standing on top of a table in the middle of a room full of curtains.
	![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/P-vleYAQAhHxvXYLLHBjk.png)

	## 5000 steps


	![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/QEAkOA49IHvHeLTFvhe-O.png)


	![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/LfGEMW5AWdDIf3bFFZsOD.png)


	![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/tdwAqMrA6b3zy51G6Wu1k.png)

	![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/eaoQ3iY_QIEfhwA5SK0zV.png)

	a mad scientists laboratory
	![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/xqDeCGbxWMhAfD4QV9w2B.png)

	## 4000 steps
	This run uses the synthesized image set here:
	https://huggingface.co/datasets/AbstractPhil/sd15-latent-distillation-500k

	At the time of writing, the 500k set has not finished synthesizing. It is at around 200k samples, which should be more than enough to get a baseline.


	At 4000 steps the new flow-matching trainer is already producing visible results.
	![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/_h52WVv4rgvzk2H08Jpmy.png)

	![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/n--fn2cNfsYmi7e3SqmXc.png)

	![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/XXT9NEEtYtIUrF52hJFWO.png)

	![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/lhXF0_fOUyandv_hUC3xN.png)
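
For readers new to flow matching, a minimal sketch of the training target in its common rectified-flow form: interpolate linearly between a clean latent and noise, and regress the velocity between them. This is not the trainer from this repo; the time direction (`t = 0` is data, `t = 1` is noise), the sign of the velocity, and all names are assumptions for illustration.

```python
import random


# Sketch of a rectified-flow style training pair (assumed convention:
# t = 0 is clean data, t = 1 is pure noise, velocity = noise - data).
def flow_matching_pair(x0, noise, t):
    """Return the interpolated sample x_t and the velocity target."""
    x_t = [(1.0 - t) * a + t * n for a, n in zip(x0, noise)]
    v_target = [n - a for a, n in zip(x0, noise)]
    return x_t, v_target


def mse(pred, target):
    """Mean squared error, the usual regression loss on the velocity."""
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(pred)


rng = random.Random(0)
x0 = [0.5, -1.0, 2.0]                      # stand-in for a flattened latent
noise = [rng.gauss(0.0, 1.0) for _ in x0]  # Gaussian noise of the same shape
t = 0.25
x_t, v_target = flow_matching_pair(x0, noise, t)

# A model that predicted the velocity perfectly would have zero loss.
perfect_loss = mse(v_target, v_target)
```

In a real trainer the model receives `x_t`, `t`, and the text conditioning, and its output is compared against `v_target`.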


	Within 4000 steps at batch size 16, the pretrained flow-matching SD1.5 model is already converging.
	This model was the sd15-flow-matching-try2 variation, aka Lune, and I can say for certain she is most definitely not burned.
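
To put those numbers in perspective, a quick back-of-the-envelope calculation. It assumes batch 16 is the effective batch size (no gradient accumulation, single device) and takes the "around 200k" dataset size at face value.

```python
# Rough sample count for the 4000-step checkpoint (assumptions noted above).
steps = 4000
batch_size = 16
dataset_size = 200_000  # approximate size at the time of writing

samples_seen = steps * batch_size
fraction_of_dataset = samples_seen / dataset_size
```

Under those assumptions the model has seen 64,000 samples, about a third of one pass over the data.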

	The trainer script is in the repository files.