---
base_model:
- stable-diffusion-v1-5/stable-diffusion-v1-5
datasets:
- opendiffusionai/cc12m-small-squarish-simple
---

# What is this?

This is an initial version of the Stable Diffusion 1.5 base model, with its noise
scheduler/prediction replaced with FlowMatchEulerDiscrete.

This model probably has a bunch of low-quality stuff in it. The base SD model might give better
output in many regards. The reason this model exists is to allow other people to take advantage
of FlowMatch for their own finetunes and other experiments.

For that reason, this is a FULL FP32-precision model. (The sample code below, however, loads it
as bf16.)

# Usage note

The original diffusers module for stable_diffusion has a hardcoded check that stops this from
working. I have submitted a patch that was accepted, but as far as I know it has not been added
to an official release yet, so diffusers 0.34.0 won't work with this model. That means that to
use it, you currently need to either use my tweaked code,
[imgsample-hacked.py](imgsample-hacked.py), or install the current git version of diffusers,
e.g.:

```bash
pip install git+https://github.com/huggingface/diffusers
```

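(If you're not sure which version you have installed, a quick check:)

```python
import diffusers
# The 0.34.0 release (and earlier) lacks the patch, so this should
# report something newer, e.g. a dev version installed from git.
print(diffusers.__version__)
```
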
You should then be able to use typical diffusers code. For example:

```python
from diffusers import DiffusionPipeline
import torch

MODEL = "opendiffusionai/sd-flow-alpha"
OUTDIR = "."  # where to write the samples

pipe = DiffusionPipeline.from_pretrained(
    MODEL, use_safetensors=True,
    safety_checker=None, requires_safety_checker=False,
    torch_dtype=torch.bfloat16,
)
# Offload idle submodules to CPU to keep VRAM use low.
pipe.enable_sequential_cpu_offload()

prompt = "Some pretty photo of something"
generator = torch.Generator("cpu").manual_seed(0)
images = pipe(prompt, num_inference_steps=30, generator=generator).images
for i, image in enumerate(images):
    fname = f"{OUTDIR}/sample{i}.png"
    print(f"saving to {fname}")
    image.save(fname)
```

## ComfyUI note

From the author:

> It works fine in comfy, just load the unet with the load diffusion model node and hook it to a
> ModelSamplingSD3 node.
>
> For the clip/vae you can just use the one from the SD1.5 checkpoint.

# Making your own FlowMatch model

Doing the training itself did not take that long. Writing
[my own functional training code](https://ppbrown@github.com/ppbrown/ai-training),
and trying various pathways to find what works, took WEEKS.

That, and putting together a 40k-image clean
[ALL-SQUARE IMAGE DATASET](https://huggingface.co/datasets/opendiffusionai/cc12m-small-squarish-simple).

If you wanted to recreate your own from scratch, here are the details from one of my runs.
(This only takes a few hours to complete on a 4090.)

First, download the SD base model in diffusers format, and hand-edit the
[model_config.json](model_config.json) and
[scheduler/scheduler_config.json](scheduler/scheduler_config.json) files.
(I was going to detail it here, but... just copy/look at the files in this repo. I linked them,
after all!)

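If you'd rather not hand-edit, here's a minimal sketch of the equivalent programmatic swap. It
assumes the git version of diffusers mentioned above; the output directory name is just an
example, and this only mirrors the scheduler part of the edit, so compare against the actual
files in this repo:

```python
from diffusers import StableDiffusionPipeline, FlowMatchEulerDiscreteScheduler
import torch

# Load the stock SD1.5 base in diffusers format...
pipe = StableDiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",
    torch_dtype=torch.float32,  # keep full fp32 for training
)
# ...swap in the FlowMatch scheduler (defaults here; tune via kwargs)...
pipe.scheduler = FlowMatchEulerDiscreteScheduler()
# ...and save everything back out, which writes the new scheduler_config.json.
pipe.save_pretrained("sd15-flowmatch-start")  # output dir is an example
```
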
|
(Batch size 40, accum=1 for all phases. A sketch of how these phases might be wired up in code
follows the list.)

* time blocks only, 1e-5, 350 steps (result is very murky here; that's expected)
* up.0 and up.1, 1e-6, 75 steps
* mid, 1e-6, 60 steps
* up.2, 1e-6, 160 steps
* up.3, 1e-6, 120 steps

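As a rough illustration (NOT my exact trainer): the name patterns below assume diffusers'
UNet2DConditionModel parameter naming, and "time blocks" is interpreted here as the
time-embedding layers.

```python
# Illustrative phase schedule: (name patterns, learning rate, steps),
# mirroring the run listed above.
PHASES = [
    (["time_embedding", "time_emb_proj"], 1e-5, 350),
    (["up_blocks.0", "up_blocks.1"],      1e-6,  75),
    (["mid_block"],                       1e-6,  60),
    (["up_blocks.2"],                     1e-6, 160),
    (["up_blocks.3"],                     1e-6, 120),
]

def select_trainable(unet, patterns):
    # Freeze everything, then re-enable only the matching sub-blocks.
    for name, param in unet.named_parameters():
        param.requires_grad = any(pat in name for pat in patterns)
    return [p for p in unet.parameters() if p.requires_grad]

# e.g. for one phase:
#   params = select_trainable(pipe.unet, ["up_blocks.0", "up_blocks.1"])
#   optimizer = torch.optim.AdamW(params, lr=1e-6)
```
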
|
## Sampling

During the first phase, sampling maybe every 50 steps is enough.
After the first phase, you'll want to take samples every 10 steps. Make sure you use MULTIPLE
samples, and ideally of different types. You should have at least one "single token" prompt, and
then a few more complex ones.

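Something like this sampling hook, for illustration (the prompts and names are placeholders, not
from my training code):

```python
# Render a fixed prompt set every N steps so checkpoints can be
# compared side by side.
SAMPLE_PROMPTS = [
    "cat",                                           # single-token prompt
    "a red bicycle leaning against a brick wall",    # more complex
    "portrait photo of a fisherman at golden hour",  # more complex
]

def maybe_sample(pipe, step, every=10, outdir="samples"):
    if step % every != 0:
        return
    for i, prompt in enumerate(SAMPLE_PROMPTS):
        image = pipe(prompt, num_inference_steps=20).images[0]
        image.save(f"{outdir}/step{step:06d}_prompt{i}.png")
```
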
|