|
|
---
base_model:
- stable-diffusion-v1-5/stable-diffusion-v1-5
datasets:
- opendiffusionai/cc12m-small-squarish-simple
---
|
|
|
|
|
# What is this?

This is an initial version of the Stable Diffusion 1.5 base model, with its noise scheduler/prediction objective
replaced with FlowMatchEulerDiscrete.
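For context, here is a rough sketch of the flow-matching training target that this kind of swap implies, versus stock SD 1.5's epsilon prediction. This illustrates the general rectified-flow formulation that FlowMatchEulerDiscrete samples against; it is not the author's actual training code, and the tensor shapes are just examples.

```python
import torch

# example latents: clean image latents x0 and gaussian noise of the same shape
x0 = torch.randn(1, 4, 64, 64)
noise = torch.randn_like(x0)
t = torch.rand(1)  # "time" drawn uniformly from [0, 1]

# rectified-flow interpolation: a straight line from data to noise
xt = (1 - t) * x0 + t * noise

# the UNet is trained to predict this velocity from (xt, t)
target = noise - x0

# stock SD 1.5 instead DDPM-noises x0 and trains the UNet to predict `noise`
```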
|
|
|
|
|
This model probably has a bunch of low-quality stuff in it. The base SD model might give better output in many regards.
|
|
The reason this model exists is to allow other people to take advantage of FlowMatch for their own finetunes |
|
|
and other experiments. |
|
|
|
|
|
For that reason, this is a FULL FP32 precision model. (Note that the sample code below loads it as bf16.)
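If you want the full-precision weights (say, as a starting point for your own finetune), loading in fp32 is just a matter of the dtype argument. A minimal sketch (this assumes the diffusers install described in the Usage note below):

```python
import torch
from diffusers import DiffusionPipeline

# load the full fp32 weights instead of downcasting to bf16
pipe = DiffusionPipeline.from_pretrained(
    "opendiffusionai/sd-flow-alpha",
    use_safetensors=True,
    torch_dtype=torch.float32,
)
```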
|
|
|
|
|
# Usage note |
|
|
|
|
|
The original diffusers module for stable_diffusion has a hardcoded check that stops this model from working. I have
submitted a patch that was accepted, but as far as I know it has not been included in an official release yet, so
diffusers 0.34.0 won't work with it.

That means that to use this model, you currently need to either use my tweaked code, [imgsample-hacked.py](imgsample-hacked.py),
or install the main git version of diffusers, e.g.:
|
|
|
|
|
```bash
pip install git+https://github.com/huggingface/diffusers
```
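To confirm you actually picked up the git build rather than the 0.34.0 release, a quick check (the exact dev version string will vary):

```python
import diffusers

# a git install reports a dev version newer than the 0.34.0 release
print(diffusers.__version__)
```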
|
|
|
|
|
|
|
|
You should then be able to run typical diffusers code. For example:
|
|
|
|
|
```python
from diffusers import DiffusionPipeline
import torch

MODEL = "opendiffusionai/sd-flow-alpha"
OUTDIR = "."

pipe = DiffusionPipeline.from_pretrained(
    MODEL, use_safetensors=True,
    safety_checker=None, requires_safety_checker=False,
    torch_dtype=torch.bfloat16,
)
# keep VRAM usage down by offloading submodules to CPU between uses
pipe.enable_sequential_cpu_offload()

prompt = "Some pretty photo of something"
generator = torch.Generator("cpu").manual_seed(42)
images = pipe(prompt, num_inference_steps=30, generator=generator).images
for i, image in enumerate(images):
    fname = f"{OUTDIR}/sample{i}.png"
    print(f"saving to {fname}")
    image.save(fname)
```
|
|
|
|
|
## ComfyUI note |
|
|
|
|
|
From the author: |
|
|
|
|
|
>It works fine in comfy, just load the unet with the load diffusion model node and hook it to a |
|
|
> ModelSamplingSD3 node. |
|
|
> |
|
|
>For the clip/vae you can just use the one from the SD1.5 checkpoint.
|
|
|
|
|
|
|
|
# Making your own FlowMatch model |
|
|
|
|
|
Doing the training itself did not take that long.
Writing [my own functional training code](https://github.com/ppbrown/ai-training), and trying various pathways to find out what works, took WEEKS.
|
|
|
|
|
That, and putting together a clean 40k-image
[ALL-SQUARE IMAGE DATASET](https://huggingface.co/datasets/opendiffusionai/cc12m-small-squarish-simple).
|
|
|
|
|
If you wanted to recreate your own from scratch, here are the details from one of my runs.
(This only takes a few hours to complete on a 4090.)
|
|
|
|
|
First, download the SD base model in diffusers format, then hand-edit the [model_config.json](model_config.json) and
[scheduler/scheduler_config.json](scheduler/scheduler_config.json) files.
(I was going to detail the changes here, but... just copy/look at the files in this repo. I linked them, after all!)
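If you would rather generate the scheduler config programmatically than hand-edit it, something like the following should work. This is a hypothetical sketch: `sd15-local` is a placeholder for your local diffusers-format checkout, and the scheduler defaults here may not match the exact values in this repo's [scheduler/scheduler_config.json](scheduler/scheduler_config.json), so prefer copying that file.

```python
from diffusers import FlowMatchEulerDiscreteScheduler

# instantiate a flow-match scheduler with default settings and write its
# scheduler_config.json over the stock SD1.5 one
sched = FlowMatchEulerDiscreteScheduler()
sched.save_pretrained("sd15-local/scheduler")
```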
|
|
|
|
|
(Batch size 40, accum=1 for all phases. A sketch of how the per-phase unfreezing might look follows the list.)
|
|
|
|
|
* time blocks only, LR 1e-5, 350 steps (results are very murky here; that's expected)
* up.0 and up.1, LR 1e-6, 75 steps
* mid, LR 1e-6, 60 steps
* up.2, LR 1e-6, 160 steps
* up.3, LR 1e-6, 120 steps
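For reference, here is a hypothetical sketch of what that phased unfreezing could look like with a diffusers UNet2DConditionModel. The parameter-name prefixes (`time_embedding.`, `up_blocks.0.`, `mid_block.`, ...) follow diffusers' naming, `sd15-local` is a placeholder path, and the training loop itself is elided; this is not the author's actual code.

```python
import torch
from diffusers import UNet2DConditionModel

# load the UNet from your local diffusers-format checkout
unet = UNet2DConditionModel.from_pretrained("sd15-local", subfolder="unet")

def train_only(prefixes):
    """Freeze everything except parameters whose names start with one of `prefixes`."""
    for name, param in unet.named_parameters():
        param.requires_grad = any(name.startswith(p) for p in prefixes)

# phase 1: time blocks only, LR 1e-5, 350 steps
train_only(["time_embedding."])
opt = torch.optim.AdamW(
    (p for p in unet.parameters() if p.requires_grad), lr=1e-5
)
# ... run 350 training steps here, then repeat the pattern for the later phases:
#   train_only(["up_blocks.0.", "up_blocks.1."])  # LR 1e-6, 75 steps
#   train_only(["mid_block."])                    # LR 1e-6, 60 steps
#   train_only(["up_blocks.2."])                  # LR 1e-6, 160 steps
#   train_only(["up_blocks.3."])                  # LR 1e-6, 120 steps
```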
|
|
|
|
|
## Sampling |
|
|
During the first phase, maybe sample every 50 steps. |
|
|
After the first phase, you'll want to take samples every 10 steps. Make sure you take MULTIPLE samples,
ideally of different types: you should have at least one "single token" prompt, and then a few more
complex ones.
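As a concrete illustration, a sampling hook along these lines could be dropped into a training loop. This is a hypothetical sketch: `pipe` would be a pipeline wrapping the in-training UNet, and the prompts are just examples of the single-token-plus-complex mix described above.

```python
# one single-token sanity-check prompt plus a couple of more complex ones
SAMPLE_PROMPTS = [
    "dog",
    "a woman in a red coat walking through a snowy park",
    "macro photo of a dew-covered spider web at dawn",
]

def maybe_sample(pipe, step, outdir, sample_every=10):
    """Generate and save one image per sample prompt every `sample_every` steps."""
    if step % sample_every != 0:
        return
    for i, prompt in enumerate(SAMPLE_PROMPTS):
        image = pipe(prompt, num_inference_steps=30).images[0]
        image.save(f"{outdir}/step{step:06d}_prompt{i}.png")
```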
|
|
|
|
|
|
|
|
 |