license: creativeml-openrail-m

PixelNet (Thomas Eding)

About:

PixelNet is a ControlNet model for Stable Diffusion. It takes a checkerboard image as input, which is used to control where logical pixels are to be placed.

This is currently an experimental proof of concept. I trained it on around 2000 pixel-art/pixelated images that I generated using Stable Diffusion (with a lot of cleanup and manual curation). The model is not very good, but it does work on grid sizes of up to about 64 checker "pixels" per side for square generations. I also found that a 128x64 pattern still seemed to work moderately well for a 1024x512 image.
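To make the grid arithmetic concrete: a 128x64 pattern on a 1024x512 image means each checker "pixel" covers 8x8 image pixels. The small sketch below computes that; `cell_size` is a hypothetical helper (not part of this repository), and the requirement that the image size be a whole multiple of the grid size is an assumption of the sketch.

```python
def cell_size(image_w, image_h, grid_w, grid_h):
    """Return (width, height) in image pixels of one checker cell.

    Assumption of this sketch: the image dimensions are whole
    multiples of the grid dimensions, so every cell has the same size.
    """
    if image_w % grid_w or image_h % grid_h:
        raise ValueError("image size is not a whole multiple of the grid size")
    return image_w // grid_w, image_h // grid_h


# The 128x64 pattern on a 1024x512 image mentioned above.
print(cell_size(1024, 512, 128, 64))  # (8, 8)
```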

The model only works with the "Balanced" ControlNet setting. The ControlNet sliders do not appear to impact the generation.

Usage:

To install, copy the .safetensors and .yaml files to your Automatic1111 ControlNet extension's model directory (e.g. sd-webui-controlnet/models).

There is no preprocessor. Instead, supply a black and white checkerboard image as the control input. Examples are in the example-control-images directory of this repository.

The script gen_checker.py can be used to generate checkerboard images of arbitrary sizes.
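For readers who want to see what such a control image consists of, here is a minimal standard-library sketch that writes a black and white checkerboard as a grayscale PNG. It is not the repository's gen_checker.py and may differ from it; the function names, the choice of black for the top-left cell, and the grayscale encoding are all assumptions made for illustration.

```python
import struct
import zlib


def checker_rows(grid_w, grid_h, cell):
    """Build 8-bit grayscale rows for a grid_w x grid_h checkerboard.

    Each checker cell is cell x cell image pixels. Cells alternate
    black (0) and white (255); starting with black at the top-left
    is an assumption of this sketch.
    """
    rows = []
    for gy in range(grid_h):
        row = bytearray()
        for gx in range(grid_w):
            value = 0 if (gx + gy) % 2 == 0 else 255
            row.extend([value] * cell)
        # Repeat the finished row so the cell is `cell` pixels tall.
        rows.extend([bytes(row)] * cell)
    return rows


def png_chunk(tag, data):
    """Encode one PNG chunk: length, tag, data, CRC-32 of tag + data."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)


def write_gray_png(path, rows):
    """Write rows of 8-bit grayscale bytes as a non-interlaced PNG."""
    height, width = len(rows), len(rows[0])
    # IHDR: width, height, bit depth 8, color type 0 (grayscale),
    # compression 0, filter 0, interlace 0.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Every scanline is prefixed with filter type 0 (None).
    raw = b"".join(b"\x00" + row for row in rows)
    with open(path, "wb") as f:
        f.write(b"\x89PNG\r\n\x1a\n")
        f.write(png_chunk(b"IHDR", ihdr))
        f.write(png_chunk(b"IDAT", zlib.compress(raw)))
        f.write(png_chunk(b"IEND", b""))
```

For example, `write_gray_png("checker_64x64.png", checker_rows(64, 64, 8))` would write a 512x512 image containing a 64x64 grid of 8-pixel cells; the file name and sizes here are only illustrative.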

Example images: grid5x5, grid16x16

FAQ:

Q: Why is this needed? Can't I use a post-processor to downscale the image?
A: In my experience, SD has a hard time creating genuine pixel art (even with dedicated base models and LoRAs): it produces mismatched pixel sizes, smooth curves, etc. What appears to be a straight line at a glance might bend around. This can cause post-processors to create artifacts when quantization rounds a pixel to a position one pixel off in some direction. This model is intended to fix that.

Q: Will there be a better trained model of this in the future?
A: I hope so. I will need to curate a much larger and higher-quality dataset, which might take me a long time. Regardless, I plan to make the model more faithful to the control image and to generalize it to more than just checkerboards.

Sample Outputs:

Images: sample1, sample2, sample3