ddpo-compressibility

This model was finetuned from Stable Diffusion v1-4 using DDPO (denoising diffusion policy optimization) with a reward function that favors JPEG-compressible images, i.e. images that encode to small JPEG files. See the project website for more details.
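As a rough illustration of what a compressibility reward can look like, the sketch below scores an image by the negative size of its encoded form, so smaller encodings earn higher reward. This card does not spell out the exact reward, so the function name `compressibility_reward`, the kilobyte scaling, and the injectable `encode` parameter are all assumptions made for illustration. To stay dependency-free, `zlib` stands in for the JPEG encoder here; it is a lossless general-purpose compressor, not JPEG.

```python
import os
import zlib
from typing import Callable


def compressibility_reward(
    pixels: bytes,
    encode: Callable[[bytes], bytes] = zlib.compress,
) -> float:
    """Return the negative encoded size in kilobytes (higher = more compressible).

    Hypothetical helper: `encode` defaults to zlib as a stand-in. A JPEG
    version would replace it with a function that writes the image to an
    in-memory buffer using an image library and returns the resulting bytes.
    """
    return -len(encode(pixels)) / 1000.0


# Two stand-in 64x64 RGB "images" given as raw bytes.
flat = bytes(64 * 64 * 3)        # all zeros: highly compressible
noisy = os.urandom(64 * 64 * 3)  # random noise: essentially incompressible

flat_reward = compressibility_reward(flat)
noisy_reward = compressibility_reward(noisy)
```

Under this scoring the flat image receives a reward close to zero and the noisy one a strongly negative reward, which is the direction of pressure a compressibility objective applies during finetuning.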

The model was finetuned for 60 iterations with a batch size of 256 samples per iteration. During finetuning, it was prompted with all of the animal classes in ImageNet-1000 (the first 398 categories), but it exhibits some generalization to other prompts.
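The training scale stated above works out to 60 × 256 = 15,360 sampled images in total. The sketch below reproduces that arithmetic and shows one way a batch of prompts could be drawn from the 398 animal classes. Uniform sampling with replacement and the helper name `sample_prompt_indices` are assumptions; the card does not say how prompts were selected within each iteration.

```python
import random
from typing import List

ITERATIONS = 60           # from the model card
SAMPLES_PER_ITER = 256    # from the model card
NUM_ANIMAL_CLASSES = 398  # ImageNet-1000 class indices 0..397

# Total number of images sampled over the whole finetuning run.
total_samples = ITERATIONS * SAMPLES_PER_ITER


def sample_prompt_indices(n: int, rng: random.Random) -> List[int]:
    """Draw n class indices uniformly, with replacement, from the animal classes.

    Hypothetical helper: the actual prompt schedule is not given in the card.
    """
    return [rng.randrange(NUM_ANIMAL_CLASSES) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
batch = sample_prompt_indices(SAMPLES_PER_ITER, rng)
```

Each index would then be mapped to its ImageNet class name to form the text prompt for one sample.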


Paper for kvablack/ddpo-compressibility

Training Diffusion Models with Reinforcement Learning (arXiv:2305.13301, published May 22, 2023)