---
license: creativeml-openrail-m
language:
- en
library_name: diffusers
pipeline_tag: text-to-image
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
inference:
  parameters:
    num_inference_steps: 50
    guidance_scale: 5.0
    eta: 1.0
widget:
- text: "a horse playing chess"
  example_title: horse + chess
- text: "a lion washing dishes"
  example_title: lion + dishes
- text: "a goat riding a bike"
  example_title: goat + bike
---
# ddpo-alignment

This model was finetuned from [Stable Diffusion v1-4](https://huggingface.co/CompVis/stable-diffusion-v1-4) using [DDPO](https://arxiv.org/abs/2305.13301) and a reward function that uses [LLaVA](https://llava-vl.github.io/) to measure prompt-image alignment. See [the project website](https://rl-diffusion.github.io/) for more details.

The model was finetuned for 200 iterations with a batch size of 256 samples per iteration. During finetuning, we used prompts of the form: "_a(n) \<animal\> \<activity\>_". The animal and activity were drawn from the lists below, so prompts built from them should give the best results. We also observed some, though limited, generalization to other prompts.

Activities:
- washing dishes
- playing chess
- riding a bike

Animals:
- cat
- dog
- horse
- monkey
- rabbit
- zebra
- spider
- bird
- sheep
- deer
- cow
- goat
- lion
- tiger
- bear
- raccoon
- fox
- wolf
- lizard
- beetle
- ant
- butterfly
- fish
- shark
- whale
- dolphin
- squirrel
- mouse
- rat
- snake
- turtle
- frog
- chicken
- duck
- goose
- bee
- pig
- turkey
- fly
- llama
- camel
- bat
- gorilla
- hedgehog
- kangaroo
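
To try prompts in the training format, the two lists above can be combined programmatically. The sketch below is illustrative only and is not the authors' code. `make_prompt`, `all_prompts`, and `generate` are hypothetical helper names, and `model_id` is a placeholder you must fill in with this model's actual repository id or a local path. The sampling values mirror the `inference.parameters` in this card's metadata. Note that in `diffusers`, `eta` only takes effect when the pipeline uses a DDIM scheduler and is ignored by other schedulers.

```python
from itertools import product

# Lists copied from this model card.
ACTIVITIES = ["washing dishes", "playing chess", "riding a bike"]
ANIMALS = [
    "cat", "dog", "horse", "monkey", "rabbit", "zebra", "spider", "bird",
    "sheep", "deer", "cow", "goat", "lion", "tiger", "bear", "raccoon",
    "fox", "wolf", "lizard", "beetle", "ant", "butterfly", "fish", "shark",
    "whale", "dolphin", "squirrel", "mouse", "rat", "snake", "turtle",
    "frog", "chicken", "duck", "goose", "bee", "pig", "turkey", "fly",
    "llama", "camel", "bat", "gorilla", "hedgehog", "kangaroo",
]


def make_prompt(animal, activity):
    """Build a prompt of the form 'a(n) <animal> <activity>'."""
    # Simple vowel check; sufficient for the animals listed above.
    article = "an" if animal[0].lower() in "aeiou" else "a"
    return f"{article} {animal} {activity}"


def all_prompts():
    """Return every animal/activity combination from the two lists."""
    return [make_prompt(a, act) for a, act in product(ANIMALS, ACTIVITIES)]


def generate(prompt, model_id, device="cuda"):
    """Sample one image for `prompt`.

    `model_id` is a placeholder (assumption): pass this model's repository id
    or a local checkpoint directory. Heavy imports are deferred so the prompt
    helpers above work without torch or diffusers installed.
    """
    import torch
    from diffusers import StableDiffusionPipeline

    # Half precision is only a sensible default on GPU.
    dtype = torch.float16 if device.startswith("cuda") else torch.float32
    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=dtype)
    pipe = pipe.to(device)

    # Values taken from the inference parameters in this card's metadata.
    result = pipe(prompt, num_inference_steps=50, guidance_scale=5.0, eta=1.0)
    return result.images[0]
```

For example, `generate(make_prompt("lion", "washing dishes"), model_id)` returns a PIL image that can be saved with `.save("lion.png")`, once `model_id` points at the checkpoint.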