---
license: openrail
datasets:
- ChristophSchuhmann/LAION-5B-EN-Aesthetics-Subset_above_6
---

Image Mixer is a model that lets you combine the concepts, styles, and compositions from multiple images (and text prompts too) and generate new images.

It was trained by [Justin Pinkney](https://www.justinpinkney.com) at [Lambda Labs](https://lambdalabs.com/).

## Training details

This model is a fine-tuned version of [Stable Diffusion Image Variations](https://huggingface.co/lambdalabs/sd-image-variations-diffusers). It has been trained to accept multiple CLIP embeddings concatenated along the sequence dimension, rather than the single embedding used in the original model. During training, up to 5 crops are taken of each training image, a CLIP embedding is extracted from each crop, and the concatenated embeddings are used as the conditioning for the model. At inference time, CLIP embeddings from multiple images can therefore be used to generate images that are influenced by multiple inputs.

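The conditioning scheme described above can be sketched as follows. This is a minimal illustration with random stand-in arrays, not the repo's actual code, and the embedding width of 768 is an assumption (the size of CLIP ViT-L/14 image embeddings):

```python
# Sketch of Image Mixer's conditioning: concatenate several CLIP
# embeddings along the sequence dimension. Stand-in random arrays;
# the width 768 is an assumption, not taken from the repo.
import numpy as np

def build_conditioning(embeddings):
    """Concatenate per-input CLIP embeddings along the sequence axis."""
    # Each embedding has shape (1, 1, 768): batch, sequence, features.
    # The original model conditions on a single embedding (sequence
    # length 1); Image Mixer concatenates several, giving length N.
    return np.concatenate(embeddings, axis=1)

# e.g. five crops of one training image, each embedded by CLIP
crops = [np.random.randn(1, 1, 768) for _ in range(5)]
cond = build_conditioning(crops)
print(cond.shape)  # (1, 5, 768)
```

The model then cross-attends over this longer conditioning sequence exactly as it would over a single embedding, which is why the same architecture can accept a variable number of inputs.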
Training was done at 640x640 on a subset of LAION improved aesthetics, using 8xA100 GPUs from [Lambda GPU Cloud](https://cloud.lambdalabs.com).

_Note: text captions were not used during training of the model. Although text embeddings work to some extent as inputs at inference time, the model is primarily designed to accept image embeddings._

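At inference time the demo exposes a strength control per input. A hypothetical sketch of such weighted mixing is below; the function name and the scaling-by-strength scheme are illustrative assumptions, not the repo's API:

```python
# Hypothetical sketch of weighting conditioning inputs before
# concatenation, in the spirit of the demo's per-input strength
# controls. Names and the scaling scheme are illustrative only.
import numpy as np

def mix_inputs(embeddings, strengths):
    """Scale each CLIP embedding by a strength, then concatenate
    along the sequence axis to form the final conditioning."""
    scaled = [e * s for e, s in zip(embeddings, strengths)]
    return np.concatenate(scaled, axis=1)

# Two image embeddings and one text embedding (random stand-ins),
# with the image inputs weighted more heavily than the text input.
image_a = np.random.randn(1, 1, 768)
image_b = np.random.randn(1, 1, 768)
text = np.random.randn(1, 1, 768)
cond = mix_inputs([image_a, image_b, text], strengths=[1.0, 1.0, 0.5])
print(cond.shape)  # (1, 3, 768)
```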
## Usage

The model can be tried out on [Hugging Face Spaces](https://huggingface.co/spaces/lambdalabs/image-mixer-demo), or run locally as follows:

```bash
git clone https://github.com/justinpinkney/stable-diffusion.git
cd stable-diffusion
git checkout 1c8a598f312e54f614d1b9675db0e66382f7e23c
python -m venv .venv --prompt sd
. .venv/bin/activate
pip install -U pip
pip install -r requirements.txt
python scripts/gradio_image_mixer.py
```

Then navigate to the Gradio demo link printed in the terminal.

For details on how to use the model outside the app, refer to the [`run` function](https://github.com/justinpinkney/stable-diffusion/blob/c1963a36a4f8ce23784c8247fa1af0e34e02b766/scripts/gradio_image_mixer.py#L79) in `gradio_image_mixer.py` in the [original repo](https://github.com/justinpinkney/stable-diffusion#image-mixer).