# makeanime

<a href="https://colab.research.google.com/gist/ariG23498/645f0f276612a60fb32ad2b387e0d301/scratchpad.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a>

`makeanime` is a CLI tool for generating anime-style images from a face image and a text prompt using Stable Diffusion XL and an IP-Adapter for applying anime-like styles.
## Features

- Generates anime-style images by blending a face image with anime style references.
- Leverages Stable Diffusion XL for high-quality text-to-image generation.
- Uses an IP-Adapter to combine face and anime-style attributes.
- Supports custom prompt input for greater flexibility.
- Allows control over the influence of face and style using weights.
## Installation

```shell
$ pip install -Uq git+https://github.com/ariG23498/makeanime
```
## Usage

You can use the `makeanime` CLI to generate images. The tool accepts the following arguments:

- `image`: URL of the face image to be stylized.
- `prompt`: Text prompt to guide the image generation.
- `style_weight`: (Optional) A float that controls how much the anime style influences the image. Default is `0.5`.
- `face_weight`: (Optional) A float that controls how much the face image influences the result. Default is `0.5`.
### Example Command

```bash
$ makeanime \
    --image "https://example.com/your-face-image.jpg" \
    --prompt "a man" \
    --style_weight 0.7 \
    --face_weight 0.3
```

This command generates an anime-style image from the provided face image URL and prompt, and saves the result as `output.png` in the working directory.
## File Structure

- `makeanime/app.py`: Contains the main logic for generating anime-style images.
- `makeanime/__main__.py`: Sets up the CLI using the Python Fire library (`fire`).
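
To illustrate how a Fire-based entry point maps function parameters to command-line flags, here is a minimal sketch. It is not the actual contents of `makeanime/__main__.py`: `generate_stub` and `main` are hypothetical names, and the stub only echoes its arguments instead of running a model.

```python
def generate_stub(image: str, prompt: str,
                  style_weight: float = 0.5, face_weight: float = 0.5) -> str:
    """Hypothetical stand-in for the generation logic in makeanime/app.py."""
    # A real implementation would run the diffusion pipeline here;
    # this stub only reports the arguments it received.
    return (f"image={image} prompt={prompt!r} "
            f"style_weight={style_weight} face_weight={face_weight}")


def main(argv=None) -> None:
    """Expose generate_stub as a CLI; each parameter becomes a --flag."""
    import fire  # third-party dependency: pip install fire

    # With command=None, Fire parses sys.argv; a list of strings can be
    # passed instead, which is handy for trying the CLI from Python.
    fire.Fire(generate_stub, command=argv)

# A package's __main__.py would end by calling main().
```

With this layout, `main(["--image", "https://example.com/face.jpg", "--prompt", "a man"])` behaves like the equivalent shell invocation, and parameters with defaults (`style_weight`, `face_weight`) become optional flags.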
## How it Works

- **CLIPVisionModelWithProjection** is used to encode the input face image.
- **Stable Diffusion XL** is used for generating images based on the text prompt and the encoded face.
- An **IP-Adapter** is loaded to modulate the anime style and face weights.
- Images are generated at 1024x1024 resolution, and the output is a grid of the original face image and the generated anime image.
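
The steps above can be sketched roughly as follows, based on the multi-adapter example in the Hugging Face IP-Adapter guide. This is an approximation under stated assumptions, not the code in `makeanime/app.py`: the model IDs and adapter weight files come from that guide, and `adapter_scales`, `generate`, and the `style_urls` parameter are hypothetical names. Running `generate` downloads large model weights and realistically needs a GPU.

```python
# Adapter weight files named in the Hugging Face IP-Adapter guide
# (assumed here; not confirmed to be what makeanime loads).
STYLE_ADAPTER = "ip-adapter-plus_sdxl_vit-h.safetensors"
FACE_ADAPTER = "ip-adapter-plus-face_sdxl_vit-h.safetensors"


def adapter_scales(style_weight: float = 0.5, face_weight: float = 0.5) -> list[float]:
    """Return scales in the order the adapters are loaded: style, then face."""
    return [style_weight, face_weight]


def generate(face_url, style_urls, prompt, style_weight=0.5, face_weight=0.5):
    """Hypothetical end-to-end sketch returning a face/result grid image."""
    # Heavy imports stay local so the helper above works without them.
    import torch
    from diffusers import AutoPipelineForText2Image
    from diffusers.utils import load_image, make_image_grid
    from transformers import CLIPVisionModelWithProjection

    # CLIP vision encoder that embeds the reference images for the adapters.
    image_encoder = CLIPVisionModelWithProjection.from_pretrained(
        "h94/IP-Adapter", subfolder="models/image_encoder", torch_dtype=torch.float16
    )
    pipeline = AutoPipelineForText2Image.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        torch_dtype=torch.float16,
        image_encoder=image_encoder,
    )
    # Load two adapters at once: one for style, one for the face.
    pipeline.load_ip_adapter(
        "h94/IP-Adapter",
        subfolder="sdxl_models",
        weight_name=[STYLE_ADAPTER, FACE_ADAPTER],
    )
    pipeline.set_ip_adapter_scale(adapter_scales(style_weight, face_weight))
    pipeline.enable_model_cpu_offload()

    face = load_image(face_url)
    styles = [load_image(url) for url in style_urls]

    result = pipeline(
        prompt=prompt,
        ip_adapter_image=[styles, face],  # same order as weight_name
        height=1024,
        width=1024,
    ).images[0]

    # Side-by-side grid: original face on the left, generated image on the right.
    return make_image_grid([face, result], rows=1, cols=2, resize=1024)
```

Under these assumptions, `generate(face_url, style_urls, "a man", 0.7, 0.3).save("output.png")` would mirror the example command: a higher `style_weight` pushes the result toward the style references, while a higher `face_weight` keeps it closer to the input face.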
## Requirements

- Python 3.10+
- PyTorch
- Diffusers
- Transformers
- Fire
## References

The code is taken from the [Hugging Face IP-Adapter Guide](https://huggingface.co/docs/diffusers/main/en/using-diffusers/ip_adapter).