---
license: apache-2.0
tags:
- anime
- diffusion
- text-to-image
- image-generation
library_name: diffusers
pipeline_tag: text-to-image
language:
- en
---

# Aniimage-1

Aniimage-1 is the first latent diffusion model developed by 8BitStudio.
The model generates 256×256 anime images and was trained from scratch using a UNet + VAE + CLIP architecture.
Aniimage-1 was trained on 830,001 anime images from [Danbooru](https://danbooru.donmai.us/). It is not based on any existing model; the UNet was trained from scratch.

## Model Details

| | |
|---|---|
| **Resolution** | 256×256 |
| **Architecture** | Latent Diffusion (UNet + VAE + CLIP) |
| **Parameters** | ~400M |
| **Training Steps** | 88,000 |
| **Batch Size** | 64 |
| **Dataset** | ~830K curated anime images from Danbooru |
| **GPU** | NVIDIA RTX 5060 Ti 16GB |
| **Scheduler** | DDIM or DPM++ 2M |
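
The training figures in the table imply roughly how many times the model saw the dataset. The short calculation below is only a sketch: it uses the step count, batch size, and image count stated in this card, and it assumes every step consumed one full batch with no gradient accumulation, which the card does not confirm.

```python
# Figures taken from this model card.
TRAINING_STEPS = 88_000
BATCH_SIZE = 64
DATASET_IMAGES = 830_001

# Assumption: every step processed one full batch (no gradient accumulation).
samples_seen = TRAINING_STEPS * BATCH_SIZE
approx_epochs = samples_seen / DATASET_IMAGES

print(f"samples seen: {samples_seen:,}")
print(f"approximate epochs: {approx_epochs:.2f}")
```

Under that assumption, the model saw about 5.6 million samples, or a little under seven passes over the dataset.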

## Requirements

- **GPU**: ~3.4 GB VRAM minimum (4+ GB recommended)
- **CPU**: ~2 GB RAM. Image generation is extremely slow on CPU.

## Quick Start

[Download generate_hf.py](https://huggingface.co/8BitStudio/Aniimage-1/resolve/main/generate_hf.py)

After downloading the script, install the dependencies and run it:

```bash
pip install torch torchvision diffusers transformers safetensors pillow huggingface_hub
python generate_hf.py
```

Recommended settings: DPM++ 2M scheduler, 25 steps, and a CFG scale of 7.5.

Recommended negative prompt: "low quality, ugly, blurry, distorted, deformed, bad anatomy, bad proportions, extra limbs, missing limbs, watermark, text, signature, washed out, flat colors, manga panel, disfigured, poorly drawn, jpeg artifacts, cropped, out of frame"
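
For readers who call a diffusers pipeline from their own code rather than through `generate_hf.py`, the recommended settings can be collected into keyword arguments. This is a minimal sketch: the helper name `build_generation_kwargs` is hypothetical, and it assumes the pipeline accepts the standard Stable Diffusion style call arguments (`prompt`, `negative_prompt`, `num_inference_steps`, `guidance_scale`, `height`, `width`). Whether this repository loads directly through a stock diffusers pipeline is not stated in this card. In diffusers, DPM++ 2M corresponds to `DPMSolverMultistepScheduler`, which would be set on the pipeline separately.

```python
# Recommended values from this model card.
RECOMMENDED_STEPS = 25
RECOMMENDED_CFG = 7.5
RESOLUTION = 256
NEGATIVE_PROMPT = (
    "low quality, ugly, blurry, distorted, deformed, bad anatomy, "
    "bad proportions, extra limbs, missing limbs, watermark, text, "
    "signature, washed out, flat colors, manga panel, disfigured, "
    "poorly drawn, jpeg artifacts, cropped, out of frame"
)


def build_generation_kwargs(prompt: str) -> dict:
    """Hypothetical helper: bundle the recommended settings for one prompt.

    Assumes the target pipeline uses the standard diffusers argument names.
    """
    return {
        "prompt": prompt,
        "negative_prompt": NEGATIVE_PROMPT,
        "num_inference_steps": RECOMMENDED_STEPS,
        "guidance_scale": RECOMMENDED_CFG,
        "height": RESOLUTION,
        "width": RESOLUTION,
    }


kwargs = build_generation_kwargs(
    "A smiling anime girl with red hair and a school uniform"
)
print(kwargs["num_inference_steps"], kwargs["guidance_scale"])
```

With an already loaded pipeline object named `pipe` (an assumption, not something this card provides), the call would typically look like `pipe(**kwargs).images[0]`.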

## Prompting

Aniimage-1 uses plain-text captions, so write prompts in plain English for the best results.

- Do: "A smiling anime girl with red hair and a school uniform"
- Don't: "1girl, solo, smile, red_hair, school_uniform, anime_coloring"
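
Tag-style prompts are easy to paste in by habit, so a quick check before generation can help. The function below is a rough, hypothetical heuristic and is not part of Aniimage-1 or `generate_hf.py`: it flags prompts that contain underscores or that consist only of short comma-separated fragments.

```python
def looks_like_tag_list(prompt: str) -> bool:
    """Heuristic guess at whether a prompt is written as Danbooru-style tags."""
    fragments = [part.strip() for part in prompt.split(",") if part.strip()]
    # Underscored tokens such as "red_hair" are typical of tag vocabularies.
    if any("_" in fragment for fragment in fragments):
        return True
    # Three or more fragments of at most two words each also read as tags.
    all_short = all(len(fragment.split()) <= 2 for fragment in fragments)
    return len(fragments) >= 3 and all_short


good = "A smiling anime girl with red hair and a school uniform"
bad = "1girl, solo, smile, red_hair, school_uniform, anime_coloring"

print(looks_like_tag_list(good))
print(looks_like_tag_list(bad))
```

The heuristic is deliberately simple and can misjudge short comma-separated English phrases, so treat a positive result as a prompt to reword rather than a hard error.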

## Capabilities

- Anime character generation with varied hair colors and styles
- School uniforms, fantasy outfits, maid dresses, and more
- Background scenes: cherry blossoms, night sky, interiors, nature

## Limitations

- 256×256 resolution: fine details such as hands and small features can be rough
- Faces can sometimes look similar or "melty" across different prompts
- Complex multi-character scenes may show characters merging into each other
- Little to no NSFW content: the training dataset was mostly SFW
- Male characters come out worse because of dataset bias

## What's Next

**Aniimage-1.5**, a 512×512 fine-tune of this model, is currently in development and is expected to significantly improve detail and clarity.
The training code may be released on GitHub at some point.

## License

Apache 2.0