	---
	license: apache-2.0
	tags:
	- anime
	- diffusion
	- text-to-image
	- image-generation
	library_name: diffusers
	pipeline_tag: text-to-image
	language:
	- en
	---

	![Aniimage-1 Samples](collage.png)

	# Aniimage-1

	Aniimage-1 is the first latent diffusion model developed by 8BitStudio.
	It is a 256×256 anime image generation model built on a UNet + VAE + CLIP architecture.
	Aniimage-1 was trained on 830,001 anime images from [Danbooru](https://danbooru.donmai.us/). It is not based on any existing model; the UNet was trained from scratch.

	## Model Details

	| | |
	|---|---|
	| Resolution | 256×256 |
	| Architecture | Latent Diffusion (UNet + VAE + CLIP) |
	| Parameters | ~400M |
	| Training Steps | 88,000 |
	| Batch Size | 64 |
	| Dataset | ~830K curated anime images from Danbooru |
	| GPU | NVIDIA RTX 5060 Ti 16GB |
	| Scheduler | DDIM or DPM++ 2M |
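
As a rough sanity check on the figures in the table, the number of passes over the dataset can be estimated from the step count and batch size. The sketch below assumes each training step consumes exactly one batch of 64 images (no gradient accumulation) and that all 830,001 images were used; neither is confirmed by this card.

```python
# Rough estimate of dataset passes (epochs) from the figures in this card.
# Assumptions (not confirmed by the card): one batch per step, no gradient
# accumulation, and the full 830,001-image dataset used for training.
TRAINING_STEPS = 88_000
BATCH_SIZE = 64
DATASET_SIZE = 830_001

samples_seen = TRAINING_STEPS * BATCH_SIZE
approx_epochs = samples_seen / DATASET_SIZE

print(f"Samples seen: {samples_seen:,}")
print(f"Approximate epochs: {approx_epochs:.2f}")
```

Under those assumptions the model saw about 5.6 million samples, or a little under seven passes over the dataset.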

	## Requirements

	- GPU: ~3.4 GB VRAM minimum (4 GB or more recommended)
	- CPU: ~2 GB RAM. Image generation is extremely slow on CPU.
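
One way to apply these requirements is to check the visible GPU before generating and fall back to CPU otherwise. This is a minimal sketch: `pick_device` is a hypothetical helper, not part of `generate_hf.py`, and it compares total (not free) GPU memory against the ~3.4 GB figure above.

```python
# Minimal device-selection sketch. MIN_VRAM_GB comes from this card;
# pick_device is a hypothetical helper, not part of generate_hf.py.
MIN_VRAM_GB = 3.4


def pick_device(min_vram_gb: float = MIN_VRAM_GB) -> str:
    """Return "cuda" if a GPU with enough total memory is visible, else "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed, so no GPU path is available
    if not torch.cuda.is_available():
        return "cpu"
    # total_memory is reported in bytes; convert to GiB for the comparison
    total_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
    return "cuda" if total_gb >= min_vram_gb else "cpu"


print(pick_device())
```

Because other processes may already be using GPU memory, a device that passes this check can still run out of memory at generation time.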

	## Quick Start

	[![Download Generator](https://img.shields.io/badge/Download-generate__hf.py-blue?style=for-the-badge)](https://huggingface.co/8BitStudio/Aniimage-1/resolve/main/generate_hf.py)

	After downloading the script, install the dependencies and run it:

	```bash
	pip install torch torchvision diffusers transformers safetensors pillow huggingface_hub
	python generate_hf.py
	```

	Recommended settings: DPM++ 2M scheduler with 25 steps and a CFG scale of 7.5.

	Recommended negative prompt: "low quality, ugly, blurry, distorted, deformed, bad anatomy, bad proportions, extra limbs, missing limbs, watermark, text, signature, washed out, flat colors, manga panel, disfigured, poorly drawn, jpeg artifacts, cropped, out of frame"
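
The recommended settings can be gathered in one place before calling a pipeline. The sketch below is illustrative only: the keyword names (`negative_prompt`, `num_inference_steps`, `guidance_scale`, `height`, `width`) follow the usual diffusers text-to-image call convention, but this card does not confirm that `generate_hf.py` accepts them, and `build_generation_kwargs` is a hypothetical helper.

```python
# Recommended settings from this card, collected as keyword arguments.
# Assumption: the names follow the common diffusers text-to-image call
# convention; generate_hf.py may expose these settings differently.
NEGATIVE_PROMPT = (
    "low quality, ugly, blurry, distorted, deformed, bad anatomy, "
    "bad proportions, extra limbs, missing limbs, watermark, text, "
    "signature, washed out, flat colors, manga panel, disfigured, "
    "poorly drawn, jpeg artifacts, cropped, out of frame"
)


def build_generation_kwargs(prompt: str) -> dict:
    """Bundle a prompt with the card's recommended sampling settings."""
    return {
        "prompt": prompt,
        "negative_prompt": NEGATIVE_PROMPT,
        "num_inference_steps": 25,  # recommended step count
        "guidance_scale": 7.5,      # recommended CFG scale
        "height": 256,              # the model's native resolution
        "width": 256,
    }


kwargs = build_generation_kwargs(
    "A smiling anime girl with red hair and a school uniform"
)
print(kwargs["num_inference_steps"], kwargs["guidance_scale"])
```

In diffusers, the DPM++ 2M sampler corresponds to `DPMSolverMultistepScheduler`; the scheduler is set on the pipeline rather than passed per call, so it is not part of this dictionary.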

	## Prompting

	Aniimage-1 uses plain-text captions, so for the best results write prompts in plain English rather than as Danbooru-style tags.

	- Do: "A smiling anime girl with red hair and a school uniform"
	- Don't: "1girl, solo, smile, red_hair, school_uniform, anime_coloring"
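
A simple heuristic can flag prompts that look like tag lists before they are sent to the model. This is a sketch under stated assumptions: `looks_like_tag_prompt` is a hypothetical helper, and the rule (at least half of the comma-separated items contain an underscore or are a count tag such as `1girl`) is an illustrative choice, not something defined by this model.

```python
import re

# A token counts as tag-like if it joins words with underscores (red_hair)
# or is a count tag such as 1girl / 2boys. Both rules are assumptions.
_TAG_TOKEN = re.compile(r"\w+_\w+|\d\+?(?:girl|boy|other)s?")


def looks_like_tag_prompt(prompt: str) -> bool:
    """Return True if at least half of the comma-separated items look like tags."""
    tokens = [t.strip() for t in prompt.split(",") if t.strip()]
    if not tokens:
        return False
    tag_like = sum(1 for t in tokens if _TAG_TOKEN.fullmatch(t))
    return tag_like / len(tokens) >= 0.5


plain = "A smiling anime girl with red hair and a school uniform"
tags = "1girl, solo, smile, red_hair, school_uniform, anime_coloring"
print(looks_like_tag_prompt(plain), looks_like_tag_prompt(tags))
```

A prompt flagged this way can then be rewritten by hand as a plain-English sentence.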

	## Capabilities

	- Anime character generation with varied hair colors and styles
	- School uniforms, fantasy outfits, maid dresses, and more
	- Background scenes: cherry blossoms, night sky, interiors, nature

	## Limitations

	- 256×256 resolution — fine details like hands and small features can be rough
	- Faces can sometimes look similar or 'melty' across different prompts
	- Complex multi-character scenes may have merging issues
	- Little to no NSFW content, since the training dataset is mostly SFW
	- Generates male characters less reliably due to dataset bias

	## What's Next

	Aniimage-1.5, a 512×512 fine-tune of this model, is currently in development and is expected to significantly improve detail and clarity.
	The training code may be released on GitHub at some point.

	## License

	Apache 2.0