---
license: mit
tags:
- pytorch
- diffusers
- unconditional-image-generation
- diffusion-models
- anime
- anime-faces
- ddpm
---
# Anime Face Diffusion Model 🎨
A fine-tuned diffusion model for generating high-quality anime faces using DDPM. This model is based on Google's pre-trained `ddpm-celebahq-256` model and fine-tuned on 7,000+ anime face images.
## Model Details
- **Model Type**: Denoising Diffusion Probabilistic Model (DDPM)
- **Base Model**: [google/ddpm-celebahq-256](https://huggingface.co/google/ddpm-celebahq-256)
- **Task**: Unconditional Image Generation (256×256 anime faces)
- **Training Data**: 7,000+ high-quality anime face images
- **Framework**: 🧨 Diffusers
- **License**: MIT
## Training Parameters
- **Learning Rate**: 2e-5
- **Epochs**: 15
- **Batch Size**: 4
- **Gradient Accumulation Steps**: 2
- **Training Steps**: ~26,250 (1,750 steps/epoch × 15 epochs)
- **Optimizer**: AdamW
- **Loss**: MSE (Mean Squared Error)
## Usage
### Basic Usage
```python
from diffusers import DDPMPipeline
import torch
# Load the model
pipeline = DDPMPipeline.from_pretrained("abcd2019/Anime-face-generation")
device = "cuda" if torch.cuda.is_available() else "cpu"
pipeline = pipeline.to(device)
# Generate a single image
image = pipeline(num_inference_steps=100).images[0]
image.save("anime_face.png")
```
### Generate Multiple Images
```python
from diffusers import DDPMPipeline
pipeline = DDPMPipeline.from_pretrained("abcd2019/Anime-face-generation")
pipeline = pipeline.to("cuda")
# Generate 5 anime faces
images = pipeline(batch_size=5, num_inference_steps=100).images
for i, image in enumerate(images):
    image.save(f"anime_face_{i}.png")
```
### Adjust Inference Steps for Quality vs Speed
```python
# Fast generation (fewer steps, less quality)
fast_image = pipeline(num_inference_steps=50).images[0]
# High quality (more steps, slower)
quality_image = pipeline(num_inference_steps=150).images[0]
# Recommended: 100 steps for good balance
balanced_image = pipeline(num_inference_steps=100).images[0]
```
### Use a Different Scheduler
```python
from diffusers import DDPMPipeline, DDIMScheduler
pipeline = DDPMPipeline.from_pretrained("abcd2019/Anime-face-generation")
# Switch to DDIM for faster sampling
pipeline.scheduler = DDIMScheduler.from_config(pipeline.scheduler.config)
# Pass the step count to the pipeline call; calling scheduler.set_timesteps()
# beforehand has no effect, as the pipeline resets it internally.
fast_image = pipeline(num_inference_steps=50).images[0]  # ~50 steps instead of 1000
```
## Model Performance
- **Training Loss**: ~0.0077 (final epoch)
- **Image Resolution**: 256×256 pixels
- **Inference Speed**: ~30-60 seconds per image (depending on steps)
- **Recommended Inference Steps**: 100 (for best quality)
- **Generated Face Styles**: A wide diversity of anime faces, varying in:
  - Hair colors and styles
  - Eye colors and expressions
  - Face shapes and features
  - Skin tones
## Limitations & Bias
- **Resolution**: Limited to 256×256 pixels (inherent to the model architecture)
- **Style**: Trained specifically on anime faces; it will not produce realistic or photorealistic faces
- **Diversity**: Generated faces are limited to patterns in training data
- **Quality Variation**: Face shape clarity depends on inference steps (higher = better)
## Training Details
### Data Preparation
- **Dataset**: Anime Face Dataset (Kaggle)
- **Total Images**: 7,000
- **Selection Method**: Top 7,000 images ranked by file size (used as a quality proxy)
- **Preprocessing**:
  - Resized to 256×256
  - Random horizontal flip (50% probability)
  - Normalized to [-1, 1]
### Fine-tuning Approach
- Started from pre-trained `ddpm-celebahq-256`
- Fine-tuned with low learning rate to preserve general face generation knowledge
- Adapted to anime-specific features (large eyes, stylized features, etc.)
### Training Dynamics
- **Epoch 0-3**: Model adapts from photorealistic to anime style
- **Epoch 4-8**: Loss continues to decrease, anime features solidify
- **Epoch 9+**: Marginal improvements, risk of overfitting
## Ethical Considerations
This model generates synthetic anime faces and should not be used to:
- Create misleading/deceptive content
- Generate non-consensual images of real people
- Violate any local laws or regulations
## Recommended Citation
If you use this model in your research or project, please credit:
- The original DDPM paper
- Google's pre-trained `ddpm-celebahq-256` model
- This fine-tuned adaptation
## Future Improvements
Potential enhancements for future versions:
- Higher resolution (512×512 or more)
- Conditional generation (text-to-image for anime faces)
- Better diversity through larger training datasets
- Improved training with advanced schedulers or techniques
## Resources
- [Diffusion Models Class](https://github.com/huggingface/diffusion-models-class)
- [Diffusers Documentation](https://huggingface.co/docs/diffusers)
- [DDPM Paper](https://arxiv.org/abs/2006.11239)
- 🤗 [Hugging Face Hub](https://huggingface.co)
---
**Created**: 2025-12-28
**Model Card Contact**: [Your Name/Username]