---
license: mit
tags:
- pytorch
- diffusers
- unconditional-image-generation
- diffusion-models
- anime
- anime-faces
- ddpm
---

# Anime Face Diffusion Model 🎨

A fine-tuned diffusion model for generating high-quality anime faces using DDPM. This model is based on Google's pre-trained `ddpm-celebahq-256` model and fine-tuned on 7,000+ anime face images.

## Model Details

- **Model Type**: Denoising Diffusion Probabilistic Model (DDPM)
- **Base Model**: [google/ddpm-celebahq-256](https://huggingface.co/google/ddpm-celebahq-256)
- **Task**: Unconditional Image Generation (256×256 anime faces)
- **Training Data**: 7,000+ high-quality anime face images
- **Framework**: 🧨 Diffusers
- **License**: MIT

## Training Parameters

- **Learning Rate**: 2e-5
- **Epochs**: 15
- **Batch Size**: 4
- **Gradient Accumulation Steps**: 2
- **Training Steps**: ~26,250 (1,750 steps/epoch × 15 epochs)
- **Optimizer**: AdamW
- **Loss**: MSE (Mean Squared Error)
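
For reference, here is a minimal fine-tuning loop sketch using these hyperparameters. It is a sketch only: `train_dataloader` (batches of 256×256 images in [-1, 1]) is assumed, and the original training script may differ in details such as LR scheduling and logging.

```python
import torch
import torch.nn.functional as F
from diffusers import DDPMPipeline

# Load the pre-trained base pipeline; we fine-tune its UNet
pipeline = DDPMPipeline.from_pretrained("google/ddpm-celebahq-256")
unet, noise_scheduler = pipeline.unet, pipeline.scheduler
device = "cuda" if torch.cuda.is_available() else "cpu"
unet.to(device)
unet.train()

optimizer = torch.optim.AdamW(unet.parameters(), lr=2e-5)
grad_accumulation_steps = 2

for epoch in range(15):
    # train_dataloader is assumed to yield batches of shape (4, 3, 256, 256) in [-1, 1]
    for step, clean_images in enumerate(train_dataloader):
        clean_images = clean_images.to(device)
        noise = torch.randn_like(clean_images)
        timesteps = torch.randint(
            0, noise_scheduler.config.num_train_timesteps,
            (clean_images.shape[0],), device=device,
        ).long()

        # Forward diffusion: corrupt the clean images at the sampled timesteps
        noisy_images = noise_scheduler.add_noise(clean_images, noise, timesteps)

        # Predict the added noise and compute the MSE loss
        noise_pred = unet(noisy_images, timesteps).sample
        loss = F.mse_loss(noise_pred, noise) / grad_accumulation_steps
        loss.backward()

        # Step the optimizer every 2 batches (gradient accumulation)
        if (step + 1) % grad_accumulation_steps == 0:
            optimizer.step()
            optimizer.zero_grad()
```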

## Usage

### Basic Usage

```python
from diffusers import DDPMPipeline
import torch

# Load the model
pipeline = DDPMPipeline.from_pretrained("abcd2019/Anime-face-generation")
device = "cuda" if torch.cuda.is_available() else "cpu"
pipeline = pipeline.to(device)

# Generate a single image
image = pipeline(num_inference_steps=100).images[0]
image.save("anime_face.png")
```

### Generate Multiple Images

```python
from diffusers import DDPMPipeline

pipeline = DDPMPipeline.from_pretrained("abcd2019/Anime-face-generation")
pipeline = pipeline.to("cuda")

# Generate 5 anime faces
images = pipeline(batch_size=5, num_inference_steps=100).images

for i, image in enumerate(images):
    image.save(f"anime_face_{i}.png")
```

### Adjust Inference Steps for Quality vs Speed

```python
# Fast generation (fewer steps, lower quality)
fast_image = pipeline(num_inference_steps=50).images[0]

# High quality (more steps, slower)
quality_image = pipeline(num_inference_steps=150).images[0]

# Recommended: 100 steps for good balance
balanced_image = pipeline(num_inference_steps=100).images[0]
```

### Use Different Scheduler

```python
from diffusers import DDPMPipeline, DDIMScheduler

pipeline = DDPMPipeline.from_pretrained("abcd2019/Anime-face-generation")

# Switch to DDIM for faster sampling
pipeline.scheduler = DDIMScheduler.from_config(pipeline.scheduler.config)

# Pass the step count to the pipeline call; it configures the scheduler's timesteps internally
fast_image = pipeline(num_inference_steps=50).images[0]  # ~50 steps instead of 1000
```

## Model Performance

- **Training Loss**: ~0.0077 (final epoch)
- **Image Resolution**: 256×256 pixels
- **Inference Speed**: ~30-60 seconds per image (depending on steps)
- **Recommended Inference Steps**: 100 (for best quality)
- **Generated Face Styles**: A wide diversity of anime faces, with variation in:
  - Hair colors and styles
  - Eye colors and expressions
  - Face shapes and features
  - Skin tones
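
To inspect this diversity quickly, you can sample a small batch and tile it into a grid (a short sketch; the batch size and grid shape here are arbitrary choices):

```python
import torch
from diffusers import DDPMPipeline
from diffusers.utils import make_image_grid

pipeline = DDPMPipeline.from_pretrained("abcd2019/Anime-face-generation")
pipeline = pipeline.to("cuda" if torch.cuda.is_available() else "cpu")

# Sample 8 faces and tile them into a 2x4 grid for a quick visual check
images = pipeline(batch_size=8, num_inference_steps=100).images
grid = make_image_grid(images, rows=2, cols=4)
grid.save("anime_face_grid.png")
```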

## Limitations & Bias

- **Resolution**: Limited to 256×256 pixels (inherent to the model architecture)
- **Style**: Trained specifically on anime faces, so it is unlikely to generate realistic/photorealistic faces
- **Diversity**: Generated faces are limited to the patterns present in the training data
- **Quality Variation**: Face-shape clarity depends on the number of inference steps (more steps = better)

## Training Details

### Data Preparation
- **Dataset**: Anime Face Dataset (Kaggle)
- **Total Images**: 7,000
- **Selection Method**: Top-quality images selected by file size
- **Preprocessing**: 
  - Resized to 256×256
  - Random horizontal flip (50% probability)
  - Normalized to [-1, 1]
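
A minimal preprocessing sketch matching the steps above (torchvision-based; the exact transform order used in the original training script is an assumption):

```python
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.RandomHorizontalFlip(p=0.5),        # 50% probability
    transforms.ToTensor(),                         # PIL image -> tensor in [0, 1]
    transforms.Normalize([0.5] * 3, [0.5] * 3),    # [0, 1] -> [-1, 1]
])
```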

### Fine-tuning Approach
- Started from pre-trained `ddpm-celebahq-256`
- Fine-tuned with low learning rate to preserve general face generation knowledge
- Adapted to anime-specific characteristics (large eyes, stylized features, etc.)

### Training Dynamics
- **Epoch 0-3**: Model adapts from photorealistic to anime style
- **Epoch 4-8**: Loss continues to decrease, anime features solidify
- **Epoch 9+**: Marginal improvements, risk of overfitting

## Ethical Considerations

This model generates synthetic anime faces and should not be used to:
- Create misleading/deceptive content
- Generate non-consensual images of real people
- Violate any local laws or regulations

## Recommended Citation

If you use this model in your research or project, please credit:
- The original DDPM paper
- Google's pre-trained `ddpm-celebahq-256` model
- This fine-tuned adaptation

## Future Improvements

Potential enhancements for future versions:
- Higher resolution (512×512 or more)
- Conditional generation (text-to-image for anime faces)
- Better diversity through larger training datasets
- Improved training with advanced schedulers or techniques

## Resources

- 📚 [Diffusion Models Class](https://github.com/huggingface/diffusion-models-class)
- 📖 [Diffusers Documentation](https://huggingface.co/docs/diffusers)
- 📄 [DDPM Paper](https://arxiv.org/abs/2006.11239)
- 🤗 [Hugging Face Hub](https://huggingface.co)

---

**Created**: 2025-12-28

**Model Card Contact**: [Your Name/Username]