---
license: mit
tags:
- pytorch
- diffusers
- unconditional-image-generation
- diffusion-models
- anime
- anime-faces
- ddpm
---

# Anime Face Diffusion Model 🎨

A fine-tuned diffusion model for generating high-quality anime faces using DDPM. This model is based on Google's pre-trained `ddpm-celebahq-256` model and fine-tuned on 7,000+ anime face images.

## Model Details

- **Model Type**: Denoising Diffusion Probabilistic Model (DDPM)
- **Base Model**: [google/ddpm-celebahq-256](https://huggingface.co/google/ddpm-celebahq-256)
- **Task**: Unconditional Image Generation (256×256 anime faces)
- **Training Data**: 7,000+ high-quality anime face images
- **Framework**: 🧨 Diffusers
- **License**: MIT

## Training Parameters

- **Learning Rate**: 2e-5
- **Epochs**: 15
- **Batch Size**: 4
- **Gradient Accumulation Steps**: 2
- **Training Steps**: ~26,250 (1,750 steps/epoch × 15 epochs)
- **Optimizer**: AdamW
- **Loss**: MSE (Mean Squared Error)

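The step count above is easy to sanity-check. A minimal pure-Python sketch, assuming exactly the 7,000-image dataset described under Training Details (the real dataloader may drop a final partial batch):

```python
# Sanity-check the training-step arithmetic from the stated hyperparameters.
num_images = 7_000
batch_size = 4
grad_accum_steps = 2
epochs = 15

batches_per_epoch = num_images // batch_size            # 1750
total_batches = batches_per_epoch * epochs              # 26250, the ~26,250 above
effective_batch_size = batch_size * grad_accum_steps    # 8 images per optimizer update
optimizer_updates = total_batches // grad_accum_steps   # 13125 actual weight updates

print(batches_per_epoch, total_batches, effective_batch_size, optimizer_updates)
```

Note that with gradient accumulation of 2, the optimizer only steps every other batch, so each weight update effectively sees 8 images.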
## Usage

### Basic Usage

```python
from diffusers import DDPMPipeline
import torch

# Load the fine-tuned pipeline from the Hub
pipeline = DDPMPipeline.from_pretrained("abcd2019/Anime-face-generation")
device = "cuda" if torch.cuda.is_available() else "cpu"
pipeline = pipeline.to(device)

# Generate a single image
image = pipeline(num_inference_steps=100).images[0]
image.save("anime_face.png")
```

### Generate Multiple Images

```python
from diffusers import DDPMPipeline

pipeline = DDPMPipeline.from_pretrained("abcd2019/Anime-face-generation")
pipeline = pipeline.to("cuda")

# Generate 5 anime faces in one batch
images = pipeline(batch_size=5, num_inference_steps=100).images

for i, image in enumerate(images):
    image.save(f"anime_face_{i}.png")
```

### Adjust Inference Steps for Quality vs Speed

```python
# Fast generation (fewer steps, lower quality)
fast_image = pipeline(num_inference_steps=50).images[0]

# High quality (more steps, slower)
quality_image = pipeline(num_inference_steps=150).images[0]

# Recommended: 100 steps for a good balance
balanced_image = pipeline(num_inference_steps=100).images[0]
```

### Use a Different Scheduler

```python
from diffusers import DDPMPipeline, DDIMScheduler

pipeline = DDPMPipeline.from_pretrained("abcd2019/Anime-face-generation")

# Switch to DDIM for faster sampling. The pipeline sets the timesteps
# internally, so pass the step count to the pipeline call itself.
pipeline.scheduler = DDIMScheduler.from_config(pipeline.scheduler.config)

fast_image = pipeline(num_inference_steps=50).images[0]  # ~50 steps instead of 1000
```

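Why DDIM is faster: instead of visiting all 1,000 training timesteps, it denoises along an evenly spaced subset of them. A rough pure-Python illustration of the idea (the exact spacing rule depends on the scheduler's `timestep_spacing` config in diffusers, so treat this as a sketch):

```python
# Illustrative only: evenly subsample a 1000-step training schedule down to
# 50 inference timesteps, as DDIM-style samplers do.
train_timesteps = 1000
inference_steps = 50

step_ratio = train_timesteps // inference_steps                      # 20
timesteps = [i * step_ratio for i in range(inference_steps)][::-1]   # descending order

print(len(timesteps), timesteps[0], timesteps[-1])  # 50 steps, from t=980 down to t=0
```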
## Model Performance

- **Training Loss**: ~0.0077 (final epoch)
- **Image Resolution**: 256×256 pixels
- **Inference Speed**: ~30-60 seconds per image (depending on step count and hardware)
- **Recommended Inference Steps**: 100 (for best quality)
- **Generated Face Styles**: Wide diversity of anime faces with various:
  - Hair colors and styles
  - Eye colors and expressions
  - Face shapes and features
  - Skin tones

## Limitations & Bias

- **Resolution**: Limited to 256×256 pixels (inherent to the model architecture)
- **Style**: Trained specifically on anime faces; it will not generate realistic or photorealistic faces
- **Diversity**: Generated faces are limited to patterns present in the training data
- **Quality Variation**: Output clarity depends on the number of inference steps (higher = better)

## Training Details

### Data Preparation

- **Dataset**: Anime Face Dataset (Kaggle)
- **Total Images**: 7,000
- **Selection Method**: Top-quality images selected by file size
- **Preprocessing**:
  - Resized to 256×256
  - Random horizontal flip (50% probability)
  - Normalized to [-1, 1]

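The normalization step in the list above can be sketched as follows, assuming the `pixel / 127.5 - 1` convention commonly used with diffusers-style preprocessing (the card does not include the actual training code, so the function name here is hypothetical):

```python
# Hypothetical sketch of the [-1, 1] normalization applied during preprocessing:
# map 8-bit pixel values in [0, 255] into the range the diffusion model expects.
def normalize(pixel: float) -> float:
    return pixel / 127.5 - 1.0

print(normalize(0), normalize(127.5), normalize(255))  # -1.0 0.0 1.0
```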
### Fine-tuning Approach

- Started from the pre-trained `ddpm-celebahq-256` checkpoint
- Fine-tuned with a low learning rate to preserve general face-generation knowledge
- Adapted to anime-specific features (large eyes, stylized proportions, etc.)

### Training Dynamics

- **Epochs 0-3**: Model adapts from photorealistic to anime style
- **Epochs 4-8**: Loss continues to decrease, anime features solidify
- **Epochs 9+**: Marginal improvements, risk of overfitting

## Ethical Considerations

This model generates synthetic anime faces and should not be used to:

- Create misleading or deceptive content
- Generate non-consensual images of real people
- Violate any local laws or regulations

## Recommended Citation

If you use this model in your research or project, please credit:

- The original DDPM paper
- Google's pre-trained `ddpm-celebahq-256` model
- This fine-tuned adaptation

## Future Improvements

Potential enhancements for future versions:

- Higher resolution (512×512 or more)
- Conditional generation (text-to-image for anime faces)
- Better diversity through larger training datasets
- Improved training with advanced schedulers or techniques

## Resources

- [Diffusion Models Class](https://github.com/huggingface/diffusion-models-class)
- [Diffusers Documentation](https://huggingface.co/docs/diffusers)
- [DDPM Paper](https://arxiv.org/abs/2006.11239)
- 🤗 [Hugging Face Hub](https://huggingface.co)

---

**Created**: 2025-12-28

**Model Card Contact**: [Your Name/Username]