---
license: mit
tags:
- pytorch
- diffusers
- unconditional-image-generation
- diffusion-models
- anime
- anime-faces
- ddpm
---

# Anime Face Diffusion Model 🎨

A fine-tuned diffusion model for generating high-quality anime faces with DDPM. It starts from Google's pre-trained `ddpm-celebahq-256` model and is fine-tuned on 7,000+ anime face images.

## Model Details

- **Model Type**: Denoising Diffusion Probabilistic Model (DDPM)
- **Base Model**: [google/ddpm-celebahq-256](https://huggingface.co/google/ddpm-celebahq-256)
- **Task**: Unconditional Image Generation (256×256 anime faces)
- **Training Data**: 7,000+ high-quality anime face images
- **Framework**: 🧨 Diffusers
- **License**: MIT

## Training Parameters

- **Learning Rate**: 2e-5
- **Epochs**: 15
- **Batch Size**: 4
- **Gradient Accumulation Steps**: 2
- **Training Steps**: ~26,250 (1,750 steps/epoch × 15 epochs)
- **Optimizer**: AdamW
- **Loss**: MSE (Mean Squared Error)

## Usage

### Basic Usage

```python
from diffusers import DDPMPipeline
import torch

# Load the model
pipeline = DDPMPipeline.from_pretrained("abcd2019/Anime-face-generation")
device = "cuda" if torch.cuda.is_available() else "cpu"
pipeline = pipeline.to(device)

# Generate a single image
image = pipeline(num_inference_steps=100).images[0]
image.save("anime_face.png")
```

### Generate Multiple Images

```python
from diffusers import DDPMPipeline

pipeline = DDPMPipeline.from_pretrained("abcd2019/Anime-face-generation")
pipeline = pipeline.to("cuda")

# Generate 5 anime faces
images = pipeline(batch_size=5, num_inference_steps=100).images
for i, image in enumerate(images):
    image.save(f"anime_face_{i}.png")
```

### Adjust Inference Steps for Quality vs Speed

```python
# Fast generation (fewer steps, lower quality)
fast_image = pipeline(num_inference_steps=50).images[0]

# High quality (more steps, slower)
quality_image = pipeline(num_inference_steps=150).images[0]

# Recommended: 100 steps for a good balance
balanced_image = pipeline(num_inference_steps=100).images[0]
```

### Use a Different Scheduler

```python
from diffusers import DDPMPipeline, DDIMScheduler

pipeline = DDPMPipeline.from_pretrained("abcd2019/Anime-face-generation")

# Switch to DDIM for faster sampling
pipeline.scheduler = DDIMScheduler.from_config(pipeline.scheduler.config)

# Pass the step count at call time; the pipeline sets the scheduler's timesteps internally
fast_image = pipeline(num_inference_steps=50).images[0]  # ~50 steps instead of 1000
```

## Model Performance

- **Training Loss**: ~0.0077 (final epoch)
- **Image Resolution**: 256×256 pixels
- **Inference Speed**: ~30-60 seconds per image (depending on step count)
- **Recommended Inference Steps**: 100 (for best quality)
- **Generated Face Styles**: Wide diversity of anime faces with varied:
  - Hair colors and styles
  - Eye colors and expressions
  - Face shapes and features
  - Skin tones

## Limitations & Bias

- **Resolution**: Limited to 256×256 pixels (inherent to the model architecture)
- **Style**: Trained specifically on anime faces; not intended for realistic/photorealistic faces
- **Diversity**: Generated faces are limited to patterns present in the training data
- **Quality Variation**: Face clarity depends on the number of inference steps (higher = better)

## Training Details

### Data Preparation

- **Dataset**: Anime Face Dataset (Kaggle)
- **Total Images**: 7,000
- **Selection Method**: Top-quality images selected by file size
- **Preprocessing**:
  - Resized to 256×256
  - Random horizontal flip (50% probability)
  - Normalized to [-1, 1]

### Fine-tuning Approach

- Started from the pre-trained `ddpm-celebahq-256` checkpoint
- Fine-tuned with a low learning rate to preserve general face-generation knowledge
- Adapted to anime-specific features (large eyes, stylized features, etc.)
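The training script itself is not included in this card; the following is a minimal sketch of the setup described above (pre-trained CelebA-HQ checkpoint, the listed preprocessing, AdamW at 2e-5, batch size 4, gradient accumulation of 2, MSE noise-prediction loss). The `anime_faces/` folder path and the use of torchvision's `ImageFolder` are illustrative assumptions, not the actual training code.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from diffusers import DDPMPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"

# Preprocessing listed above: 256x256 resize, random horizontal flip, scale to [-1, 1]
transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]),  # [0, 1] -> [-1, 1]
])

# Hypothetical layout: ImageFolder expects at least one subdirectory, e.g. anime_faces/faces/*.png
dataset = datasets.ImageFolder("anime_faces", transform=transform)
dataloader = DataLoader(dataset, batch_size=4, shuffle=True)

# Reuse the UNet and noise scheduler from the pre-trained CelebA-HQ pipeline
base = DDPMPipeline.from_pretrained("google/ddpm-celebahq-256")
model = base.unet.to(device)
noise_scheduler = base.scheduler

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
grad_accum_steps = 2
model.train()

for epoch in range(15):
    for step, (images, _) in enumerate(dataloader):
        images = images.to(device)

        # Sample noise and a random timestep per image, then noise the clean images
        noise = torch.randn_like(images)
        timesteps = torch.randint(
            0, noise_scheduler.config.num_train_timesteps,
            (images.shape[0],), device=device,
        ).long()
        noisy_images = noise_scheduler.add_noise(images, noise, timesteps)

        # Predict the added noise and take the MSE against the true noise
        noise_pred = model(noisy_images, timesteps).sample
        loss = F.mse_loss(noise_pred, noise) / grad_accum_steps
        loss.backward()

        # Gradient accumulation: step the optimizer every 2 batches
        if (step + 1) % grad_accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad()

    print(f"epoch {epoch}: last loss {loss.item() * grad_accum_steps:.4f}")

# Save the fine-tuned weights as a pipeline usable with the Usage examples above
DDPMPipeline(unet=model, scheduler=noise_scheduler).save_pretrained("anime-face-ddpm")
```

Dividing the loss by `grad_accum_steps` before `backward()` keeps the accumulated gradient equivalent to averaging over an effective batch of 8 images.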
### Training Dynamics

- **Epochs 0-3**: Model adapts from photorealistic to anime style
- **Epochs 4-8**: Loss continues to decrease; anime features solidify
- **Epochs 9+**: Marginal improvements, with a risk of overfitting

## Ethical Considerations

This model generates synthetic anime faces and should not be used to:
- Create misleading or deceptive content
- Generate non-consensual images of real people
- Violate any local laws or regulations

## Recommended Citation

If you use this model in your research or project, please credit:
- The original DDPM paper
- Google's pre-trained `ddpm-celebahq-256` model
- This fine-tuned adaptation

## Future Improvements

Potential enhancements for future versions:
- Higher resolution (512×512 or more)
- Conditional generation (text-to-image for anime faces)
- Better diversity through larger training datasets
- Improved training with advanced schedulers or techniques

## Resources

- 📚 [Diffusion Models Class](https://github.com/huggingface/diffusion-models-class)
- 📖 [Diffusers Documentation](https://huggingface.co/docs/diffusers)
- 📄 [DDPM Paper](https://arxiv.org/abs/2006.11239)
- 🤗 [Hugging Face Hub](https://huggingface.co)

---

**Created**: 2025-12-28
**Model Card Contact**: [Your Name/Username]