# DDPM trained on CIFAR-10
A Denoising Diffusion Probabilistic Model (DDPM) trained on CIFAR-10 using distributed training across 6 V100 GPUs.
## Model Description
- **Architecture:** U-Net with attention (128 base channels)
- **Training:** 100 epochs on CIFAR-10 (50,000 images)
- **Hardware:** 3 nodes × 2 V100-16GB GPUs (6 GPUs total)
- **Framework:** PyTorch DistributedDataParallel (DDP); see the setup sketch below
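
As a rough illustration of the distributed setup (not the repository's actual training script), the sketch below shows how the model might be wrapped in DDP under `torchrun`. The `UNet` constructor arguments mirror the Usage section; the entry-point name `train.py` and everything else here are assumptions.

```python
# Hypothetical DDP setup sketch -- launched via:
#   torchrun --nnodes=3 --nproc_per_node=2 train.py
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

from models import UNet  # repo-local module, as in the Usage section

def setup_ddp():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE in the environment
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    return local_rank

local_rank = setup_ddp()
model = UNet(in_channels=3, out_channels=3, base_channels=128,
             channel_mults=(1, 2, 2, 2), num_res_blocks=2,
             attention_resolutions=(2,)).cuda(local_rank)
model = DDP(model, device_ids=[local_rank])
```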
## Training Details
| Parameter | Value |
|---|---|
| Batch Size | 64 per GPU (384 effective) |
| Learning Rate | 2e-4 |
| Timesteps | 1000 |
| EMA Decay | 0.9999 |
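
The EMA decay above governs the shadow copy of the weights that `model_ema.pt` presumably holds. A minimal sketch of the exponential moving average update, assuming it is applied once after each optimizer step:

```python
import copy

import torch

@torch.no_grad()
def ema_update(ema_model, model, decay=0.9999):
    # ema_param <- decay * ema_param + (1 - decay) * param
    for ema_p, p in zip(ema_model.parameters(), model.parameters()):
        ema_p.mul_(decay).add_(p, alpha=1 - decay)

# ema_model starts as a frozen copy of the online model,
# then gets updated after every optimizer step:
#   ema_model = copy.deepcopy(model).eval()
#   ... optimizer.step() ...
#   ema_update(ema_model, model)
```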
## Usage
```python
import torch
from models import UNet, GaussianDiffusion

# Rebuild the architecture used for training
model = UNet(in_channels=3, out_channels=3, base_channels=128,
             channel_mults=(1, 2, 2, 2), num_res_blocks=2,
             attention_resolutions=(2,))

# Load the EMA checkpoint and switch to inference mode
model.load_state_dict(torch.load("model_ema.pt", map_location="cpu"))
model.eval()

# Sample 16 images (32x32 RGB) with the full 1000-step reverse process
diffusion = GaussianDiffusion(timesteps=1000)
samples = diffusion.sample(model, image_size=32, batch_size=16, channels=3)
```
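
To inspect the output, the sampled tensor can be written out as an image grid. This assumes `diffusion.sample` returns images in `[-1, 1]`; check the repository's convention before relying on it:

```python
from torchvision.utils import save_image

# Map from [-1, 1] to [0, 1] before saving (assumed output range)
save_image((samples + 1) / 2, "ddpm_samples.png", nrow=4)
```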
## Training Code

The full distributed training code is available in the [Darkbird](https://github.com/arvinsingh/Darkbird) repository.
## Citation
```bibtex
@misc{darkbird2026,
  author = {Arvin Singh},
  title  = {Darkbird: Distributed Training Examples},
  year   = {2026},
  url    = {https://github.com/arvinsingh/Darkbird}
}
```