A Denoising Diffusion Probabilistic Model (DDPM) trained on CIFAR-10 for unconditional image generation. The model generates 32x32 RGB images spanning all 10 CIFAR-10 categories without class conditioning.
Implements DDPM (Ho et al., 2020) with a U-Net denoising network featuring self-attention blocks and sinusoidal time embeddings.
Time Embedding:
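As an illustration of sinusoidal time embeddings, here is a minimal sketch assuming the common Transformer-style formulation (sine and cosine at geometrically spaced frequencies). The function name `sinusoidal_embedding` and the embedding width are hypothetical; the `TimeEmbedding` module in `model.py` may differ, for example by passing the result through an MLP.

```python
import math
import torch

def sinusoidal_embedding(t: torch.Tensor, dim: int) -> torch.Tensor:
    """Map integer timesteps of shape (B,) to embeddings of shape (B, dim)."""
    assert dim % 2 == 0, "dim must be even"
    half = dim // 2
    # Geometrically spaced frequencies, starting at 1 and decaying towards 1/10000.
    freqs = torch.exp(
        -math.log(10000.0)
        * torch.arange(half, dtype=torch.float32, device=t.device)
        / half
    )
    args = t.float()[:, None] * freqs[None, :]  # (B, half)
    # First half holds sines, second half holds cosines.
    return torch.cat([torch.sin(args), torch.cos(args)], dim=-1)

emb = sinusoidal_embedding(torch.tensor([0, 1, 999]), dim=128)
print(emb.shape)  # torch.Size([3, 128])
```

Each timestep receives a distinct, smooth code that the network can condition on without learning a separate embedding per step.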
Residual Blocks (ResBlock):
- 3x3 Conv2d with GroupNorm and Swish activation
- 1x1 projection on skip path when input/output channels differ

Attention (AttnBlock):
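A self-attention block over feature-map positions might look like the sketch below. It assumes single-head attention with GroupNorm, 1x1 convolutions for the query/key/value projections, and a residual connection, as is typical in DDPM U-Nets. The class name `AttnBlockSketch` and the group count are hypothetical and are not taken from `model.py`.

```python
import torch
import torch.nn as nn

class AttnBlockSketch(nn.Module):
    """Single-head self-attention over the H*W positions of a feature map."""

    def __init__(self, ch: int, groups: int = 32):
        super().__init__()
        self.norm = nn.GroupNorm(groups, ch)
        self.qkv = nn.Conv2d(ch, 3 * ch, kernel_size=1)   # joint q, k, v projection
        self.proj = nn.Conv2d(ch, ch, kernel_size=1)      # output projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q, k, v = self.qkv(self.norm(x)).chunk(3, dim=1)
        # Flatten spatial dims: (B, C, H, W) -> (B, H*W, C)
        q, k, v = (t.flatten(2).transpose(1, 2) for t in (q, k, v))
        # Scaled dot-product attention weights: (B, H*W, H*W)
        attn = torch.softmax(q @ k.transpose(1, 2) / c ** 0.5, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return x + self.proj(out)  # residual connection

x = torch.randn(2, 64, 8, 8)
y = AttnBlockSketch(64)(x)
print(y.shape)  # torch.Size([2, 64, 8, 8])
```

Because the attention matrix grows quadratically with the number of positions, blocks like this are usually applied only at lower resolutions.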
Downsampling: Strided Conv2d (stride 2)
Upsampling: Nearest-neighbour interpolation + Conv2d
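The residual, downsampling, and upsampling blocks listed above could be sketched roughly as follows. The class names, the GroupNorm group count, the way the time embedding is injected (a linear projection added after the first convolution), and the 3x3 kernel sizes in the sampling layers are all assumptions made for illustration; `model.py` is the authoritative implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResBlockSketch(nn.Module):
    def __init__(self, in_ch, out_ch, tdim, dropout=0.1, groups=32):
        super().__init__()
        # GroupNorm -> Swish (SiLU in PyTorch) -> 3x3 conv
        self.block1 = nn.Sequential(
            nn.GroupNorm(groups, in_ch), nn.SiLU(),
            nn.Conv2d(in_ch, out_ch, 3, padding=1))
        self.temb_proj = nn.Sequential(nn.SiLU(), nn.Linear(tdim, out_ch))
        self.block2 = nn.Sequential(
            nn.GroupNorm(groups, out_ch), nn.SiLU(), nn.Dropout(dropout),
            nn.Conv2d(out_ch, out_ch, 3, padding=1))
        # 1x1 projection on the skip path only when channel counts differ
        self.shortcut = (nn.Conv2d(in_ch, out_ch, 1)
                         if in_ch != out_ch else nn.Identity())

    def forward(self, x, temb):
        h = self.block1(x)
        h = h + self.temb_proj(temb)[:, :, None, None]  # broadcast over H, W
        h = self.block2(h)
        return h + self.shortcut(x)

class DownSampleSketch(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.conv = nn.Conv2d(ch, ch, 3, stride=2, padding=1)  # halves H and W

    def forward(self, x):
        return self.conv(x)

class UpSampleSketch(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.conv = nn.Conv2d(ch, ch, 3, padding=1)

    def forward(self, x):
        # Nearest-neighbour interpolation doubles H and W, then a conv refines.
        return self.conv(F.interpolate(x, scale_factor=2, mode="nearest"))

x = torch.randn(2, 32, 16, 16)
temb = torch.randn(2, 128)
h = ResBlockSketch(32, 64, tdim=128)(x, temb)
down = DownSampleSketch(64)(h)
up = UpSampleSketch(64)(down)
print(h.shape, down.shape, up.shape)
```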
| Parameter | Value |
|---|---|
| Dataset | CIFAR-10 (50,000 images, 32x32 RGB) |
| Epochs | 200 (checkpoint: ckpt_199.pth) |
| Optimizer | AdamW, weight_decay=1e-4 |
| LR Schedule | Cosine annealing + linear warmup (GradualWarmupScheduler) |
| Gradient clipping | Enabled |
| Data augmentation | RandomHorizontalFlip, Normalize (mean 0.5, std 0.5 per channel, mapping pixels to [-1, 1]) |
| Workers | 4 |
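The optimizer, schedule, and clipping settings in the table can be wired together as in the sketch below. It uses PyTorch's built-in `LambdaLR` as a stand-in for the repository's `GradualWarmupScheduler`, and a tiny linear layer as a stand-in for the U-Net. The base learning rate, warmup length, and clipping threshold are assumed values, since the table does not state them.

```python
import math
import torch

model = torch.nn.Linear(4, 4)  # stand-in for the U-Net

epochs = 200
warmup_epochs = 10  # assumption: not stated in the table
base_lr = 1e-4      # assumption: not stated in the table
grad_clip = 1.0     # assumption: the table only says clipping is enabled

optimizer = torch.optim.AdamW(model.parameters(), lr=base_lr, weight_decay=1e-4)

def lr_factor(epoch: int) -> float:
    """Linear warmup to the base LR, then cosine annealing to zero."""
    if epoch < warmup_epochs:
        return (epoch + 1) / warmup_epochs
    progress = (epoch - warmup_epochs) / max(1, epochs - warmup_epochs)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lr_factor)

for epoch in range(epochs):
    # One dummy optimisation step per epoch, just to exercise the schedule.
    loss = model(torch.randn(8, 4)).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), grad_clip)
    optimizer.step()
    scheduler.step()
```

Warmup avoids large, noisy updates while the optimizer's moment estimates are still settling, and the cosine phase then lowers the learning rate smoothly towards the end of training.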
| File | Description |
|---|---|
| model.py | U-Net architecture (ResBlock, AttnBlock, DownSample, UpSample, TimeEmbedding) |
| diffusion.py | GaussianDiffusionTrainer + GaussianDiffusionSampler |
| train.py | Training loop |
| scheduler.py | GradualWarmupScheduler |
| main.py | Entry point |
| Checkpoints/ckpt_199.pth | Final model checkpoint |
| SampledImgs/ | Generated sample images |
import torch
from model import UNet
from diffusion import GaussianDiffusionSampler
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = UNet(T=1000, ch=128, ch_mult=[1,2,2,2], attn=[1], num_res_blocks=2, dropout=0.1)
model.load_state_dict(torch.load("Checkpoints/ckpt_199.pth", map_location=device))
model.eval().to(device)
sampler = GaussianDiffusionSampler(beta_1=1e-4, beta_t=0.02, model=model, T=1000).to(device)
with torch.no_grad():
    x_T = torch.randn(16, 3, 32, 32, device=device)
    samples = sampler(x_T)
samples = (samples.clamp(-1, 1) + 1) / 2 # rescale to [0, 1]
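For context on the `1e-4`, `0.02`, and `T=1000` values passed to the sampler above, the following sketch shows the forward (noising) process and the noise-prediction loss from Ho et al. (2020). It assumes a linear beta schedule between those two endpoints; the helper names `q_sample` and `ddpm_loss` are hypothetical and are not the API of `diffusion.py`.

```python
import torch
import torch.nn.functional as F

T = 1000
betas = torch.linspace(1e-4, 0.02, T)           # linear schedule (assumed)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)  # cumulative product of (1 - beta_t)

def q_sample(x0, t, noise):
    """Sample x_t from q(x_t | x_0) in closed form."""
    a = alphas_bar[t].sqrt()[:, None, None, None]
    s = (1.0 - alphas_bar[t]).sqrt()[:, None, None, None]
    return a * x0 + s * noise

def ddpm_loss(eps_model, x0):
    """MSE between the true noise and the model's prediction at a random step."""
    t = torch.randint(0, T, (x0.shape[0],))
    noise = torch.randn_like(x0)
    x_t = q_sample(x0, t, noise)
    return F.mse_loss(eps_model(x_t, t), noise)

# Usage with a dummy predictor that always outputs zeros.
x0 = torch.rand(4, 3, 32, 32) * 2 - 1  # fake images in [-1, 1]
loss = ddpm_loss(lambda x, t: torch.zeros_like(x), x0)
print(loss.item())
```

With this schedule, the cumulative product of `1 - beta_t` is close to zero at the final step, so `x_T` is nearly pure Gaussian noise. That is why sampling can start from `torch.randn`.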
MIT