Pix2Pix — Edges to Handbags

A conditional GAN implementing the pix2pix framework (Isola et al., 2017) for paired image-to-image translation. This model translates edge maps of handbags into realistic photographic renderings.

Model Description

Pix2Pix learns a mapping from input condition images to output target images using an adversarial training objective. The generator is supervised by both an adversarial loss (fooling the discriminator) and an L1 reconstruction loss (staying close to the ground truth).

Architecture

Generator — U-Net

Encoder: stacked Conv2d + BatchNorm2d + ReLU + MaxPool2d blocks, doubling channels at each stage
Decoder: Upsample (bilinear) + Conv2d blocks with skip connections from the corresponding encoder stage
Output: Tanh activation to produce pixel values in [-1, 1]
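
The encoder and decoder stages listed above can be sketched roughly as follows. This is a minimal illustration, not the code in UNet.py: the class names EncoderBlock and DecoderBlock, the 3x3 kernels, and the choice to take the skip tensor from the input of the matching encoder stage are all assumptions made for the example.

```python
import torch
from torch import nn


class EncoderBlock(nn.Module):
    """Conv2d + BatchNorm2d + ReLU, then MaxPool2d; doubles the channel count."""

    def __init__(self, in_ch: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, in_ch * 2, kernel_size=3, padding=1),  # kernel size assumed
            nn.BatchNorm2d(in_ch * 2),
            nn.ReLU(inplace=True),
        )
        self.pool = nn.MaxPool2d(kernel_size=2)  # halves height and width

    def forward(self, x):
        return self.pool(self.conv(x))


class DecoderBlock(nn.Module):
    """Bilinear Upsample + Conv2d, fused with a skip tensor from the encoder."""

    def __init__(self, in_ch: int):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.reduce = nn.Conv2d(in_ch, in_ch // 2, kernel_size=3, padding=1)
        # After concatenating the skip tensor the channel count is in_ch again.
        self.fuse = nn.Sequential(
            nn.Conv2d(in_ch, in_ch // 2, kernel_size=3, padding=1),
            nn.BatchNorm2d(in_ch // 2),
            nn.ReLU(inplace=True),
        )

    def forward(self, x, skip):
        x = self.reduce(self.up(x))
        return self.fuse(torch.cat([x, skip], dim=1))


# Hypothetical output head: map features to 3 channels and squash to [-1, 1].
output_head = nn.Sequential(nn.Conv2d(32, 3, kernel_size=1), nn.Tanh())
```

One encoder stage followed by its mirror decoder stage returns a tensor with the original channel count and spatial size, which is what lets the stages be stacked symmetrically.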

Discriminator — PatchGAN

Takes the concatenation of condition and real/fake image as input (6 channels)
Classifies whether overlapping image patches are real or generated
Trained with BCEWithLogitsLoss
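
As a rough sketch of the discriminator described above, the following module concatenates the condition and the image into a 6-channel input and returns one logit per patch. The layer count, kernel sizes, LeakyReLU slope, and the averaging of the real and fake terms are assumptions borrowed from common PatchGAN implementations, not details confirmed by this repository; PatchDiscriminator and discriminator_loss are hypothetical names.

```python
import torch
from torch import nn


class PatchDiscriminator(nn.Module):
    """Maps (condition, image) to a grid of real/fake logits, one per patch."""

    def __init__(self, in_channels: int = 6, base: int = 64):
        super().__init__()

        def block(i, o, norm=True):
            layers = [nn.Conv2d(i, o, kernel_size=4, stride=2, padding=1)]
            if norm:
                layers.append(nn.BatchNorm2d(o))
            layers.append(nn.LeakyReLU(0.2, inplace=True))
            return layers

        self.net = nn.Sequential(
            *block(in_channels, base, norm=False),
            *block(base, base * 2),
            *block(base * 2, base * 4),
            # No sigmoid here: BCEWithLogitsLoss expects raw logits.
            nn.Conv2d(base * 4, 1, kernel_size=4, stride=1, padding=1),
        )

    def forward(self, condition, image):
        return self.net(torch.cat([condition, image], dim=1))


criterion = nn.BCEWithLogitsLoss()


def discriminator_loss(disc, condition, real, fake):
    """Real patches are labeled 1, generated patches 0."""
    real_logits = disc(condition, real)
    fake_logits = disc(condition, fake.detach())  # do not backprop into G
    real_loss = criterion(real_logits, torch.ones_like(real_logits))
    fake_loss = criterion(fake_logits, torch.zeros_like(fake_logits))
    return (real_loss + fake_loss) / 2  # averaging is an assumption
```

With these assumed layers, a 256 x 256 input yields a 31 x 31 grid of logits, each judging one overlapping patch of the input pair.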

Loss Functions

L_total = L_adversarial + lambda_recon * L_L1
        = BCEWithLogitsLoss + 200 * L1Loss
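
A minimal sketch of how the generator objective above might be computed in PyTorch. The function name generator_loss and its argument names are hypothetical; only the two criteria and the weight of 200 come from this README.

```python
import torch
from torch import nn

adv_criterion = nn.BCEWithLogitsLoss()
recon_criterion = nn.L1Loss()
LAMBDA_RECON = 200  # L1 weight stated in this README


def generator_loss(disc_fake_logits, fake, real, lambda_recon=LAMBDA_RECON):
    """L_total = L_adversarial + lambda_recon * L_L1 for the generator."""
    # The generator wants the discriminator to label its patches as real (1).
    adversarial = adv_criterion(disc_fake_logits, torch.ones_like(disc_fake_logits))
    # Mean absolute pixel difference to the ground-truth image.
    reconstruction = recon_criterion(fake, real)
    return adversarial + lambda_recon * reconstruction
```

Because the L1 term is multiplied by 200, even a small average pixel error contributes far more to the total than the adversarial term, which keeps outputs close to the ground truth.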

Training Details

Parameter	Value
Dataset	edges2handbags
Image resolution	256 x 256
Epochs	50 (checkpoint saved at epoch 59)
Batch size	4
Learning rate	0.0002
Optimizer	Adam (both G and D)
Weight initialization	Normal distribution (mean=0, std=0.02)
lambda_recon (L1 weight)	200
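
The initialization and optimizer settings in the table could be wired up along these lines. This is a sketch under assumptions: init_weights is a hypothetical helper, applying the normal initialization only to convolution weights is a guess, the two small nn.Sequential models are stand-ins for the real networks, and the Adam betas are left at PyTorch defaults because the table does not state them.

```python
import torch
from torch import nn

LEARNING_RATE = 0.0002  # from the table above


def init_weights(module: nn.Module) -> None:
    """Draw convolution weights from N(0, 0.02^2); other layers are untouched."""
    if isinstance(module, (nn.Conv2d, nn.ConvTranspose2d)):
        nn.init.normal_(module.weight, mean=0.0, std=0.02)


torch.manual_seed(0)

# Stand-in networks so the example runs on its own.
gen = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(64, 3, kernel_size=3, padding=1),
    nn.Tanh(),
)
disc = nn.Sequential(
    nn.Conv2d(6, 64, kernel_size=4, stride=2, padding=1),
    nn.LeakyReLU(0.2),
    nn.Conv2d(64, 1, kernel_size=4, padding=1),
)

# Module.apply visits every submodule recursively.
gen.apply(init_weights)
disc.apply(init_weights)

# One Adam optimizer per network, both with the same learning rate.
gen_opt = torch.optim.Adam(gen.parameters(), lr=LEARNING_RATE)
disc_opt = torch.optim.Adam(disc.parameters(), lr=LEARNING_RATE)
```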

Repository Contents

File	Description
train.py	Training loop
UNet.py	Generator (U-Net) and discriminator architecture
utils.py	Helper functions
dataset.sh	Downloads the edges2handbags dataset
Pix2Pix_Epoch59.pth	Saved generator checkpoint

How to Use

import torch
from UNet import UNet

device = "cuda" if torch.cuda.is_available() else "cpu"
gen = UNet(input_dim=3, real_dim=3).to(device)
checkpoint = torch.load("Pix2Pix_Epoch59.pth", map_location=device)
gen.load_state_dict(checkpoint)
gen.eval()

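# condition: edge map tensor of shape (B, 3, 256, 256), normalized to [-1, 1].
# A random tensor stands in here; replace it with a real preprocessed edge map.
condition = torch.randn(1, 3, 256, 256, device=device).clamp(-1, 1)

with torch.no_grad():
    fake = gen(condition)  # generated image, values in [-1, 1] (Tanh output)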
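
To move between ordinary 8-bit images and the [-1, 1] range used above, helpers like the following may be useful. The names to_model_range and to_uint8 are hypothetical, and the linear mapping x / 127.5 - 1 is an assumption; check utils.py for the normalization actually used during training.

```python
import torch


def to_model_range(img_uint8: torch.Tensor) -> torch.Tensor:
    """Convert a uint8 image tensor in [0, 255] to float32 in [-1, 1]."""
    return img_uint8.float() / 127.5 - 1.0


def to_uint8(img: torch.Tensor) -> torch.Tensor:
    """Convert a Tanh output in [-1, 1] back to uint8 in [0, 255]."""
    scaled = (img.clamp(-1.0, 1.0) + 1.0) * 127.5
    return scaled.round().to(torch.uint8)
```

For example, an edge map loaded as a (B, 3, 256, 256) uint8 tensor would go through to_model_range before the generator, and the generator output through to_uint8 before being saved or displayed.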

References

Isola et al. (2017). Image-to-Image Translation with Conditional Adversarial Networks
Ronneberger et al. (2015). U-Net: Convolutional Networks for Biomedical Image Segmentation

License

MIT
