Initial upload of Calcifications model
- Calc.png +0 -0
- README.md +36 -0
- critic_size_512_599.pth +3 -0
- generator_size_512_599.pth +3 -0
- progan_model.py +207 -0
- requirements.txt +10 -0
Calc.png
ADDED
README.md
ADDED
---
license: mit
tags:
- gan
- progan
- generative-ai
- medical-imaging
- pytorch
---

# ProGAN-Mammography-Calcifications

## 🖼️ Model Description
This model is an implementation of **Progressive Growing of GANs (ProGAN)** trained to generate mammogram images containing calcifications, mainly microcalcifications. Its objective is to synthesize realistic images for data augmentation, research, and the study of complex patterns in mammograms.

> This model is part of a broader research effort on the application of GANs to medical mammography imaging.

## ⚙️ Architecture Details

* **GAN Type:** Progressive Growing of GANs (ProGAN)
* **Generator:** The generator architecture is defined in `progan_model.py`, which includes the `Generator` class needed to instantiate the model.
* **Generator Weights:** The main generator weights are in `generator_size_512_599.pth`, the checkpoint at the highest resolution and training epoch reached.
* **Critic/Discriminator Weights:** (Optional) The critic/discriminator weights are in `critic_size_512_599.pth`.

## 📊 Training Dataset

The model was trained on the following dataset:

> This model was trained exclusively on a subset of the 'VinDr-Mammogram' dataset consisting of mammograms with **confirmed calcifications**. The VinDr-Mammogram dataset was curated and labeled by experienced radiologists, and its labeling scheme is unique. This model was developed as part of a Bachelor's Final Project (TFG) at the University of Extremadura (UEX).

It is recommended to review the original dataset documentation for details on its composition and characteristics.

## 🚀 How to Use This Model

### Requirements

Make sure you have the following Python libraries installed:

```bash
pip install torch
pip install huggingface_hub
```
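The README's usage section stops after the requirements step, so the following loading sketch may help. It is a hypothetical example, not the author's documented procedure: it assumes `z_dim=512` and `in_channels=512` (the values from the original ProGAN paper; the actual training hyperparameters are not stated in this repository) and that the `.pth` file stores a plain `state_dict`.

```python
from math import log2
from pathlib import Path

# Assumed hyperparameters (those of the original ProGAN paper); they are
# not stated in this README, so verify them against the training setup.
Z_DIM = 512
IN_CHANNELS = 512
IMG_SIZE = 512  # the checkpoint filename suggests 512x512 output

# number of progressive-growing steps up from the 4x4 base resolution
steps = int(log2(IMG_SIZE / 4))  # 7 for 512x512

checkpoint = Path("generator_size_512_599.pth")
if checkpoint.exists():  # guard so the sketch runs even without the repo files
    import torch

    from progan_model import Generator

    gen = Generator(Z_DIM, IN_CHANNELS, img_channels=1)
    # assumes the file stores a plain state_dict; adapt this if it is a
    # full training checkpoint (e.g. a dict that also holds optimizer state)
    gen.load_state_dict(torch.load(checkpoint, map_location="cpu"))
    gen.eval()

    with torch.no_grad():
        noise = torch.randn(1, Z_DIM, 1, 1)
        # alpha=1.0: the fade-in at the final resolution is complete
        image = gen(noise, alpha=1.0, steps=steps)  # shape (1, 1, 512, 512)
```

With `alpha=1.0` the fade-in blending is disabled, which is the appropriate setting for a fully trained checkpoint.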
critic_size_512_599.pth
ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:f0183bc36d13e70bf3bc178f589cde43babc51c492f0ad3297168f32a298489b
size 305232851
generator_size_512_599.pth
ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:bd7c5df518472061333e940b51ee033547d782e76f8f073b63005ade28c43eed
size 276898727
progan_model.py
ADDED
```python
"""
Implementation of ProGAN generator and discriminator with the key
attributions from the paper. We have tried to make the implementation
compact but a goal is also to keep it readable and understandable.
Specifically the key points implemented are:

1) Progressive growing (of model and layers)
2) Minibatch std on Discriminator
3) Normalization with PixelNorm
4) Equalized Learning Rate (here I cheated and only did it on Conv layers)
"""

import torch
import torch.nn as nn
import torch.nn.functional as F
from math import log2

"""
factors is used in Discriminator and Generator for how much
the channels should be multiplied and expanded for each layer,
so specifically the first 5 layers the channels stay the same,
whereas when we increase the img_size (towards the later layers)
we decrease the number of channels by 1/2, 1/4, etc.
"""
factors = [1, 1, 1, 1, 1 / 2, 1 / 4, 1 / 8, 1 / 16, 1 / 32]


class WSConv2d(nn.Module):
    """
    Weight scaled Conv2d (Equalized Learning Rate).
    Note that the input is multiplied rather than the weights changed;
    this has the same result.
    """

    def __init__(
        self, in_channels, out_channels, kernel_size=3, stride=1, padding=1, gain=2
    ):
        super(WSConv2d, self).__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding)
        self.scale = (gain / (in_channels * (kernel_size ** 2))) ** 0.5
        self.bias = self.conv.bias
        self.conv.bias = None

        nn.init.normal_(self.conv.weight)
        nn.init.zeros_(self.bias)

    def forward(self, x):
        return self.conv(x) * self.scale + self.bias.view(1, self.bias.shape[0], 1, 1)


class PixelNorm(nn.Module):
    def __init__(self):
        super(PixelNorm, self).__init__()
        self.epsilon = 1e-8

    def forward(self, x):
        return x / torch.sqrt(torch.mean(x ** 2, dim=1, keepdim=True) + self.epsilon)


class ConvBlock(nn.Module):
    def __init__(self, in_channels, out_channels, use_pixelnorm=True):
        super(ConvBlock, self).__init__()
        self.use_pn = use_pixelnorm
        self.conv1 = WSConv2d(in_channels, out_channels)
        self.conv2 = WSConv2d(out_channels, out_channels)
        self.leaky = nn.LeakyReLU(0.2)
        self.pn = PixelNorm()

    def forward(self, x):
        x = self.leaky(self.conv1(x))
        x = self.pn(x) if self.use_pn else x
        x = self.leaky(self.conv2(x))
        x = self.pn(x) if self.use_pn else x
        return x
```
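As a quick standalone check of the `PixelNorm` formula above (a sketch using only the standard library, whereas the model code itself uses torch): normalizing one pixel's channel vector by its root mean square leaves that vector with unit RMS, which is exactly what `PixelNorm.forward` does per spatial position.

```python
import math
import random

random.seed(0)
epsilon = 1e-8

# one pixel's values across 8 feature channels
channels = [random.gauss(0.0, 1.0) for _ in range(8)]

# PixelNorm: x / sqrt(mean(x**2 over channels) + epsilon)
norm = math.sqrt(sum(c * c for c in channels) / len(channels) + epsilon)
normalized = [c / norm for c in channels]

# the normalized vector now has (approximately) unit RMS
rms = math.sqrt(sum(c * c for c in normalized) / len(normalized))
print(round(rms, 6))  # 1.0
```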
```python
class Generator(nn.Module):
    def __init__(self, z_dim, in_channels, img_channels=1):
        super(Generator, self).__init__()

        # initial takes 1x1 -> 4x4
        self.initial = nn.Sequential(
            PixelNorm(),
            nn.ConvTranspose2d(z_dim, in_channels, 4, 1, 0),
            nn.LeakyReLU(0.2),
            WSConv2d(in_channels, in_channels, kernel_size=3, stride=1, padding=1),
            nn.LeakyReLU(0.2),
            PixelNorm(),
        )

        self.initial_rgb = WSConv2d(
            in_channels, img_channels, kernel_size=1, stride=1, padding=0
        )
        self.prog_blocks, self.rgb_layers = (
            nn.ModuleList([]),
            nn.ModuleList([self.initial_rgb]),
        )

        for i in range(
            len(factors) - 1
        ):  # -1 to prevent index error because of factors[i+1]
            conv_in_c = int(in_channels * factors[i])
            conv_out_c = int(in_channels * factors[i + 1])
            self.prog_blocks.append(ConvBlock(conv_in_c, conv_out_c))
            self.rgb_layers.append(
                WSConv2d(conv_out_c, img_channels, kernel_size=1, stride=1, padding=0)
            )

    def fade_in(self, alpha, upscaled, generated):
        # alpha should be a scalar within [0, 1], and upscaled.shape == generated.shape
        return torch.tanh(alpha * generated + (1 - alpha) * upscaled)

    def forward(self, x, alpha, steps):
        out = self.initial(x)

        if steps == 0:
            return self.initial_rgb(out)

        for step in range(steps):
            upscaled = F.interpolate(out, scale_factor=2, mode="nearest")
            out = self.prog_blocks[step](upscaled)

        final_upscaled = self.rgb_layers[steps - 1](upscaled)
        final_out = self.rgb_layers[steps](out)
        return self.fade_in(alpha, final_upscaled, final_out)


class Discriminator(nn.Module):
    def __init__(self, z_dim, in_channels, img_channels=1):
        # z_dim is unused here; kept for signature symmetry with Generator
        super(Discriminator, self).__init__()
        self.prog_blocks, self.rgb_layers = nn.ModuleList([]), nn.ModuleList([])
        self.leaky = nn.LeakyReLU(0.2)

        for i in range(len(factors) - 1, 0, -1):
            conv_in = int(in_channels * factors[i])
            conv_out = int(in_channels * factors[i - 1])
            self.prog_blocks.append(ConvBlock(conv_in, conv_out, use_pixelnorm=False))
            self.rgb_layers.append(
                WSConv2d(img_channels, conv_in, kernel_size=1, stride=1, padding=0)
            )

        # "initial_rgb" is perhaps a confusing name: this is just the RGB layer
        # for the 4x4 input size, named to "mirror" the generator's initial_rgb
        self.initial_rgb = WSConv2d(
            img_channels, in_channels, kernel_size=1, stride=1, padding=0
        )
        self.rgb_layers.append(self.initial_rgb)
        self.avg_pool = nn.AvgPool2d(
            kernel_size=2, stride=2
        )  # down sampling using avg pool

        # this is the block for 4x4 input size
        self.final_block = nn.Sequential(
            # +1 to in_channels because we concatenate from MiniBatch std
            WSConv2d(in_channels + 1, in_channels, kernel_size=3, padding=1),
            nn.LeakyReLU(0.2),
            WSConv2d(in_channels, in_channels, kernel_size=4, padding=0, stride=1),
            nn.LeakyReLU(0.2),
            WSConv2d(
                in_channels, 1, kernel_size=1, padding=0, stride=1
            ),  # we use this instead of a linear layer
        )

    def fade_in(self, alpha, downscaled, out):
        """Used to fade in the avg-pooled downscaled input and the CNN output"""
        # alpha should be a scalar within [0, 1], and downscaled.shape == out.shape
        return alpha * out + (1 - alpha) * downscaled

    def minibatch_std(self, x):
        batch_statistics = torch.std(x, dim=0, unbiased=False).mean()
        batch_statistics = batch_statistics.repeat(x.shape[0], 1, x.shape[2], x.shape[3])
        # we take the std for each example (across all channels and pixels), then
        # repeat it as a single channel and concatenate it with the image. This way
        # the discriminator gets information about the variation in the batch/image
        return torch.cat([x, batch_statistics], dim=1)

    def forward(self, x, alpha, steps):
        # where we should start in the list of prog_blocks; maybe a bit confusing,
        # but the last block is for 4x4. So, for example, if steps=1 we should start
        # at the second to last, because input_size will be 8x8. If steps==0 we just
        # use the final block
        cur_step = len(self.prog_blocks) - steps

        # convert from rgb as the initial step; this depends on
        # the image size (each size has its own rgb layer)
        out = self.leaky(self.rgb_layers[cur_step](x))

        if steps == 0:  # i.e., image is 4x4
            out = self.minibatch_std(out)
            return self.final_block(out).view(out.shape[0], -1)

        # because prog_blocks might change the channels, for downscaling we use the
        # rgb_layer from the previous/smaller size, which here corresponds to +1 in
        # the indexing
        downscaled = self.leaky(self.rgb_layers[cur_step + 1](self.avg_pool(x)))
        out = self.avg_pool(self.prog_blocks[cur_step](out))

        # the fade_in is done first, between the downscaled input and the block
        # output; this is the opposite order from the generator
        out = self.fade_in(alpha, downscaled, out)

        for step in range(cur_step + 1, len(self.prog_blocks)):
            out = self.prog_blocks[step](out)
            out = self.avg_pool(out)

        out = self.minibatch_std(out)
        return self.final_block(out).view(out.shape[0], -1)
```
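The `factors` list at the top of the file fixes the channel width at each resolution: `rgb_layers[s]` operates on `int(in_channels * factors[s])` channels at resolution `4 * 2**s`. A small sketch of the resulting widths, assuming the base width `in_channels=512` (the paper's value; the width actually used for these checkpoints is not stated here):

```python
factors = [1, 1, 1, 1, 1 / 2, 1 / 4, 1 / 8, 1 / 16, 1 / 32]
in_channels = 512  # assumed base width; verify against the training config

widths = [int(in_channels * f) for f in factors]
for s, w in enumerate(widths):
    resolution = 4 * 2 ** s  # 4x4 at stage 0, doubling each stage
    print(f"{resolution:>4}x{resolution:<4}: {w} channels")
```

With nine factors the architecture can grow to 1024x1024, although the released `*_size_512_*` checkpoints stop at 512x512 (i.e. `steps=7`).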
requirements.txt
ADDED
Pillow==10.4.0
customtkinter==5.2.2
matplotlib==3.9.2
numpy==2.1.3
opencv-python==4.11.0
scipy==1.14.1
torch==2.5.1+cu124
torchmetrics==1.7.1
torchvision==0.20.1+cu124
tqdm==4.67.0