---
tags:
- autoencoder
- image-colorization
- pytorch
- pytorch_model_hub_mixin
---
# Model Colorization Autoencoder
## Model Description
This autoencoder model is designed for image colorization. It takes grayscale images as input and outputs colorized versions of those images. The model architecture consists of an encoder-decoder structure, where the encoder compresses the input image into a latent representation, and the decoder reconstructs the image in color.
### Architecture
- **Encoder**: The encoder comprises three convolutional layers followed by max pooling and ReLU activations, each paired with batch normalization. It ends with a flattening layer and a fully connected layer to produce a latent vector.
- **Decoder**: The decoder mirrors the encoder, using linear and transposed convolutional layers with ReLU activations and batch normalization. The final layer outputs a color image using a sigmoid activation function.
The architecture details are as follows:
```python
import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin


class ModelColorization(nn.Module, PyTorchModelHubMixin):
    def __init__(self):
        super().__init__()
        # Encoder: three conv blocks (conv -> maxpool -> ReLU -> batchnorm),
        # then flatten into a 4000-dimensional latent vector.
        # Expects 1 x 360 x 360 grayscale input (three 2x poolings: 360 -> 45).
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=3, stride=1, padding=1),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.ReLU(),
            nn.BatchNorm2d(64),
            nn.Conv2d(64, 32, kernel_size=3, stride=1, padding=1),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.ReLU(),
            nn.BatchNorm2d(32),
            nn.Conv2d(32, 16, kernel_size=3, stride=1, padding=1),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.ReLU(),
            nn.BatchNorm2d(16),
            nn.Flatten(),
            nn.Linear(16 * 45 * 45, 4000),
        )
        # Decoder: project back to a 16 x 45 x 45 feature map, then upsample
        # with three transposed convolutions to a 3 x 360 x 360 RGB image,
        # with sigmoid keeping outputs in [0, 1].
        self.decoder = nn.Sequential(
            nn.Linear(4000, 16 * 45 * 45),
            nn.ReLU(),
            nn.Unflatten(1, (16, 45, 45)),
            nn.ConvTranspose2d(16, 32, kernel_size=3, stride=2, padding=1, output_padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(32),
            nn.ConvTranspose2d(32, 64, kernel_size=3, stride=2, padding=1, output_padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(64),
            nn.ConvTranspose2d(64, 3, kernel_size=3, stride=2, padding=1, output_padding=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return x
```
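As a quick sanity check of the layer shapes, a forward pass on a dummy batch confirms the sizes the architecture implies: the `16*45*45` flatten corresponds to a 360×360 input, since three 2× poolings reduce 360 to 45.

```python
import torch

model = ModelColorization()
model.eval()

dummy = torch.randn(1, 1, 360, 360)  # one grayscale image
with torch.no_grad():
    out = model(dummy)
print(out.shape)  # torch.Size([1, 3, 360, 360])
```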
### Training Details
The model was trained with PyTorch for 5 epochs. The training and validation losses per epoch were:

| Epoch | Training Loss | Validation Loss |
|-------|---------------|-----------------|
| 1     | 0.0063        | 0.0042          |
| 2     | 0.0036        | 0.0035          |
| 3     | 0.0032        | 0.0032          |
| 4     | 0.0030        | 0.0030          |
| 5     | 0.0029        | 0.0030          |

Training loss decreased steadily across all five epochs, while validation loss fell in step and plateaued at 0.0030 by epoch 4, with no sign of overfitting.
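The training script is not included in this card. The sketch below shows one loop consistent with this setup, assuming an MSE reconstruction loss and the Adam optimizer (neither is stated in the card), with synthetic tensors standing in for a real grayscale/color image dataset:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in data; replace with a real dataset of
# (grayscale, color) image pairs at 360 x 360 resolution.
gray = torch.rand(8, 1, 360, 360)
color = torch.rand(8, 3, 360, 360)
train_loader = DataLoader(TensorDataset(gray, color), batch_size=4)

model = ModelColorization()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # assumed; not stated in the card
criterion = nn.MSELoss()  # assumed reconstruction loss; not stated in the card

for epoch in range(5):
    model.train()
    running = 0.0
    for x, y in train_loader:
        optimizer.zero_grad()
        pred = model(x)
        loss = criterion(pred, y)
        loss.backward()
        optimizer.step()
        running += loss.item() * x.size(0)
    print(f"epoch {epoch + 1}: train loss {running / len(train_loader.dataset):.4f}")
```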
### Usage
You can load the model from the Hugging Face Hub using the following code:
First, install the dependencies:

```bash
pip install torch torchvision huggingface_hub
```

Because `ModelColorization` inherits from `PyTorchModelHubMixin`, it can be loaded directly with the mixin's `from_pretrained` method:

```python
# ModelColorization must be defined (or imported) as shown above
model = ModelColorization.from_pretrained("sebastiansarasti/AutoEncoderImageColorization")
```
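For inference, the input must be a single-channel tensor at the 360×360 resolution the encoder expects (inferred from the architecture; the card does not state the training resolution). A minimal sketch, with `photo.jpg` as a placeholder path:

```python
import torch
from PIL import Image
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Grayscale(num_output_channels=1),
    transforms.Resize((360, 360)),  # input size inferred from the 16*45*45 flatten
    transforms.ToTensor(),
])

image = Image.open("photo.jpg")       # placeholder path
x = preprocess(image).unsqueeze(0)    # (1, 1, 360, 360)

model.eval()
with torch.no_grad():
    colorized = model(x)              # (1, 3, 360, 360), values in [0, 1]

transforms.ToPILImage()(colorized.squeeze(0)).save("colorized.jpg")
```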