---
tags:
- autoencoder
- image-colorization
- pytorch
- pytorch_model_hub_mixin
---

# Model Colorization Autoencoder

## Model Description

This autoencoder model is designed for image colorization. It takes grayscale images as input and outputs colorized versions of those images. The model architecture consists of an encoder-decoder structure, where the encoder compresses the input image into a latent representation, and the decoder reconstructs the image in color.

### Architecture

- **Encoder**: The encoder comprises three convolutional layers followed by max pooling and ReLU activations, each paired with batch normalization. It ends with a flattening layer and a fully connected layer to produce a latent vector.
- **Decoder**: The decoder mirrors the encoder, using linear and transposed convolutional layers with ReLU activations and batch normalization. The final layer outputs a color image using a sigmoid activation function.
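
The layer sizes in the class definition in this card pin down the spatial dimensions the model expects. As a rough check, the standard output-size formulas for convolution, pooling, and transposed convolution can be applied by hand. The helper names below are illustrative only, and the 360×360 input size is an inference from the `16*45*45` flattened dimension, not something the card states outright:

```python
def conv2d_out(size, kernel, stride, padding):
    """Output side length of nn.Conv2d / nn.MaxPool2d (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_transpose2d_out(size, kernel, stride, padding, output_padding):
    """Output side length of nn.ConvTranspose2d (dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


def encoder_side(size):
    """Side length after the three conv + max-pool stages of the encoder."""
    for _ in range(3):
        size = conv2d_out(size, kernel=3, stride=1, padding=1)  # 3x3 conv keeps the size
        size = conv2d_out(size, kernel=2, stride=2, padding=0)  # 2x2 max pool halves it
    return size


def decoder_side(size):
    """Side length after the three transposed-conv stages of the decoder."""
    for _ in range(3):
        size = conv_transpose2d_out(size, kernel=3, stride=2, padding=1, output_padding=1)
    return size


side = 360  # assumed input side length (inferred, see above)
latent_side = encoder_side(side)
print(latent_side, 16 * latent_side**2)  # 45 32400
print(decoder_side(latent_side))         # 360
```

Under that assumption, a `1×360×360` grayscale input is reduced to a `16×45×45` feature map (32,400 values) before the linear layer, and the decoder doubles the side length three times to return a `3×360×360` image.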

The architecture details are as follows:
```python
import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin


class ModelColorization(nn.Module, PyTorchModelHubMixin):
    def __init__(self):
        super().__init__()
        # Encoder: three conv + max-pool stages, each halving the spatial size,
        # followed by a linear projection to a 4000-dimensional latent vector.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=3, stride=1, padding=1),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.ReLU(),
            nn.BatchNorm2d(64),
            nn.Conv2d(64, 32, kernel_size=3, stride=1, padding=1),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.ReLU(),
            nn.BatchNorm2d(32),
            nn.Conv2d(32, 16, kernel_size=3, stride=1, padding=1),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.ReLU(),
            nn.BatchNorm2d(16),
            nn.Flatten(),
            nn.Linear(16 * 45 * 45, 4000),
        )
        # Decoder: expand the latent vector back to a 16x45x45 feature map,
        # then double the spatial size three times with transposed convolutions.
        self.decoder = nn.Sequential(
            nn.Linear(4000, 16 * 45 * 45),
            nn.ReLU(),
            nn.Unflatten(1, (16, 45, 45)),
            nn.ConvTranspose2d(16, 32, kernel_size=3, stride=2, padding=1, output_padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(32),
            nn.ConvTranspose2d(32, 64, kernel_size=3, stride=2, padding=1, output_padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(64),
            nn.ConvTranspose2d(64, 3, kernel_size=3, stride=2, padding=1, output_padding=1),
            nn.Sigmoid(),  # RGB output with values in [0, 1]
        )

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return x
```

### Training Details
The model was trained using PyTorch for 5 epochs. The training and validation losses observed after each epoch were:

- Epoch 1: Training Loss: 0.0063, Validation Loss: 0.0042
- Epoch 2: Training Loss: 0.0036, Validation Loss: 0.0035
- Epoch 3: Training Loss: 0.0032, Validation Loss: 0.0032
- Epoch 4: Training Loss: 0.0030, Validation Loss: 0.0030
- Epoch 5: Training Loss: 0.0029, Validation Loss: 0.0030

Training loss decreased in every epoch. Validation loss fell from 0.0042 to 0.0030 and then held at 0.0030 for the last two epochs, which suggests it had begun to plateau.
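
To make the trend in these numbers easier to read, the epoch-over-epoch relative change can be computed directly from the reported losses. This is only arithmetic on the values listed above; the `relative_changes` helper is an illustrative name, not part of the model's code:

```python
# Losses copied from the per-epoch figures reported above.
train = [0.0063, 0.0036, 0.0032, 0.0030, 0.0029]
val = [0.0042, 0.0035, 0.0032, 0.0030, 0.0030]


def relative_changes(losses):
    """Fractional change from each epoch to the next (negative = improvement)."""
    return [(b - a) / a for a, b in zip(losses, losses[1:])]


# Print the change going into epochs 2 through 5.
changes = zip(relative_changes(train), relative_changes(val))
for epoch, (t, v) in enumerate(changes, start=2):
    print(f"epoch {epoch}: train {t:+.1%}, val {v:+.1%}")
```

Most of the improvement happens between epochs 1 and 2 (roughly a 43% drop in training loss), and the validation loss does not change at all between epochs 4 and 5.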

### Usage
You can load the model from the Hugging Face Hub using the following code:

```
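# Install the dependencies first (shell command, not Python):
#   pip install torch huggingface_hub

# Models saved with PyTorchModelHubMixin are loaded through their own class
# rather than transformers' AutoModel, so the ModelColorization definition
# from the Architecture section must be in scope.
model = ModelColorization.from_pretrained("sebastiansarasti/AutoEncoderImageColorization")
model.eval()  # inference mode: BatchNorm layers use their running statistics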
```
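
Once the model is loaded, inference amounts to adding batch and channel dimensions to a grayscale image and running a forward pass. The sketch below is a minimal example under stated assumptions: the `colorize` helper is hypothetical, the 360×360 input size is inferred from the layer dimensions, and scaling the input to `[0, 1]` is a guess, since the card does not describe the preprocessing used in training:

```python
import torch
from torch import nn


def colorize(model: nn.Module, gray: torch.Tensor) -> torch.Tensor:
    """Run one grayscale image through the model and return an RGB tensor.

    `gray` is assumed to be a float tensor of shape (360, 360) scaled to
    [0, 1]; that scaling is an assumption, not something the card specifies.
    """
    if gray.shape != (360, 360):
        raise ValueError(f"expected a 360x360 image, got {tuple(gray.shape)}")
    batch = gray.unsqueeze(0).unsqueeze(0)  # (360, 360) -> (1, 1, 360, 360)
    model.eval()  # BatchNorm layers use running statistics at inference time
    with torch.no_grad():  # no gradients are needed for inference
        rgb = model(batch)
    return rgb.squeeze(0)  # drop the batch dimension
```

With the colorization model, `colorize(model, gray)` should return a tensor of shape `(3, 360, 360)` with values in `[0, 1]` because of the final sigmoid, which can then be converted to an image with the tooling of your choice.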