ayushshah committed on
Commit 9fc4849 · verified · 1 Parent(s): ff4f6fc

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -19,7 +19,7 @@ A UNet architecture, utilizing transfer learning by using a pretrained ResNet-34
  Try the model on [Google Colab](https://colab.research.google.com/drive/1JYbSLtDuFSw2NYe-YW-kZHLNkt4-v7jd) or [Huggingface space](https://huggingface.co/spaces/ayushshah/imagecolorization).
  <img src="https://cdn-uploads.huggingface.co/production/uploads/6318256d212fce5a3cde0fe3/To7FVLusBz1kl8g9HhPIf.png" width="800px"/>
 
- The model takes a 1x224x224 L tensor as input and outputs 2x224x224 ab channels. The decoder has been trained from scratch. The encoder (ResNet-34) was initially frozen for the decoder to adapt to the task, then it was progressively unfreezed layer by layer. Initial layers were not unfreezed, only deeper layers were fine-tuned. Read various research papers. It took 20+ hours of training on Google Colab and Kaggle T4 GPUs to train the model.
+ The model takes a 1x224x224 L tensor as input and outputs 2x224x224 ab channels. The decoder was trained from scratch. The encoder (ResNet-34) was initially frozen so the decoder could adapt to the task, then it was progressively unfrozen layer by layer: the initial layers stayed frozen and only the deeper layers were fine-tuned, following common practice in the transfer-learning literature. Training took 20+ hours on Google Colab and Kaggle T4 GPUs.
 
  There are no dedicated datasets for image colorisation, so I curated my own dataset and used it to train the model. The COCO 2017 dataset was filtered to remove grayscale images, heavily filtered images, and other artifacts unsuitable for training a natural colorization model. The images were also center-cropped and resized to 224x224. The dataset can be found [here](https://huggingface.co/datasets/ayushshah/coco-2017-image-colorization-224). This repository contains the model weights and the UNet architecture to load the weights into.
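
The input/output contract described in the diff (1x224x224 L channel in, 2x224x224 ab channels out) can be sketched as follows. This is a minimal stand-in, not the repository's actual UNet: the class name, the single conv layer, and the tanh output range are my assumptions for illustrating the tensor shapes only.

```python
import torch
import torch.nn as nn

class ToyColorizer(nn.Module):
    """Stand-in for the repo's UNet (hypothetical; illustrates the interface
    only): maps a 1x224x224 L channel to 2x224x224 ab channels."""

    def __init__(self):
        super().__init__()
        self.head = nn.Conv2d(1, 2, kernel_size=3, padding=1)

    def forward(self, l):
        # tanh keeps predictions in [-1, 1]; rescale toward Lab's ab range
        # (roughly [-128, 127]) when converting back to RGB.
        return torch.tanh(self.head(l))

model = ToyColorizer().eval()
l = torch.rand(1, 1, 224, 224)   # batch of one normalized L channel
with torch.no_grad():
    ab = model(l)
print(tuple(ab.shape))           # (1, 2, 224, 224)
```

Concatenating the input L channel with the predicted ab channels yields a full 3x224x224 Lab image ready for conversion back to RGB.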
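
The dataset preprocessing mentioned above (center-crop, then resize to 224x224) can be sketched with Pillow; the function name is mine, not from the repo or the dataset card.

```python
from PIL import Image

def center_crop_resize(img, size=224):
    """Center-crop to the largest square, then resize to size x size
    (hypothetical helper mirroring the preprocessing described above)."""
    w, h = img.size
    s = min(w, h)
    left, top = (w - s) // 2, (h - s) // 2
    return img.crop((left, top, left + s, top + s)).resize((size, size))

img = Image.new("RGB", (640, 480))   # stand-in for a COCO photo
out = center_crop_resize(img)
print(out.size)                      # (224, 224)
```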