Ditun
U-shaped transformer model in CIELAB color space. The model reconstructs the input image.
- LAB input, RGB output
- 8 channel latent
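
To make the LAB input concrete, here is a minimal sketch of the standard sRGB to CIELAB conversion for a single pixel. It assumes 8-bit sRGB input and a D65 white point. The helper name `srgb_to_lab` is hypothetical and this is not necessarily the preprocessing Ditun itself uses.

```python
# Hypothetical helper (not from the Ditun code base): convert one 8-bit sRGB
# pixel to CIELAB, assuming the D65 reference white.
def srgb_to_lab(r, g, b):
    def to_linear(c):
        # Undo the sRGB transfer curve (gamma) to get linear light in [0, 1].
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = to_linear(r), to_linear(g), to_linear(b)

    # Linear sRGB -> CIE XYZ (D65).
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    # D65 reference white.
    xn, yn, zn = 0.95047, 1.0, 1.08883

    def f(t):
        # Cube root with a linear segment near zero, as in the CIELAB definition.
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    lightness = 116 * fy - 16   # L*: 0 (black) to 100 (white)
    a = 500 * (fx - fy)         # a*: green (-) to red (+)
    b_star = 200 * (fy - fz)    # b*: blue (-) to yellow (+)
    return lightness, a, b_star
```

For example, `srgb_to_lab(255, 255, 255)` gives a lightness close to 100 with a* and b* close to 0, and `srgb_to_lab(0, 0, 0)` gives a lightness of 0.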
The upsampling layers generate images at different resolutions:
- heatmap from labels (as in CLIP retrieval)
- lightness
- saturation
- edge detection
- RGB image
- optionally, one of the Marigold outputs
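
As an illustration of what the saturation and edge outputs could look like, the sketch below computes reference maps from an image. It assumes HSV saturation and a Sobel operator. The source does not say which definitions Ditun uses, and the function names `saturation_map` and `sobel_edges` are hypothetical.

```python
import colorsys

# Hypothetical helpers (assumptions, not taken from the Ditun code base).

def saturation_map(rgb):
    """HSV saturation per pixel; rgb is a list of rows of (r, g, b) floats in [0, 1]."""
    return [[colorsys.rgb_to_hsv(*px)[1] for px in row] for row in rgb]

def sobel_edges(gray):
    """Sobel gradient magnitude for a 2D list of floats; border pixels stay 0."""
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out
```

A lightness map needs no separate helper under these assumptions, since it is the L* channel of the LAB input.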
The model prioritizes color accuracy for both digital and traditional artworks.
Datasets
- Pixiv_1024