license: cc-by-nc-sa-4.0
pipeline_tag: image-to-image
tags:
- image-to-image
- shadow-removal
- albedo-estimation
- relighting
[ICLR 2026] CroCoDiLight: Repurposing Cross-View Completion Encoders for Relighting
CroCoDiLight disentangles illumination from scene content in the latent representations of CroCo (Cross-view Completion). A learned lighting extractor separates each encoder embedding into a single lighting vector and lighting-invariant patch features, which can then be recombined with target lighting conditions. This enables shadow removal, albedo estimation, lighting transfer, and lighting interpolation, using training datasets two orders of magnitude smaller than those used for the original CroCo pretraining.
Paper: OpenReview (ICLR 2026), https://openreview.net/forum?id=GKvb3HCyNk
Code: GitHub
Project Page: alistairfoggin.com/projects/crocodilight/
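The separate-and-recombine idea described above can be illustrated with a deliberately simplified sketch. It is not the CroCoDiLight architecture: the real lighting extractor and entangler are learned networks, whereas here the "lighting vector" is assumed to be just the per-channel mean of the patch features, and interpolation is assumed to be linear. All function names (`extract`, `entangle`, `interpolate`) are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Vector = List[float]
Patches = List[Vector]  # one feature vector per image patch


def extract(embedding: Patches) -> Tuple[Vector, Patches]:
    """Toy stand-in for the learned lighting extractor (ASSUMPTION).

    Splits an embedding into one global vector (here: the per-channel mean)
    and per-patch residual features with that vector removed.
    """
    n, dim = len(embedding), len(embedding[0])
    lighting = [sum(p[c] for p in embedding) / n for c in range(dim)]
    invariant = [[p[c] - lighting[c] for c in range(dim)] for p in embedding]
    return lighting, invariant


def entangle(invariant: Patches, lighting: Vector) -> Patches:
    """Toy stand-in for the lighting entangler (ASSUMPTION).

    Recombines patch features with a (possibly different) lighting vector.
    """
    return [[v + l for v, l in zip(p, lighting)] for p in invariant]


def interpolate(a: Vector, b: Vector, t: float) -> Vector:
    """Linear blend between two lighting vectors (ASSUMPTION: linear)."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]


# Lighting transfer in this toy setting: keep the content of one embedding
# and recombine it with the lighting vector taken from another.
content_embedding = [[1.0, 2.0], [3.0, 4.0]]
reference_embedding = [[10.0, 20.0], [10.0, 20.0]]

_, content_patches = extract(content_embedding)
reference_lighting, _ = extract(reference_embedding)
relit = entangle(content_patches, reference_lighting)
```

In this sketch, recombining the patch features with their own lighting vector reproduces the input embedding, while swapping in a different vector changes only the global component. The learned modules in the model play the same structural roles but are not additive mean operations.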
Pretrained Model Weights
| File | Required for | Description |
|---|---|---|
| Inference | | |
| CroCoDiLight.pth | All inference scripts | Full CroCoDiLight model (includes the CroCo encoder, single-view decoder, lighting extractor, and lighting entangler) |
| CroCoDiLight_shadow_mapper.pth | Shadow removal | Lighting mapper trained for shadow removal |
| CroCoDiLight_albedo_mapper.pth | Albedo estimation | Lighting mapper trained for intrinsic image decomposition |
| Training | | |
| CroCoDiLight_decoder.pth | Training of CroCoDiLight.pth | The pretrained monocular decoder for the CroCo v2 encoder |
CroCoDiLight.pth is the base model needed by every inference and evaluation script. The mapper weights are only needed for their respective tasks. Lighting transfer, freezing, and interpolation use the base model only.
CroCoDiLight_decoder.pth is not needed for inference because the decoder is already embedded in CroCoDiLight.pth. It can, however, be used as a standalone decoder for the CroCo v2 ViTLarge encoder, which is also embedded in the model weights.
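As a quick reference, the file requirements in the table and notes above can be written as a small lookup. This is only a sketch: the file names come from the table, but the task keys and the `required_weights` helper are hypothetical and are not part of the repository's scripts.

```python
from typing import Dict, List

# Base checkpoint needed by every inference and evaluation script.
BASE_WEIGHTS = "CroCoDiLight.pth"

# Extra mapper weights per task. Task names are hypothetical labels;
# the file names are those listed in the weights table.
TASK_MAPPERS: Dict[str, List[str]] = {
    "shadow_removal": ["CroCoDiLight_shadow_mapper.pth"],
    "albedo_estimation": ["CroCoDiLight_albedo_mapper.pth"],
    "lighting_transfer": [],       # base model only
    "lighting_freezing": [],       # base model only
    "lighting_interpolation": [],  # base model only
}


def required_weights(task: str) -> List[str]:
    """Return the checkpoint files to download for a given inference task."""
    if task not in TASK_MAPPERS:
        raise ValueError(f"Unknown task: {task!r}")
    return [BASE_WEIGHTS] + TASK_MAPPERS[task]
```

For example, `required_weights("shadow_removal")` lists the base model plus the shadow mapper, while `required_weights("lighting_transfer")` lists the base model alone. CroCoDiLight_decoder.pth is intentionally absent because it is only used when training the base model.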
Usage
See the GitHub repository for setup instructions, inference scripts, Gradio demos, training, and evaluation.
Citation (BibTeX)
@inproceedings{foggin2026crocodilight,
title={{CroCoDiLight}: Repurposing Cross-View Completion Encoders for Relighting},
author={Foggin, Alistair J and Smith, William A P},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=GKvb3HCyNk}
}
License
This project, including its source code and pretrained model weights, is licensed under CC BY-NC-SA 4.0. The pretrained weights are additionally subject to the license terms of the upstream training data documented in the NOTICE file.
Acknowledgements
CroCoDiLight builds on CroCo (Weinzaepfel et al.), licensed under CC BY-NC-SA 4.0 by Naver Corporation.
Model training was performed on the Viking cluster, a high performance compute facility provided by the University of York. We are grateful for computational support from the University of York, IT Services and the Research IT team.