license: cc-by-nc-sa-4.0
pipeline_tag: image-to-image
tags:
  - image-to-image
  - shadow-removal
  - albedo-estimation
  - relighting

[ICLR 2026] CroCoDiLight: Repurposing Cross-View Completion Encoders for Relighting

CroCoDiLight disentangles illumination from scene content in CroCo (Cross-view Completion) latent representations. A learned lighting extractor separates each encoder embedding into a single lighting vector and lighting-invariant patch features, which can then be recombined with target lighting conditions. This enables shadow removal, albedo estimation, lighting transfer, and lighting interpolation, with training on datasets two orders of magnitude smaller than those used for the original CroCo pretraining.

Paper: OpenReview (ICLR 2026), https://openreview.net/forum?id=GKvb3HCyNk
Code: GitHub
Project Page: alistairfoggin.com/projects/crocodilight/

Pretrained Model Weights

| File | Required for | Description |
|---|---|---|
| **Inference** | | |
| `CroCoDiLight.pth` | All inference scripts | Full CroCoDiLight model (includes the CroCo encoder, single-view decoder, lighting extractor, and lighting entangler) |
| `CroCoDiLight_shadow_mapper.pth` | Shadow removal | Lighting mapper trained for shadow removal |
| `CroCoDiLight_albedo_mapper.pth` | Albedo estimation | Lighting mapper trained for intrinsic image decomposition |
| **Training** | | |
| `CroCoDiLight_decoder.pth` | Training of `CroCoDiLight.pth` | The pretrained monocular decoder for the CroCo v2 encoder |

`CroCoDiLight.pth` is the base model needed by every inference and evaluation script. The mapper weights are needed only for their respective tasks: shadow removal and albedo estimation. Lighting transfer, freezing, and interpolation use the base model only.
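
The file-to-task mapping above can be written down as a small lookup, which may be handy when scripting downloads. This is a minimal sketch, not part of the official code: the task labels (`shadow_removal`, `lighting_transfer`, and so on) and both function names are hypothetical, and the Hub repository id is not stated here, so it is left as a parameter you must supply. Only the file names come from the table.

```python
# File names are taken from the weights table; everything else is illustrative.
BASE_WEIGHTS = "CroCoDiLight.pth"

# Hypothetical task labels mapped to the extra mapper file each task needs.
MAPPER_WEIGHTS = {
    "shadow_removal": "CroCoDiLight_shadow_mapper.pth",
    "albedo_estimation": "CroCoDiLight_albedo_mapper.pth",
}

# Hypothetical labels for the tasks that use the base model only.
BASE_ONLY_TASKS = {"lighting_transfer", "lighting_freezing", "lighting_interpolation"}


def required_weights(task):
    """Return the list of weight files needed to run inference for `task`."""
    if task in MAPPER_WEIGHTS:
        return [BASE_WEIGHTS, MAPPER_WEIGHTS[task]]
    if task in BASE_ONLY_TASKS:
        return [BASE_WEIGHTS]
    raise ValueError(f"Unknown task: {task!r}")


def download_weights(repo_id, task):
    """Download the files for `task` from the Hugging Face Hub.

    `repo_id` is an assumption you must fill in; it is not given in this README.
    Requires the `huggingface_hub` package, imported lazily so that
    `required_weights` works without it.
    """
    from huggingface_hub import hf_hub_download

    return [hf_hub_download(repo_id=repo_id, filename=name)
            for name in required_weights(task)]
```

Calling `required_weights("shadow_removal")` returns the base model plus the shadow mapper, while any of the base-only tasks returns just `CroCoDiLight.pth`. The inference scripts in the GitHub repository remain the authoritative way to load and run the model.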

`CroCoDiLight_decoder.pth` is not needed for inference because the decoder is already embedded in `CroCoDiLight.pth`. It can, however, be used as a standalone decoder for the CroCo v2 ViTLarge encoder, which is also embedded in the model weights.
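
To check which components a checkpoint actually contains, one option is to group its keys by top-level prefix. The sketch below assumes the `.pth` file loads as a dictionary with string keys (for example a PyTorch state dict). The internal layout of these checkpoints is not documented here, so treat this as an exploratory aid; both function names are hypothetical.

```python
from collections import Counter


def summarize_prefixes(state_dict):
    """Count entries per top-level key prefix (the text before the first '.')."""
    return dict(Counter(key.split(".", 1)[0] for key in state_dict))


def load_checkpoint(path):
    """Load a checkpoint onto the CPU.

    Assumes a standard PyTorch checkpoint; PyTorch is imported lazily so
    `summarize_prefixes` can be used without it.
    """
    import torch

    return torch.load(path, map_location="cpu")
```

For example, `summarize_prefixes(load_checkpoint("CroCoDiLight.pth"))` would list the top-level groups found in the base model, assuming the file has been downloaded to the working directory and loads as a dictionary.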

Usage

See the GitHub repository for setup instructions, inference scripts, Gradio demos, training, and evaluation.

Citation BibTeX

@inproceedings{foggin2026crocodilight,
  title={{CroCoDiLight}: Repurposing Cross-View Completion Encoders for Relighting},
  author={Foggin, Alistair J and Smith, William A P},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=GKvb3HCyNk}
}

License

This project, including its source code and pretrained model weights, is licensed under CC BY-NC-SA 4.0. The pretrained weights are additionally subject to the license terms of the upstream training data documented in the NOTICE file.

Acknowledgements

CroCoDiLight builds on CroCo (Weinzaepfel et al.), licensed under CC BY-NC-SA 4.0 by Naver Corporation.

Model training was performed on the Viking cluster, a high performance compute facility provided by the University of York. We are grateful for computational support from the University of York, IT Services and the Research IT team.