ALI: RADIO 2.5-H relighting ControlNet (stage 2)

Reference-based image relighting: given an input image and a reference image, re-render the input's content under the reference's lighting. Built on Stable Diffusion 2.1 ControlNet with a frozen Latent-Intrinsics prior whose intrinsic features are enriched by a frozen RADIO v2.5-H vision foundation model through a trained multi-scale adapter.

This checkpoint is the stage-2 model (check_model_epoch=01.ckpt). Stage 2 trains the ControlNet and the SD cross-attention layers with the intrinsics prior frozen; it follows a stage-1 round that trained the VFM adapter and the light decoders.

Training

  • Backbone: radio_v2.5-h (frozen), Latent-Intrinsics UNet (frozen, last.pth.tar).
  • Data: BigTime + Multi-Illumination (MIIW) multi-lighting pairs.
  • Stages: stage 1 (prior/adapter) → stage 2 (ControlNet + cross-attn).
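
The two-stage schedule amounts to switching which parameter groups receive gradients. The following is a minimal sketch of that idea, not the release code: the group keys ("vfm_adapter", "light_decoder", "controlnet", "sd_cross_attn") are hypothetical names chosen for illustration, and the real parameter names in the repository may differ. The function works with any object that exposes named_parameters() the way a torch.nn.Module does.

```python
# Hypothetical name fragments for the parameter groups trained in each stage.
# Everything not matched (e.g. the RADIO backbone and the Latent-Intrinsics
# UNet) stays frozen in both stages.
STAGE_TRAINABLE = {
    1: ("vfm_adapter", "light_decoder"),   # stage 1: prior/adapter
    2: ("controlnet", "sd_cross_attn"),    # stage 2: ControlNet + cross-attn
}


def set_stage(model, stage):
    """Enable gradients only for parameters whose name contains a stage key.

    `model` is anything with a torch-style `named_parameters()` iterator.
    Returns the sorted list of parameter names left trainable.
    """
    keys = STAGE_TRAINABLE[stage]
    trainable = []
    for name, param in model.named_parameters():
        param.requires_grad = any(key in name for key in keys)
        if param.requires_grad:
            trainable.append(name)
    return sorted(trainable)
```

With a real model, the returned names can be used to build the optimizer's parameter list, so that frozen weights are never passed to it.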

MIIW test metrics (cross-scene, color-corrected)

RMSE    PSNR (dB)   SSIM    LPIPS
0.130   18.04       0.718   0.223
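
For readers who want to reproduce numbers of this kind, here is a rough sketch of RMSE and PSNR on images scaled to [0, 1]. The model card does not say how the color correction is done; the per-channel least-squares scale below is an assumption, and the function names are illustrative rather than taken from the release code. SSIM and LPIPS need dedicated libraries and are not covered.

```python
import numpy as np


def color_correct(pred, target):
    """Rescale each channel of `pred` to best match `target`.

    ASSUMPTION: a per-channel least-squares scale; the evaluation script
    behind the reported numbers may use a different correction.
    Both arrays have shape (H, W, C) with values in [0, 1].
    """
    p = pred.reshape(-1, pred.shape[-1])
    t = target.reshape(-1, target.shape[-1])
    scale = (p * t).sum(axis=0) / np.maximum((p * p).sum(axis=0), 1e-12)
    return np.clip(pred * scale, 0.0, 1.0)


def rmse(pred, target):
    """Root-mean-square error over all pixels and channels."""
    return float(np.sqrt(np.mean((pred - target) ** 2)))


def psnr(pred, target, peak=1.0):
    """Peak signal-to-noise ratio in dB for images with maximum value `peak`."""
    err = rmse(pred, target)
    return float("inf") if err == 0 else float(20.0 * np.log10(peak / err))
```

An RMSE of 0.130 on a single image corresponds to about 17.7 dB. The reported 18.04 dB is slightly higher, which is what one would expect if PSNR is averaged per image rather than computed from the mean RMSE.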

Usage

Run with the release code: inference.py --vfm radio_v2.5-h --vae-ckpt <modi-vae>. The Latent-Intrinsics checkpoint is required; the skip-connection "modi-vae" is optional but gives the best detail. See the project repository for details.

