NormalCrafter — Video Normal Map Estimation

Mirror of Yanrui95/NormalCrafter hosted by AEmotionStudio for use with ComfyUI-FFMPEGA.

Model Description

NormalCrafter generates temporally consistent surface normal maps from video using a Stable Video Diffusion (SVD) backbone fine-tuned for normal estimation. Unlike image-based methods (e.g., Marigold), NormalCrafter operates natively on video sequences, producing smooth frame-to-frame normals without flickering.
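The "temporally consistent" claim can be made concrete: a video normal estimator should produce small frame-to-frame angular drift on static content, whereas per-image methods tend to flicker. Below is a minimal, illustrative sketch (not part of the NormalCrafter repo) of how such consistency can be measured, along with the common mapping between unit normals and RGB normal-map pixels:

```python
import numpy as np

def normals_to_rgb(normals):
    """Map unit normals in [-1, 1] to uint8 RGB in [0, 255] (common normal-map convention)."""
    return ((normals * 0.5 + 0.5) * 255.0).round().astype(np.uint8)

def temporal_angular_error(frames):
    """Mean per-pixel angle (degrees) between normals of consecutive frames.

    frames: (T, H, W, 3) array of unit normal vectors.
    """
    a, b = frames[:-1], frames[1:]
    cos = np.clip(np.sum(a * b, axis=-1), -1.0, 1.0)
    return np.degrees(np.arccos(cos)).mean()

# A perfectly static sequence has zero frame-to-frame angular error;
# flicker from per-image estimation shows up as a nonzero mean angle.
static = np.tile(np.array([0.0, 0.0, 1.0]), (4, 8, 8, 1))
print(temporal_angular_error(static))  # 0.0
```

Lower values of this metric across a static scene indicate less flicker; the helper names here are hypothetical and chosen for clarity.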

Key Features

  • Video-native: Processes temporal sequences for coherent normals across frames
  • SVD backbone: Built on stabilityai/stable-video-diffusion-img2vid-xt
  • High resolution: Supports up to 1024px inference
  • Apache-2.0 Licensed: Free for commercial and personal use

Model Files

File                                      Size     Description
unet/diffusion_pytorch_model.safetensors  3.05 GB  Fine-tuned UNet for normal estimation
image_encoder/model.fp16.safetensors      1.26 GB  CLIP image encoder (fp16)
vae/diffusion_pytorch_model.safetensors   196 MB   VAE decoder


Usage in ComfyUI-FFMPEGA

NormalCrafter is available as:

  • Standalone skill: normalcrafter in the FFMPEGA agent
  • No-LLM mode: Select normalcrafter in the agent node dropdown
  • AI Relighting: Enable "Use NormalCrafter" in the Video Editor's Relight panel for physically-based relighting

Citation

@article{normalcrafter2024,
  title={NormalCrafter: Learning Temporally Consistent Normals from Video Diffusion Priors},
  author={Bin, Yanrui and Hu, Wenbo and Wang, Haoyuan and Chen, Xinya and Wang, Bing},
  year={2024}
}

License

The model is released under Apache-2.0, and this mirror retains that license. Both the original release and the mirror are permissive and allow commercial and personal use.
