---
license: apache-2.0
library_name: diffusers
pipeline_tag: image-to-video
tags:
- normal-estimation
- video
- diffusion
- svd
---

# NormalCrafter — Video Normal Map Estimation

Mirror of [Yanrui95/NormalCrafter](https://huggingface.co/Yanrui95/NormalCrafter) hosted by [AEmotionStudio](https://huggingface.co/AEmotionStudio) for use with [ComfyUI-FFMPEGA](https://github.com/AEmotionStudio/ComfyUI-FFMPEGA).
## Model Description

NormalCrafter generates **temporally consistent surface normal maps** from video using a Stable Video Diffusion (SVD) backbone fine-tuned for normal estimation. Unlike single-image methods (e.g., Marigold), NormalCrafter operates natively on video sequences, producing smooth frame-to-frame normals without flickering.
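Temporal consistency can be made concrete with a simple metric: the mean angular change between normals of consecutive frames, which flicker-free output keeps small. A minimal NumPy sketch for illustration only (`temporal_angular_change` is our own helper, not part of NormalCrafter):

```python
import numpy as np

def temporal_angular_change(frames: np.ndarray) -> float:
    """Mean inter-frame normal rotation in degrees.

    frames: (T, H, W, 3) array of unit normal vectors per pixel.
    Lower values mean less frame-to-frame flicker.
    """
    a, b = frames[:-1], frames[1:]
    # Dot product of corresponding unit normals, clipped for numerical safety.
    cos = (a * b).sum(axis=-1).clip(-1.0, 1.0)
    return float(np.degrees(np.arccos(cos)).mean())
```

For identical consecutive frames this returns 0.0; for normals rotating 90° between frames it returns 90.0.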
## Key Features

- **Video-native**: Processes temporal sequences for coherent normals across frames
- **SVD backbone**: Built on `stabilityai/stable-video-diffusion-img2vid-xt`
- **High resolution**: Supports inference up to 1024 px
- **Apache-2.0 licensed**: Free for commercial and personal use
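Normal maps from diffusion-based estimators are typically predicted with per-channel values in [-1, 1]; to view or save them, they are mapped to RGB. A small sketch of that convention (the helper name is ours, not part of the model's API):

```python
import numpy as np

def normals_to_rgb(normals: np.ndarray) -> np.ndarray:
    """Map unit normals in [-1, 1] per channel to uint8 RGB in [0, 255]."""
    rgb = (normals.clip(-1.0, 1.0) + 1.0) * 0.5 * 255.0
    return rgb.round().astype(np.uint8)
```

Under this convention a pixel facing the camera, normal (0, 0, 1), maps to the familiar lilac-blue (128, 128, 255) of normal maps.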
## Model Files

| File | Size | Description |
|------|------|-------------|
| `unet/diffusion_pytorch_model.safetensors` | 3.05 GB | Fine-tuned UNet for normal estimation |
| `image_encoder/model.fp16.safetensors` | 1.26 GB | CLIP image encoder (fp16) |
| `vae/diffusion_pytorch_model.safetensors` | 196 MB | VAE decoder |
## Usage in ComfyUI-FFMPEGA

NormalCrafter is available as:

- **Standalone skill**: `normalcrafter` in the FFMPEGA agent
- **No-LLM mode**: Select `normalcrafter` in the agent node dropdown
- **AI Relighting**: Enable "Use NormalCrafter" in the Video Editor's Relight panel for physically based relighting
## Citation

```bibtex
@article{normalcrafter2024,
  title={NormalCrafter: Learning Temporally Consistent Normals from Video Diffusion Priors},
  author={Yanrui Bin and Wenbo Hu and Haoyuan Wang and Xinya Chen and Bing Wang},
  year={2024}
}
```
## License

- **Model weights** (this repo): **Apache-2.0**, matching the upstream [Yanrui95/NormalCrafter](https://huggingface.co/Yanrui95/NormalCrafter) Hugging Face repo. See [LICENSE](LICENSE).
- **Source code**: **MIT**, as published at [Binyr/NormalCrafter](https://github.com/Binyr/NormalCrafter) on GitHub.

Both licenses are permissive and allow commercial use.
## Links

- **Paper & code**: [Binyr/NormalCrafter](https://github.com/Binyr/NormalCrafter)
- **Upstream weights**: [Yanrui95/NormalCrafter](https://huggingface.co/Yanrui95/NormalCrafter)
- **ComfyUI-FFMPEGA**: [AEmotionStudio/ComfyUI-FFMPEGA](https://github.com/AEmotionStudio/ComfyUI-FFMPEGA)