[ICML 2026] CameraNoise: Enabling Faithful Camera Control in Video Diffusion through Geometry-Flow-Guided Noise Warping

Haoyu Zhao, Jiaxi Gu, Haoran Chen, Qingping Zheng, Yeying Jin, Hongyi Yang, Junqi Cheng, Yuang Zhang, Zenghui Lu, Huan Yu, Jie Jiang, Peng Shu, Zuxuan Wu, Yu-Gang Jiang

Fudan University, Tencent.

CameraNoise-I2V Model

This repository hosts the CameraNoise-I2V model weights for image-to-video generation with faithful camera-motion control.

CameraNoise Pipeline

Given a reference image and a reference video, CameraNoise estimates the camera motion from the reference video, converts it into temporally coherent CameraNoise, and uses it to guide Wan2.1-I2V generation. This allows the generated video to follow the reference camera trajectory while preserving visual quality and temporal consistency.

Model Files

The model weights are organized by resolution:

CameraNoise-I2V/
  1024x576/
    cameranoise_i2v_wan2.1_1024x576_lora.safetensors
  i2v_demo_results/
    demo1
    demo2
    demo3
    ...
    demo10

Installation

Please use the official CameraNoise GitHub repository for inference:

git clone https://github.com/gulucaptain/CameraNoise
cd CameraNoise
pip install -r requirements.txt

The following checkpoints are required:

VGGT checkpoint
QwenVL checkpoint
Wan2.1-I2V-14B-720P checkpoint
CameraNoise-I2V LoRA checkpoint

Prepare Inputs

Place each demo in its own folder under outputs/, and put the reference image and reference video in that folder's inputs/ subfolder:

outputs/demo1/
  inputs/
    example.jpg       # reference image
    example.mp4       # reference video for camera motion

The script will automatically generate the image caption, camera motion conditions, CameraNoise, and final video.
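The folder layout above can be set up with a few lines of Python. This is a minimal sketch and not part of the repository: the helper name `prepare_demo` is hypothetical, and it assumes the script only needs the two files to be present in `inputs/`. The demonstration uses empty placeholder files in a temporary directory.

```python
import shutil
import tempfile
from pathlib import Path


def prepare_demo(demo_dir, image_path, video_path):
    """Copy a reference image and video into <demo_dir>/inputs/ (hypothetical helper)."""
    inputs_dir = Path(demo_dir) / "inputs"
    inputs_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in (Path(image_path), Path(video_path)):
        if not src.is_file():
            raise FileNotFoundError(f"missing input file: {src}")
        copied.append(Path(shutil.copy2(src, inputs_dir / src.name)))
    return copied


# Demonstration with empty placeholder files; replace with real media in practice.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "example.jpg").write_bytes(b"")
    (tmp / "example.mp4").write_bytes(b"")
    files = prepare_demo(tmp / "outputs" / "demo1",
                         tmp / "example.jpg", tmp / "example.mp4")
    print(sorted(f.name for f in files))
```

Keeping one folder per demo means the generated conditions and samples for each run stay next to the inputs that produced them.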

Inference

576 x 1024 Model (height x width)

python cameranoise_i2v.py \
  --demo-dir outputs/demo1 \
  --vggt-ckpt /path/to/VGGT-1B \
  --cameranoise-config cameranoise_warping/configs/default.yaml \
  --qwenvl-model-path /path/to/Qwen2-VL-7B-Instruct \
  --model-root /path/to/Wan2.1-I2V-14B-720P \
  --lora-path /path/to/CameraNoise-I2V/1024x576/cameranoise_i2v_wan2.1_1024x576_lora.safetensors \
  --height 576 \
  --width 1024 \
  --frames 49 \
  --cfg 3.5 \
  --device cuda \
  --output-type single
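When several demo folders need to be processed, the command above can be assembled programmatically. The sketch below is illustrative only: `build_command` is a hypothetical helper, the `/path/to/...` values are the same placeholders used above, and the LoRA file name is taken from the Model Files listing. It uses only the flags shown in the command above and prints the commands rather than running them.

```python
import shlex


def build_command(demo_dir, vggt_ckpt, qwenvl_model_path, model_root, lora_path,
                  height=576, width=1024, frames=49, cfg=3.5, device="cuda"):
    """Return the argv list for cameranoise_i2v.py, using only the flags shown above."""
    return [
        "python", "cameranoise_i2v.py",
        "--demo-dir", str(demo_dir),
        "--vggt-ckpt", str(vggt_ckpt),
        "--cameranoise-config", "cameranoise_warping/configs/default.yaml",
        "--qwenvl-model-path", str(qwenvl_model_path),
        "--model-root", str(model_root),
        "--lora-path", str(lora_path),
        "--height", str(height),
        "--width", str(width),
        "--frames", str(frames),
        "--cfg", str(cfg),
        "--device", device,
        "--output-type", "single",
    ]


# File name as listed under Model Files; the directory prefix is a placeholder.
LORA = "/path/to/CameraNoise-I2V/1024x576/cameranoise_i2v_wan2.1_1024x576_lora.safetensors"

demo_names = ["demo1", "demo2"]  # hypothetical demo folders under outputs/
commands = [
    build_command(f"outputs/{name}", "/path/to/VGGT-1B",
                  "/path/to/Qwen2-VL-7B-Instruct",
                  "/path/to/Wan2.1-I2V-14B-720P", LORA)
    for name in demo_names
]
for cmd in commands:
    # To execute instead of print: subprocess.run(cmd, check=True) from the repo root.
    print(shlex.join(cmd))
```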

CameraNoise Resolution

The spatial size of CameraNoise is automatically inferred from the output video resolution:

cameranoise_downscale_size = [height // 8, width // 8]

Recommended settings:

576 x 1024 -> [72, 128]
768 x 768  -> [96, 96]

You can also manually specify the CameraNoise size:

--cameranoise-downscale-size 72,128
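The sizing rule above is plain integer division by 8. The following sketch reproduces it for the two recommended resolutions and formats the result for the manual flag; the function names are assumptions for illustration and are not taken from the repository.

```python
def cameranoise_downscale_size(height, width, factor=8):
    """Apply the rule [height // 8, width // 8] stated above."""
    return [height // factor, width // factor]


def as_cli_value(size):
    """Format a [h, w] pair as the comma-separated value used by the manual flag."""
    return ",".join(str(v) for v in size)


print(cameranoise_downscale_size(576, 1024))                # [72, 128]
print(cameranoise_downscale_size(768, 768))                 # [96, 96]
print(as_cli_value(cameranoise_downscale_size(576, 1024)))  # 72,128
```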

Outputs

After inference, the generated files will be saved under the demo folder:

outputs/demo1/
  conditions/
    caption.txt
    noises/
      *_noises.npy
      *_visualization.mp4
  samples/
    demo1.mp4
    demo1_compare.mp4
  manifest.json
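A quick way to confirm that a run completed is to check the demo folder against the layout above. This is a hedged sketch: `missing_outputs` is a hypothetical helper, the patterns are copied from the tree above, and it only tests for file presence without reading the contents of the `.npy` or `manifest.json` files, whose formats are not documented here.

```python
import tempfile
from pathlib import Path

# Glob patterns derived from the output tree listed above.
EXPECTED_PATTERNS = [
    "conditions/caption.txt",
    "conditions/noises/*_noises.npy",
    "conditions/noises/*_visualization.mp4",
    "samples/*.mp4",
    "manifest.json",
]


def missing_outputs(demo_dir):
    """Return the patterns that match no file under demo_dir."""
    demo_dir = Path(demo_dir)
    return [p for p in EXPECTED_PATTERNS if not any(demo_dir.glob(p))]


# Demonstration on a partially populated placeholder folder.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo1"
    (demo / "conditions" / "noises").mkdir(parents=True)
    (demo / "conditions" / "caption.txt").write_text("a caption")
    (demo / "manifest.json").write_text("{}")
    result = missing_outputs(demo)
    print(result)
```

An empty list means every expected file pattern matched at least one file.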

Citation

@inproceedings{zhao2026cameranoise,
  title     = {CameraNoise: Enabling Faithful Camera Control in Video Diffusion through Geometry-Flow-Guided Noise Warping},
  author    = {Zhao, Haoyu and Gu, Jiaxi and Chen, Haoran and Zheng, Qingping and Jin, Yeying and Yang, Hongyi and Cheng, Junqi and Zhang, Yuang and Lu, Zenghui and Yu, Huan and Jiang, Jie and Shu, Peng and Wu, Zuxuan and Jiang, Yu-Gang},
  booktitle = {Proceedings of the Forty-third International Conference on Machine Learning},
  year      = {2026}
}

Disclaimer

This model is released for research purposes. Please refer to the GitHub repository for the complete codebase, detailed installation instructions, and inference scripts.