Improve model card and add paper link
#1
by nielsr (HF Staff) · opened

README.md CHANGED
````diff
@@ -1,10 +1,10 @@
 ---
 license: apache-2.0
+pipeline_tag: image-to-video
 tags:
 - video-editing
 - diffusion
 - wan
-pipeline_tag: video-to-video
 ---
 
 <div align="center">
@@ -13,7 +13,7 @@ pipeline_tag: video-to-video
 
 **CVPR 2026**
 
-[](https://github.com/WeChatCV/NovaEdit)
+[](https://arxiv.org/abs/2603.02802) [](https://github.com/WeChatCV/NovaEdit)
 
 </div>
 
@@ -27,11 +27,46 @@ NOVA is a pair-free video editing model built on **WAN 1.3B Fun InP**. It uses s
 - **Sparse keyframe control**: provide one or more edited keyframes
 - **Optional coarse mask** for improved editing accuracy
 
+The framework consists of a sparse branch providing semantic guidance through user-edited keyframes and a dense branch that incorporates motion and texture information from the original video to maintain high fidelity and coherence.
+
 ## Usage
 
-
+For full installation and training instructions, please visit the [GitHub repository](https://github.com/WeChatCV/NovaEdit).
+
+### Inference via CLI
+
+You can run inference using the `infer_nova.py` script. Below is an example for single GPU inference:
+
+```bash
+python infer_nova.py \
+    --dataset_path ./example_videos \
+    --metadata_file_name metadata.csv \
+    --ckpt_path /path/to/checkpoints/stepXXX.ckpt \
+    --output_path ./inference_results \
+    --text_encoder_path /path/to/models_t5_umt5-xxl-enc-bf16.pth \
+    --image_encoder_path /path/to/models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth \
+    --vae_path /path/to/Wan2.1_VAE.pth \
+    --dit_path /path/to/diffusion_pytorch_model.safetensors \
+    --num_samples 5 \
+    --num_inference_steps 50 \
+    --num_frames 81 \
+    --height 480 \
+    --width 832 \
+    --first_only
+```
+
+## Citation
+
+```bibtex
+@article{pan2026nova,
+  title={NOVA: Sparse Control, Dense Synthesis for Pair-Free Video Editing},
+  author={Tianlin Pan and Jiayi Dai and Chenpu Yuan and Zhengyao Lv and Binxin Yang and Hubery Yin and Chen Li and Jing Lyu and Caifeng Shan and Chenyang Si},
+  journal={arXiv preprint arXiv:2603.02802},
+  year={2026}
+}
+```
 
 ## Acknowledgements
 
 - [KlingTeam/ReCamMaster](https://github.com/KlingTeam/ReCamMaster)
-- [zibojia/MiniMax-Remover](https://github.com/zibojia/MiniMax-Remover)
+- [zibojia/MiniMax-Remover](https://github.com/zibojia/MiniMax-Remover)
````
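For scripted runs, the CLI invocation added in this diff can be assembled programmatically. The sketch below is purely illustrative: `build_infer_cmd` is a hypothetical helper, and every path is a placeholder taken from the card; only the flag names come from the documented example.

```python
import shlex

def build_infer_cmd(ckpt_path, output_path, num_frames=81, height=480, width=832):
    """Assemble the argv list for infer_nova.py using the flags shown in the card."""
    args = {
        "--dataset_path": "./example_videos",
        "--metadata_file_name": "metadata.csv",
        "--ckpt_path": ckpt_path,
        "--output_path": output_path,
        "--num_samples": "5",
        "--num_inference_steps": "50",
        "--num_frames": str(num_frames),
        "--height": str(height),
        "--width": str(width),
    }
    cmd = ["python", "infer_nova.py"]
    for flag, value in args.items():
        cmd += [flag, value]
    cmd.append("--first_only")  # boolean switch shown in the documented example
    return cmd

# Print a shell-safe version of the command for inspection.
cmd = build_infer_cmd("/path/to/checkpoints/stepXXX.ckpt", "./inference_results")
print(shlex.join(cmd))
```

This only builds the command line; actually running it requires the `infer_nova.py` script and checkpoints from the NovaEdit repository.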
|