Improve model card and add paper link
#1
by nielsr (HF Staff) · opened

README.md CHANGED
````diff
@@ -1,10 +1,10 @@
 ---
 license: apache-2.0
+pipeline_tag: image-to-video
 tags:
 - video-editing
 - diffusion
 - wan
-pipeline_tag: video-to-video
 ---
 
 <div align="center">
@@ -13,7 +13,7 @@ pipeline_tag: video-to-video
 
 **CVPR 2026**
 
-[](https://github.com/WeChatCV/NovaEdit)
+[](https://arxiv.org/abs/2603.02802) [](https://github.com/WeChatCV/NovaEdit)
 
 </div>
 
@@ -27,11 +27,46 @@ NOVA is a pair-free video editing model built on **WAN 1.3B Fun InP**. It uses s
 - **Sparse keyframe control**: provide one or more edited keyframes
 - **Optional coarse mask** for improved editing accuracy
 
+The framework consists of a sparse branch providing semantic guidance through user-edited keyframes and a dense branch that incorporates motion and texture information from the original video to maintain high fidelity and coherence.
+
 ## Usage
 
-
+For full installation and training instructions, please visit the [GitHub repository](https://github.com/WeChatCV/NovaEdit).
+
+### Inference via CLI
+
+You can run inference using the `infer_nova.py` script. Below is an example for single GPU inference:
+
+```bash
+python infer_nova.py \
+    --dataset_path ./example_videos \
+    --metadata_file_name metadata.csv \
+    --ckpt_path /path/to/checkpoints/stepXXX.ckpt \
+    --output_path ./inference_results \
+    --text_encoder_path /path/to/models_t5_umt5-xxl-enc-bf16.pth \
+    --image_encoder_path /path/to/models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth \
+    --vae_path /path/to/Wan2.1_VAE.pth \
+    --dit_path /path/to/diffusion_pytorch_model.safetensors \
+    --num_samples 5 \
+    --num_inference_steps 50 \
+    --num_frames 81 \
+    --height 480 \
+    --width 832 \
+    --first_only
+```
+
+## Citation
+
+```bibtex
+@article{pan2026nova,
+  title={NOVA: Sparse Control, Dense Synthesis for Pair-Free Video Editing},
+  author={Tianlin Pan and Jiayi Dai and Chenpu Yuan and Zhengyao Lv and Binxin Yang and Hubery Yin and Chen Li and Jing Lyu and Caifeng Shan and Chenyang Si},
+  journal={arXiv preprint arXiv:2603.02802},
+  year={2026}
+}
+```
 
 ## Acknowledgements
 
 - [KlingTeam/ReCamMaster](https://github.com/KlingTeam/ReCamMaster)
-- [zibojia/MiniMax-Remover](https://github.com/zibojia/MiniMax-Remover)
+- [zibojia/MiniMax-Remover](https://github.com/zibojia/MiniMax-Remover)
````
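For scripted runs, the CLI invocation added in this diff can be assembled programmatically. The sketch below is purely illustrative: `build_infer_cmd` is a hypothetical helper, and every path is a placeholder taken from the card; only the flag names come from the documented example.

```python
import shlex

def build_infer_cmd(ckpt_path, output_path, num_frames=81, height=480, width=832):
    """Assemble the argv list for infer_nova.py using the flags shown in the card."""
    args = {
        "--dataset_path": "./example_videos",
        "--metadata_file_name": "metadata.csv",
        "--ckpt_path": ckpt_path,
        "--output_path": output_path,
        "--num_samples": "5",
        "--num_inference_steps": "50",
        "--num_frames": str(num_frames),
        "--height": str(height),
        "--width": str(width),
    }
    cmd = ["python", "infer_nova.py"]
    for flag, value in args.items():
        cmd += [flag, value]
    cmd.append("--first_only")  # boolean switch shown in the documented example
    return cmd

# Print a shell-safe version of the command for inspection.
cmd = build_infer_cmd("/path/to/checkpoints/stepXXX.ckpt", "./inference_results")
print(shlex.join(cmd))
```

This only builds the command line; actually running it requires the `infer_nova.py` script and checkpoints from the NovaEdit repository.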
|