Improve model card and add metadata #1
opened by nielsr (HF Staff)

README.md CHANGED
@@ -1,9 +1,44 @@
 ---
 license: other
 license_name: stabilityai-community-license
-license_link: https://huggingface.co/stabilityai/stable-diffusion-
+license_link: https://huggingface.co/stabilityai/stable-video-diffusion-img2vid/blob/main/LICENSE.md
+library_name: diffusers
+pipeline_tag: image-to-image
 ---
 
+# VideoMaMa: Mask-Guided Video Matting via Generative Prior
+
+[**Sangbeom Lim**](https://sites.google.com/view/sangbeomlim/home) · [**Seoung Wug Oh**](https://sites.google.com/view/seoungwugoh) · [**Jiahui Huang**](https://gabriel-huang.github.io/) · [**Heeji Yoon**](https://yoon-heez.github.io/) · [**Seungryong Kim**](https://cvlab.kaist.ac.kr/members/faculty) · [**Joon-Young Lee**](https://joonyoung-cv.github.io)
+
+[[Paper](https://huggingface.co/papers/2601.14255)] [[Project Page](https://cvlab-kaist.github.io/VideoMaMa/)] [[GitHub](https://github.com/cvlab-kaist/VideoMaMa)] [[Gradio Demo](https://huggingface.co/spaces/SammyLim/VideoMaMa)]
+
+VideoMaMa (Video Mask-to-Matte Model) is a framework that converts coarse segmentation masks into pixel-accurate alpha mattes by leveraging pretrained video diffusion models. It demonstrates strong zero-shot generalization to real-world footage, even though it is trained solely on synthetic data.
+
+## Inference
+
+To run inference with VideoMaMa, use the script provided in the [official repository](https://github.com/cvlab-kaist/VideoMaMa):
+
+```bash
+python inference_onestep_folder.py \
+    --base_model_path "stabilityai/stable-video-diffusion-img2vid-xt" \
+    --unet_checkpoint_path "SammyLim/VideoMaMa" \
+    --image_root_path "/path/to/your/images" \
+    --mask_root_path "/path/to/your/masks" \
+    --output_dir "./output" \
+    --keep_aspect_ratio
+```
+
 ## License
 
-
+The VideoMaMa model checkpoints (specifically `unet/*` and `dino_projection_mlp.pth`) are subject to the **Stability AI Community License**. By using this model, you agree to the terms outlined in the [license agreement](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid/blob/main/LICENSE.md).
+
+## Citation
+
+```bibtex
+@article{lim2026videomama,
+  title={VideoMaMa: Mask-Guided Video Matting via Generative Prior},
+  author={Lim, Sangbeom and Oh, Seoung Wug and Huang, Jiahui and Yoon, Heeji and Kim, Seungryong and Lee, Joon-Young},
+  journal={arXiv preprint arXiv:2601.14255},
+  year={2026}
+}
+```
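
Since the updated metadata declares `library_name: diffusers`, a Python loading sketch may be useful alongside the CLI command in the diff above. The snippet below is an illustration under stated assumptions, not part of the model card: it presumes the `SammyLim/VideoMaMa` repository stores the fine-tuned UNet in a standard `unet/` subfolder (as the `unet/*` wording in the License section suggests), and it only shows how the weights could be pulled into the Stable Video Diffusion base pipeline; the mask-guided matting logic itself lives in the official inference script.

```python
import torch
from diffusers import StableVideoDiffusionPipeline, UNetSpatioTemporalConditionModel

# Assumption: the fine-tuned VideoMaMa UNet sits in the `unet/` subfolder of the
# checkpoint repository, matching the `unet/*` files named in the License section.
unet = UNetSpatioTemporalConditionModel.from_pretrained(
    "SammyLim/VideoMaMa",
    subfolder="unet",
    torch_dtype=torch.float16,
)

# Attach the UNet to the Stable Video Diffusion base pipeline that the
# official inference script also builds on; this loads weights only and
# does not implement the mask conditioning performed by the script.
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    unet=unet,
    torch_dtype=torch.float16,
)
pipe.to("cuda")
```

For actual matting results, the `inference_onestep_folder.py` script shown above remains the supported entry point, since it also handles the mask inputs and the DINO projection checkpoint.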