Improve model card and add metadata

#1
by nielsr - opened
Files changed (1)
  1. README.md +37 -2
README.md CHANGED
@@ -1,9 +1,44 @@
  ---
  license: other
  license_name: stabilityai-community-license
- license_link: https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/LICENSE.md
+ license_link: https://huggingface.co/stabilityai/stable-video-diffusion-img2vid/blob/main/LICENSE.md
+ library_name: diffusers
+ pipeline_tag: image-to-image
  ---

+ # VideoMaMa: Mask-Guided Video Matting via Generative Prior
+
+ [**Sangbeom Lim**](https://sites.google.com/view/sangbeomlim/home) · [**Seoung Wug Oh**](https://sites.google.com/view/seoungwugoh) · [**Jiahui Huang**](https://gabriel-huang.github.io/) · [**Heeji Yoon**](https://yoon-heez.github.io/) · [**Seungryong Kim**](https://cvlab.kaist.ac.kr/members/faculty) · [**Joon-Young Lee**](https://joonyoung-cv.github.io)
+
+ [[Paper](https://huggingface.co/papers/2601.14255)] [[Project Page](https://cvlab-kaist.github.io/VideoMaMa/)] [[GitHub](https://github.com/cvlab-kaist/VideoMaMa)] [[Gradio Demo](https://huggingface.co/spaces/SammyLim/VideoMaMa)]
+
+ VideoMaMa (Video Mask-to-Matte Model) is a framework that converts coarse segmentation masks into pixel-accurate alpha mattes by leveraging pretrained video diffusion models. It demonstrates strong zero-shot generalization to real-world footage, even though it is trained solely on synthetic data.
+
+ ## Inference
+
+ To run inference with VideoMaMa, use the script provided in the [official repository](https://github.com/cvlab-kaist/VideoMaMa):
+
+ ```bash
+ python inference_onestep_folder.py \
+     --base_model_path "stabilityai/stable-video-diffusion-img2vid-xt" \
+     --unet_checkpoint_path "SammyLim/VideoMaMa" \
+     --image_root_path "/path/to/your/images" \
+     --mask_root_path "/path/to/your/masks" \
+     --output_dir "./output" \
+     --keep_aspect_ratio
+ ```
+
  ## License

- This model is licensed under the **Stability AI Community License**. By using this model, you agree to the terms outlined in the [license agreement](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/LICENSE.md).
+ The VideoMaMa model checkpoints (specifically `unet/*` and `dino_projection_mlp.pth`) are subject to the **Stability AI Community License**. By using this model, you agree to the terms outlined in the [license agreement](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid/blob/main/LICENSE.md).
+
+ ## Citation
+
+ ```bibtex
+ @article{lim2026videomama,
+   title={VideoMaMa: Mask-Guided Video Matting via Generative Prior},
+   author={Lim, Sangbeom and Oh, Seoung Wug and Huang, Jiahui and Yoon, Heeji and Kim, Seungryong and Lee, Joon-Young},
+   journal={arXiv preprint arXiv:2601.14255},
+   year={2026}
+ }
+ ```