metadata
base_model:
  - Wan-AI/Wan2.1-T2V-14B
license: agpl-3.0
pipeline_tag: image-to-video

MoCha: End-to-End Video Character Replacement without Structural Guidance

Paper | Project Page | GitHub

MoCha is a framework for controllable video character replacement. It replaces a character in a video with a provided identity, using only a mask on a single, arbitrary frame.

Unlike prior reconstruction-based methods, MoCha requires neither per-frame segmentation masks nor explicit structural guidance such as skeletons or depth maps. This makes it more robust in complex scenarios involving occlusions, unusual poses, or challenging illumination.

Key Features

  • End-to-End Replacement: Bypasses the need for per-frame masks and structural guidance.
  • Identity Preservation: Uses a condition-aware RoPE to adapt to multi-modal input conditions, and RL-based post-training to enhance facial identity.
  • Robustness: Handles character-object interactions and complex scenarios better than previous state-of-the-art methods.
  • Data Construction: Trained on specialized high-fidelity datasets including UE5-rendered videos and expression-driven portrait animations.

Usage

To use MoCha, please refer to the official GitHub repository for environment setup and inference scripts.

The basic inference workflow requires:

  1. Source Video: The original video with the character to be replaced.
  2. Designation Mask: A mask for the first frame marking the character.
  3. Reference Images: Images of the new character identity.

With these inputs listed in a CSV file, run inference from the repository root:

python inference_mocha.py --data_path path/to/your/data.csv
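
The CSV passed to --data_path has to tie the three inputs above together for each job. This card does not document the file's schema, so the following is only a minimal sketch of how such a manifest could be assembled. The column names (source_video, designation_mask, reference_images), the ';' separator for multiple reference images, the example file paths, and the helper functions are all hypothetical; check the official GitHub repository for the format that inference_mocha.py actually expects.

```python
import csv
import sys

# Hypothetical column names: the real schema expected by inference_mocha.py
# is defined in the official repository and may differ.
FIELDS = ["source_video", "designation_mask", "reference_images"]


def make_row(source_video, designation_mask, reference_images):
    """Bundle the three inputs of one replacement job into a CSV row."""
    if not reference_images:
        raise ValueError("at least one reference image is required")
    return {
        "source_video": source_video,
        "designation_mask": designation_mask,
        # Assumption: multiple reference images share one cell, joined by ';'.
        "reference_images": ";".join(reference_images),
    }


def write_manifest(rows, stream):
    """Write a header line followed by one line per job to a text stream."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    # Example paths are placeholders, not files shipped with MoCha.
    jobs = [
        make_row(
            "clips/scene01.mp4",
            "masks/scene01_mask.png",
            ["refs/hero_front.png", "refs/hero_side.png"],
        )
    ]
    write_manifest(jobs, sys.stdout)
```

Redirecting the output to a file (for example data.csv) gives a manifest with one row per video, which keeps batch runs reproducible and easy to review before launching inference.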

Citation

If you find MoCha helpful for your research, please cite:

@article{orange2025mocha,
  title={MoCha: End-to-End Video Character Replacement without Structural Guidance},
  author={Zhengbo Xu and Jie Ma and Ziheng Wang and Zhan Peng and Jun Liang and Jing Li},
  journal={arXiv preprint arXiv:2601.08587},
  year={2026},
  url={https://github.com/Orange-3DV-Team/MoCha}
}