# Multi View
Human-centric video generation conditioned on multi-view face images.
## ✔️ TODO List
- [x] Base code for single-shot video generation
- [x] Splitting RoPE for video and reference images
- [ ] Face selecting router
- [ ] Supporting multi-shot video generation
- [ ] Four-level shot RoPE
- [ ] Inter-shot self-attention and frame-pack based intra-shot attention
## 🚀 Training
```bash
bash train.sh
```
## 🚀 Inference
```bash
bash test.sh
```
## ⚙️ Configuration
```yaml
train_args:
  max_checkpoints_to_keep: 3
  resume_from_checkpoint: True
  seed: 42
  save_steps: 150
  save_epoches: 1
  batch_size: 8
  visual_log_project_name: Wan2.2_5B-Multi_view-normal_rope_384_640-3ref
  output_path: /root/paddlejob/workspace/qizipeng/baidu/personal-code/Multi-view/multi_view/ckpts
  local_model_path: /root/paddlejob/workspace/qizipeng/wanx_pretrainedmodels
  zero_face_ratio: 0.1
  split_rope: False
  split1: False
  split2: False
  split3: False
infer_args:
  infer_step: 1350
  epoch_id: 17
dataset_args:
  base_path: /root/paddlejob/workspace/qizipeng/baidu/personal-code/Multi-view/multi_view/datasets/merged_wangpan_artgrid_taobao_visionchina_123rf_nasuyun_xinpianchang_disk.json
  height: 384
  width: 640
  num_frames: 81
  ref_num: 3
```
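Before launching a long run, it can help to sanity-check the values above. The sketch below is a hypothetical helper, not part of this repository. It assumes the config has already been parsed into a nested dict with `train_args`, `infer_args`, and `dataset_args` as top-level keys. Two of its checks are assumptions rather than documented behavior: that `num_frames` should have the form 4k + 1, and that `infer_step` refers to a checkpoint written every `save_steps` steps.

```python
# Hypothetical helper for illustration; not part of this repository.
def check_config(cfg):
    """Return a list of human-readable problems found in a parsed config dict."""
    problems = []
    train = cfg.get("train_args", {})
    infer = cfg.get("infer_args", {})
    data = cfg.get("dataset_args", {})

    # A ratio must be a valid probability.
    ratio = train.get("zero_face_ratio", 0.0)
    if not 0.0 <= ratio <= 1.0:
        problems.append(f"zero_face_ratio={ratio} is outside [0, 1]")

    # Assumption: the video model expects a frame count of the form 4k + 1.
    frames = data.get("num_frames")
    if frames is not None and frames % 4 != 1:
        problems.append(f"num_frames={frames} is not of the form 4k + 1")

    # Assumption: infer_step names a checkpoint saved every save_steps steps.
    step = infer.get("infer_step")
    every = train.get("save_steps")
    if step is not None and every and step % every != 0:
        problems.append(f"infer_step={step} is not a multiple of save_steps={every}")

    # Spatial size and reference count must be positive integers.
    for key in ("height", "width", "ref_num"):
        value = data.get(key)
        if value is not None and value <= 0:
            problems.append(f"{key}={value} must be positive")

    return problems


# Values copied from the configuration shown in this README.
example = {
    "train_args": {"save_steps": 150, "zero_face_ratio": 0.1},
    "infer_args": {"infer_step": 1350, "epoch_id": 17},
    "dataset_args": {"height": 384, "width": 640, "num_frames": 81, "ref_num": 3},
}

for problem in check_config(example):
    print(problem)
```

With the values from this README the helper reports no problems. If either assumption does not hold for your setup, drop the corresponding check.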
---