Commit c7c6869 · Parent(s): b0f868c · "new"
README.md CHANGED

@@ -43,7 +43,7 @@ https://github.com/user-attachments/assets/dc54bc11-48cc-4814-9879-bf2699ee9d1d
 * **[2025/1/23]** Our paper is accepted to [ICLR2025](https://openreview.net/forum?id=SSslAtcPB6)! Welcome to **watch** this repository for the latest updates.
 
 
-##
+## 💻 Setup Environment
 Our method is tested with CUDA 12.1, fp16 via accelerate, and xformers on a single L40.
 
 ```bash

@@ -68,23 +68,12 @@ You may download all the base model checkpoints using the following bash command
 bash download_all.sh
 ```
 
-
-
-mkdir annotator/ckpts
-```
-Method 1: Download dwpose models
-
-(Note: if you are able to access huggingface, other models like depth_zoe etc. can be downloaded automatically)
-
-Download dwpose model dw-ll_ucoco_384.onnx ([baidu](https://pan.baidu.com/s/1nuBjw-KKSxD_BkpmwXUJiw?pwd=28d7), [google](https://drive.google.com/file/d/12L8E2oAgZy4VACGSK9RaZBZrfgx7VTA2/view?usp=sharing)) and Det model yolox_l.onnx ([baidu](https://pan.baidu.com/s/1fpfIVpv5ypo4c1bUlzkMYQ?pwd=mjdn), [google](https://drive.google.com/file/d/1w9pXC8tT0p9ndMN-CArp1__b2GbzewWI/view?usp=sharing)),
-then put them into ./annotator/ckpts.
-
-Method 2: Download all annotator checkpoints from google or baiduyun (if you cannot access huggingface)
-
-If you cannot access HuggingFace, you can download all the annotator checkpoints (such as DW-Pose, depth_zoe, depth_midas, and OpenPose; around 4 GB in total) from [baidu](https://pan.baidu.com/s/1sgBFLFkdTCDTn4oqHjGb9A?pwd=pdm5) or [google](https://drive.google.com/file/d/1qOsmWshnFMMr8x1HteaTViTSQLh_4rle/view?usp=drive_link)
+<details><summary>Click for ControlNet annotator weights (if you cannot access HuggingFace)</summary>
+
+You can download all the annotator checkpoints (such as DW-Pose, depth_zoe, depth_midas, and OpenPose; around 4 GB in total) from [baidu](https://pan.baidu.com/s/1sgBFLFkdTCDTn4oqHjGb9A?pwd=pdm5) or [google](https://drive.google.com/file/d/1qOsmWshnFMMr8x1HteaTViTSQLh_4rle/view?usp=drive_link).
 Then extract them into ./annotator/ckpts
 
+</details>
 
 ## Prepare all the data

@@ -95,11 +84,12 @@ tar -zxvf videograin_data.tar.gz
 
 ## 🔥 VideoGrain Editing
 
-
+### Inference
+VideoGrain is a training-free framework. To run the inference script, use the following command:
 
 ```bash
 bash test.sh
-
+or accelerate launch test.py --config config/part_level/adding_new_object/run_two_man/running_spider_polar_sunglass.yaml
 ```
 
 <details><summary>The result is saved at `./result`. (Click for directory structure)</summary>

@@ -107,12 +97,16 @@ bash test.sh
 ```
 result
 ├── run_two_man
+│   ├── control                  # control condition
 │   ├── infer_samples
+│   ├── input                    # the input video frames
+│   ├── masked_video.mp4         # check whether edit regions are accurately covered
 │   ├── sample
-│   ├── step_0
-│   ├── step_0.mp4
-│   ├── source_video.mp4
+│   ├── step_0                   # result image folder
+│   ├── step_0.mp4               # result video
+│   ├── source_video.mp4         # the input video
+│   ├── visualization_denoise    # cross attention weight
+│   └── sd_study                 # cluster inversion feature
 ```
 
 </details>
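The README changes above move the annotator weights into `./annotator/ckpts`. Before launching `bash test.sh`, it can be handy to confirm those files are actually in place; a minimal sketch (file and directory names taken from the README, the checker function itself is a hypothetical convenience, not part of the repository):

```python
import os

# Checkpoint file names the README says must end up in ./annotator/ckpts.
REQUIRED_CKPTS = ["dw-ll_ucoco_384.onnx", "yolox_l.onnx"]

def missing_checkpoints(ckpt_dir="annotator/ckpts"):
    """Return the required annotator checkpoints not yet present in ckpt_dir."""
    return [name for name in REQUIRED_CKPTS
            if not os.path.exists(os.path.join(ckpt_dir, name))]
```

Once both ONNX files are extracted into `./annotator/ckpts`, `missing_checkpoints()` returns an empty list.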
annotator/dwpose/__pycache__/wholebody.cpython-310.pyc CHANGED

Binary files a/annotator/dwpose/__pycache__/wholebody.cpython-310.pyc and b/annotator/dwpose/__pycache__/wholebody.cpython-310.pyc differ
annotator/dwpose/wholebody.py CHANGED

@@ -1,15 +1,32 @@
 import cv2
 import numpy as np
-
+import os
+os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"
 import onnxruntime as ort
 from .onnxdet import inference_detector
 from .onnxpose import inference_pose
+from annotator.util import annotator_ckpts_path
+
 
 class Wholebody:
     def __init__(self):
         device = 'cuda:0'
         providers = ['CPUExecutionProvider'
                      ] if device == 'cpu' else ['CUDAExecutionProvider']
+
+        remote_dw_pose_path = "https://huggingface.co/sxela/dwpose_ckpts/resolve/main/dw-ll_ucoco_384.onnx"
+        remote_yolox_path = "https://huggingface.co/sxela/dwpose_ckpts/resolve/main/yolox_l.onnx"
+
+        dw_pose_path = os.path.join(annotator_ckpts_path, "dw-ll_ucoco_384.onnx")
+        yolox_path = os.path.join(annotator_ckpts_path, "yolox_l.onnx")
+
+        if not os.path.exists(dw_pose_path):
+            from basicsr.utils.download_util import load_file_from_url
+            load_file_from_url(remote_dw_pose_path, model_dir=annotator_ckpts_path)
+        if not os.path.exists(yolox_path):
+            from basicsr.utils.download_util import load_file_from_url
+            load_file_from_url(remote_yolox_path, model_dir=annotator_ckpts_path)
+
         onnx_det = 'annotator/ckpts/yolox_l.onnx'
         onnx_pose = 'annotator/ckpts/dw-ll_ucoco_384.onnx'
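The `wholebody.py` change follows a download-if-missing pattern: fetch each ONNX checkpoint only when it is absent from the local checkpoint directory, here via basicsr's `load_file_from_url`. A dependency-free sketch of the same idea (the checkpoint URLs are copied from the diff; the `fetch_if_missing` helper is mine, not the repository's):

```python
import os
import urllib.request

def fetch_if_missing(url, model_dir):
    """Download `url` into `model_dir` unless the file already exists.

    Mirrors what the diff does with basicsr's load_file_from_url, but uses
    only the standard library so the sketch carries no extra dependency.
    """
    os.makedirs(model_dir, exist_ok=True)
    dest = os.path.join(model_dir, os.path.basename(url))
    if not os.path.exists(dest):
        urllib.request.urlretrieve(url, dest)  # skipped on later runs
    return dest

# Checkpoint URLs from the wholebody.py diff:
# fetch_if_missing("https://huggingface.co/sxela/dwpose_ckpts/resolve/main/dw-ll_ucoco_384.onnx", "annotator/ckpts")
# fetch_if_missing("https://huggingface.co/sxela/dwpose_ckpts/resolve/main/yolox_l.onnx", "annotator/ckpts")
```

Because the existence check happens before the network call, a warm cache makes `__init__` cheap on every run after the first.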
config/instance_level/running_two_man/running_3cls_polar_spider_vis_weight.yaml CHANGED

@@ -1,5 +1,5 @@
 pretrained_model_path: "./ckpt/stable-diffusion-v1-5"
-logdir: ./result/run_two_man/instance_level/
+logdir: ./result/run_two_man/instance_level/3cls_spider_polar_vis_cross_attn
 
 dataset_config:
   path: "data/run_two_man/run_two_man_fr2"
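The `logdir` change gives this run its own subdirectory (`3cls_spider_polar_vis_cross_attn`) instead of the shared `instance_level/` folder, so results from different configs no longer overwrite each other. As an illustration only, a hypothetical helper that derives such a per-run subdirectory from the config filename (the repository instead sets `logdir` explicitly in each YAML):

```python
import os

def logdir_for(config_path, base="./result/run_two_man/instance_level"):
    """Derive a per-run logdir from the config filename, so runs driven by
    different configs never collide. (Hypothetical helper for illustration.)"""
    run_name = os.path.splitext(os.path.basename(config_path))[0]
    return os.path.join(base, run_name)
```

For example, `logdir_for("config/instance_level/running_two_man/running_3cls_polar_spider_vis_weight.yaml")` yields a path ending in `running_3cls_polar_spider_vis_weight`.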
requirements.txt CHANGED

@@ -65,4 +65,5 @@ scikit-learn==1.2.2
 nltk==3.8.1
 timm==0.6.7
 scikit-image==0.24.0
-gdown==5.1.0
+gdown==5.1.0
+basicsr-fixed
video_diffusion/common/__pycache__/image_util.cpython-310.pyc CHANGED

Binary files a/video_diffusion/common/__pycache__/image_util.cpython-310.pyc and b/video_diffusion/common/__pycache__/image_util.cpython-310.pyc differ