studyOverflow committed on
Commit 7ddcafe · verified · 1 Parent(s): 73048bf

Add files using upload-large-folder tool
wandb/run-20260124_003511-fnfy86iu/files/config.yaml ADDED
@@ -0,0 +1,88 @@
+ _wandb:
+   value:
+     cli_version: 0.18.5
+     m: []
+     python_version: 3.10.19
+     t:
+       "1":
+         - 1
+         - 11
+         - 41
+         - 49
+         - 55
+         - 63
+         - 71
+         - 83
+         - 98
+       "2":
+         - 1
+         - 11
+         - 41
+         - 49
+         - 55
+         - 63
+         - 71
+         - 83
+         - 98
+       "3":
+         - 13
+         - 23
+         - 55
+       "4": 3.10.19
+       "5": 0.18.5
+       "6": 4.46.1
+       "8":
+         - 5
+       "12": 0.18.5
+       "13": linux-x86_64
+ allow_tf32:
+   value: true
+ logdir:
+   value: logs
+ mixed_precision:
+   value: bf16
+ num_checkpoint_limit:
+   value: 5
+ num_epochs:
+   value: 300
+ pretrained:
+   value:
+     model: ./data/StableDiffusion
+     revision: main
+ prompt_fn:
+   value: imagenet_animals
+ resume_from:
+   value: ""
+ reward_fn:
+   value: hpsv2
+ run_name:
+   value: 2026.01.24_00.34.56
+ sample:
+   value:
+     batch_size: 1
+     eta: 1
+     guidance_scale: 5
+     num_batches_per_epoch: 2
+     num_steps: 50
+ save_freq:
+   value: 20
+ seed:
+   value: 42
+ train:
+   value:
+     adam_beta1: 0.9
+     adam_beta2: 0.999
+     adam_epsilon: 1e-08
+     adam_weight_decay: 0.0001
+     adv_clip_max: 5
+     batch_size: 1
+     cfg: true
+     clip_range: 0.0001
+     gradient_accumulation_steps: 1
+     learning_rate: 1e-05
+     max_grad_norm: 1
+     num_inner_epochs: 1
+     timestep_fraction: 1
+     use_8bit_adam: false
+ use_lora:
+   value: false
wandb/run-20260124_003511-fnfy86iu/files/output.log ADDED
@@ -0,0 +1,84 @@
+ I0124 00:35:12.769941 130053895333696 train_g2rpo_sd_merge.py:510]
+ allow_tf32: true
+ logdir: logs
+ mixed_precision: bf16
+ num_checkpoint_limit: 5
+ num_epochs: 300
+ pretrained:
+   model: ./data/StableDiffusion
+   revision: main
+ prompt_fn: imagenet_animals
+ prompt_fn_kwargs: {}
+ resume_from: ''
+ reward_fn: hpsv2
+ run_name: 2026.01.24_00.34.56
+ sample:
+   batch_size: 1
+   eta: 1.0
+   guidance_scale: 5.0
+   num_batches_per_epoch: 2
+   num_steps: 50
+ save_freq: 20
+ seed: 42
+ train:
+   adam_beta1: 0.9
+   adam_beta2: 0.999
+   adam_epsilon: 1.0e-08
+   adam_weight_decay: 0.0001
+   adv_clip_max: 5
+   batch_size: 1
+   cfg: true
+   clip_range: 0.0001
+   gradient_accumulation_steps: 1
+   learning_rate: 1.0e-05
+   max_grad_norm: 1.0
+   num_inner_epochs: 1
+   timestep_fraction: 1.0
+   use_8bit_adam: false
+ use_lora: false
+
+ Loading pipeline components...: 100%|████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 9.10it/s]
+ Traceback (most recent call last):
+   File "/data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/fastvideo/train_g2rpo_sd_merge.py", line 920, in <module>
+     app.run(main)
+   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/absl/app.py", line 316, in run
+     _run_main(main, args)
+   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/absl/app.py", line 261, in _run_main
+     sys.exit(main(argv))
+   File "/data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/fastvideo/train_g2rpo_sd_merge.py", line 597, in main
+     unet, optimizer = accelerator.prepare(unet, optimizer)
+   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/accelerate/accelerator.py", line 1350, in prepare
+     result = tuple(
+   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/accelerate/accelerator.py", line 1351, in <genexpr>
+     self._prepare_one(obj, first_pass=True, device_placement=d) for obj, d in zip(args, device_placement)
+   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/accelerate/accelerator.py", line 1226, in _prepare_one
+     return self.prepare_model(obj, device_placement=device_placement)
+   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/accelerate/accelerator.py", line 1477, in prepare_model
+     model = torch.nn.parallel.DistributedDataParallel(
+   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 858, in __init__
+     _verify_param_shape_across_processes(self.process_group, parameters)
+   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/torch/distributed/utils.py", line 281, in _verify_param_shape_across_processes
+     return dist._verify_params_across_processes(process_group, tensors, logger)
+ RuntimeError: DDP expects same model across all ranks, but Rank 0 has 686 params, while rank 1 has inconsistent 0 params.
+ [rank0]: Traceback (most recent call last):
+ [rank0]:   File "/data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/fastvideo/train_g2rpo_sd_merge.py", line 920, in <module>
+ [rank0]:     app.run(main)
+ [rank0]:   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/absl/app.py", line 316, in run
+ [rank0]:     _run_main(main, args)
+ [rank0]:   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/absl/app.py", line 261, in _run_main
+ [rank0]:     sys.exit(main(argv))
+ [rank0]:   File "/data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/fastvideo/train_g2rpo_sd_merge.py", line 597, in main
+ [rank0]:     unet, optimizer = accelerator.prepare(unet, optimizer)
+ [rank0]:   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/accelerate/accelerator.py", line 1350, in prepare
+ [rank0]:     result = tuple(
+ [rank0]:   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/accelerate/accelerator.py", line 1351, in <genexpr>
+ [rank0]:     self._prepare_one(obj, first_pass=True, device_placement=d) for obj, d in zip(args, device_placement)
+ [rank0]:   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/accelerate/accelerator.py", line 1226, in _prepare_one
+ [rank0]:     return self.prepare_model(obj, device_placement=device_placement)
+ [rank0]:   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/accelerate/accelerator.py", line 1477, in prepare_model
+ [rank0]:     model = torch.nn.parallel.DistributedDataParallel(
+ [rank0]:   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 858, in __init__
+ [rank0]:     _verify_param_shape_across_processes(self.process_group, parameters)
+ [rank0]:   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/torch/distributed/utils.py", line 281, in _verify_param_shape_across_processes
+ [rank0]:     return dist._verify_params_across_processes(process_group, tensors, logger)
+ [rank0]: RuntimeError: DDP expects same model across all ranks, but Rank 0 has 686 params, while rank 1 has inconsistent 0 params.
wandb/run-20260124_003511-fnfy86iu/files/requirements.txt ADDED
@@ -0,0 +1,189 @@
+ scipy==1.13.0
+ regex==2024.9.11
+ sentencepiece==0.2.0
+ six==1.16.0
+ anyio==4.11.0
+ nvidia-cuda-nvrtc-cu12==12.6.77
+ scikit-video==1.1.11
+ platformdirs==4.5.0
+ mypy==1.11.1
+ ruff==0.6.5
+ charset-normalizer==3.4.4
+ torch==2.9.0+cu126
+ av==13.1.0
+ pillow==10.2.0
+ gpustat==1.1.1
+ torchvision==0.24.0+cu126
+ multidict==6.7.0
+ torchmetrics==1.5.1
+ aiohttp==3.13.1
+ transformers==4.46.1
+ decord==0.6.0
+ wcwidth==0.2.14
+ sphinx-lint==1.0.0
+ nvidia-cuda-runtime-cu12==12.6.77
+ pytz==2025.2
+ codespell==2.3.0
+ hpsv2==1.2.0
+ mypy_extensions==1.1.0
+ numpy==1.26.3
+ omegaconf==2.3.0
+ Markdown==3.9
+ tzdata==2025.2
+ pandas==2.2.3
+ pytorch-lightning==2.4.0
+ aiosignal==1.4.0
+ aiohappyeyeballs==2.6.1
+ python-dateutil==2.9.0.post0
+ seaborn==0.13.2
+ beautifulsoup4==4.12.3
+ isort==5.13.2
+ httpx==0.28.1
+ certifi==2025.10.5
+ ml_collections==1.1.0
+ nvidia-cudnn-cu12==9.10.2.21
+ hf-xet==1.2.0
+ requests==2.31.0
+ inflect==6.0.4
+ iniconfig==2.1.0
+ braceexpand==0.1.7
+ h5py==3.12.1
+ wandb==0.18.5
+ protobuf==3.20.3
+ ninja==1.13.0
+ kiwisolver==1.4.9
+ networkx==3.3
+ packaging==25.0
+ fvcore==0.1.5.post20221221
+ pyparsing==3.2.5
+ starlette==0.41.3
+ frozenlist==1.8.0
+ docker-pycreds==0.4.0
+ Werkzeug==3.1.3
+ MarkupSafe==2.1.5
+ einops==0.8.0
+ sentry-sdk==2.42.0
+ PyYAML==6.0.1
+ nvidia-nccl-cu12==2.27.5
+ datasets==4.3.0
+ polib==1.2.0
+ safetensors==0.6.2
+ async-timeout==5.0.1
+ setproctitle==1.3.7
+ clint==0.5.1
+ matplotlib==3.9.2
+ propcache==0.4.1
+ termcolor==3.1.0
+ antlr4-python3-runtime==4.9.3
+ cycler==0.12.1
+ fastvideo==1.2.0
+ toml==0.10.2
+ xxhash==3.6.0
+ wheel==0.44.0
+ albumentations==1.4.20
+ fastapi==0.115.3
+ nvidia-cufft-cu12==11.3.0.4
+ yarl==1.22.0
+ psutil==7.1.0
+ tensorboard-data-server==0.7.2
+ pydantic==2.9.2
+ nvidia-nvtx-cu12==12.6.77
+ portalocker==3.2.0
+ triton==3.5.0
+ annotated-types==0.7.0
+ proglog==0.1.12
+ nvidia-cusparselt-cu12==0.7.1
+ yapf==0.32.0
+ Jinja2==3.1.6
+ types-requests==2.32.4.20250913
+ lightning-utilities==0.15.2
+ grpcio==1.75.1
+ uvicorn==0.32.0
+ typing_extensions==4.15.0
+ nvidia-nvjitlink-cu12==12.6.85
+ watch==0.2.7
+ moviepy==1.0.3
+ timm==1.0.11
+ pytest-split==0.8.0
+ gdown==5.2.0
+ types-setuptools==80.9.0.20250822
+ nvidia-cusolver-cu12==11.7.1.2
+ types-PyYAML==6.0.12.20250915
+ pip==25.2
+ qwen-vl-utils==0.0.14
+ soupsieve==2.8
+ zipp==3.23.0
+ flash_attn==2.8.3
+ yacs==0.1.8
+ diffusers==0.32.0
+ pluggy==1.6.0
+ opencv-python-headless==4.11.0.86
+ mpmath==1.3.0
+ test_tube==0.7.5
+ stringzilla==4.2.1
+ fonttools==4.60.1
+ nvidia-ml-py==13.580.82
+ parameterized==0.9.0
+ loguru==0.7.3
+ tabulate==0.9.0
+ idna==3.6
+ iopath==0.1.10
+ decorator==4.4.2
+ nvidia-cufile-cu12==1.11.1.6
+ threadpoolctl==3.6.0
+ pyarrow==21.0.0
+ httpcore==1.0.9
+ hydra-core==1.3.2
+ multiprocess==0.70.16
+ contourpy==1.3.2
+ clip==1.0
+ tqdm==4.66.5
+ open_clip_torch==3.2.0
+ accelerate==1.0.1
+ gitdb==4.0.12
+ importlib_metadata==8.7.0
+ nvidia-cublas-cu12==12.6.4.1
+ h11==0.16.0
+ filelock==3.19.1
+ liger_kernel==0.4.1
+ click==8.3.0
+ urllib3==2.2.0
+ imageio-ffmpeg==0.5.1
+ setuptools==80.9.0
+ joblib==1.5.2
+ tensorboard==2.20.0
+ attrs==25.4.0
+ future==1.0.0
+ albucore==0.0.19
+ fsspec==2025.9.0
+ sympy==1.14.0
+ eval_type_backport==0.2.2
+ pydantic_core==2.23.4
+ sniffio==1.3.1
+ nvidia-nvshmem-cu12==3.3.20
+ exceptiongroup==1.3.0
+ smmap==5.0.2
+ tomli==2.0.2
+ ftfy==6.3.0
+ dill==0.4.0
+ pytest==7.2.0
+ PySocks==1.7.1
+ nvidia-curand-cu12==10.3.7.77
+ tokenizers==0.20.1
+ args==0.1.0
+ fairscale==0.4.13
+ peft==0.13.2
+ webdataset==1.0.2
+ huggingface-hub==0.26.1
+ GitPython==3.1.45
+ pytorchvideo==0.1.5
+ scikit-learn==1.5.2
+ bitsandbytes==0.48.1
+ nvidia-cusparse-cu12==12.5.4.2
+ nvidia-cuda-cupti-cu12==12.6.80
+ imageio==2.36.0
+ pydub==0.25.1
+ image-reward==1.5
+ absl-py==2.3.1
+ blessed==1.22.0
+ torchdiffeq==0.2.4
wandb/run-20260124_003511-fnfy86iu/files/wandb-metadata.json ADDED
@@ -0,0 +1,96 @@
+ {
+   "os": "Linux-6.8.0-85-generic-x86_64-with-glibc2.35",
+   "python": "3.10.19",
+   "startedAt": "2026-01-23T16:35:11.374381Z",
+   "args": [
+     "--config",
+     "fastvideo/config_sd/base.py",
+     "--eta_step_list",
+     "0,1,2,3,4,5,6,7",
+     "--eta_step_merge_list",
+     "1,1,1,2,2,2,3,3",
+     "--granular_list",
+     "1",
+     "--num_generations",
+     "4",
+     "--eta",
+     "1.0",
+     "--init_same_noise"
+   ],
+   "program": "/data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/fastvideo/train_g2rpo_sd_merge.py",
+   "codePath": "fastvideo/train_g2rpo_sd_merge.py",
+   "email": "zhangemail1428@163.com",
+   "root": "/data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code",
+   "host": "abc",
+   "username": "zsj",
+   "executable": "/home/zsj/anaconda3/envs/g2rpo/bin/python",
+   "codePathLocal": "fastvideo/train_g2rpo_sd_merge.py",
+   "cpu_count": 48,
+   "cpu_count_logical": 96,
+   "gpu": "NVIDIA RTX 5880 Ada Generation",
+   "gpu_count": 8,
+   "disk": {
+     "/": {
+       "total": "1006773899264",
+       "used": "812103774208"
+     }
+   },
+   "memory": {
+     "total": "540697260032"
+   },
+   "cpu": {
+     "count": 48,
+     "countLogical": 96
+   },
+   "gpu_nvidia": [
+     {
+       "name": "NVIDIA RTX 5880 Ada Generation",
+       "memoryTotal": "51527024640",
+       "cudaCores": 14080,
+       "architecture": "Ada"
+     },
+     {
+       "name": "NVIDIA RTX 5880 Ada Generation",
+       "memoryTotal": "51527024640",
+       "cudaCores": 14080,
+       "architecture": "Ada"
+     },
+     {
+       "name": "NVIDIA RTX 5880 Ada Generation",
+       "memoryTotal": "51527024640",
+       "cudaCores": 14080,
+       "architecture": "Ada"
+     },
+     {
+       "name": "NVIDIA RTX 5880 Ada Generation",
+       "memoryTotal": "51527024640",
+       "cudaCores": 14080,
+       "architecture": "Ada"
+     },
+     {
+       "name": "NVIDIA RTX 5880 Ada Generation",
+       "memoryTotal": "51527024640",
+       "cudaCores": 14080,
+       "architecture": "Ada"
+     },
+     {
+       "name": "NVIDIA RTX 5880 Ada Generation",
+       "memoryTotal": "51527024640",
+       "cudaCores": 14080,
+       "architecture": "Ada"
+     },
+     {
+       "name": "NVIDIA RTX 5880 Ada Generation",
+       "memoryTotal": "51527024640",
+       "cudaCores": 14080,
+       "architecture": "Ada"
+     },
+     {
+       "name": "NVIDIA RTX 5880 Ada Generation",
+       "memoryTotal": "51527024640",
+       "cudaCores": 14080,
+       "architecture": "Ada"
+     }
+   ],
+   "cudaVersion": "12.9"
+ }
wandb/run-20260124_003511-fnfy86iu/files/wandb-summary.json ADDED
@@ -0,0 +1 @@
+ {"_wandb":{"runtime":666}}
wandb/run-20260124_003511-fnfy86iu/logs/debug-core.log ADDED
@@ -0,0 +1,12 @@
+ {"time":"2026-01-24T00:35:09.510417423+08:00","level":"INFO","msg":"started logging, with flags","port-filename":"/tmp/tmpmrd01299/port-583399.txt","pid":583399,"debug":false,"disable-analytics":false}
+ {"time":"2026-01-24T00:35:09.510458378+08:00","level":"INFO","msg":"FeatureState","shutdownOnParentExitEnabled":false}
+ {"time":"2026-01-24T00:35:09.511463258+08:00","level":"INFO","msg":"Will exit if parent process dies.","ppid":583399}
+ {"time":"2026-01-24T00:35:09.511480485+08:00","level":"INFO","msg":"server is running","addr":{"IP":"127.0.0.1","Port":37279,"Zone":""}}
+ {"time":"2026-01-24T00:35:09.680994997+08:00","level":"INFO","msg":"connection: ManageConnectionData: new connection created","id":"127.0.0.1:60944"}
+ {"time":"2026-01-24T00:35:11.378863134+08:00","level":"INFO","msg":"handleInformInit: received","streamId":"fnfy86iu","id":"127.0.0.1:60944"}
+ {"time":"2026-01-24T00:35:11.498973378+08:00","level":"INFO","msg":"handleInformInit: stream started","streamId":"fnfy86iu","id":"127.0.0.1:60944"}
+ {"time":"2026-01-24T00:46:17.507921807+08:00","level":"INFO","msg":"handleInformTeardown: server teardown initiated","id":"127.0.0.1:60944"}
+ {"time":"2026-01-24T00:46:17.508062689+08:00","level":"INFO","msg":"connection: Close: initiating connection closure","id":"127.0.0.1:60944"}
+ {"time":"2026-01-24T00:46:17.508144622+08:00","level":"INFO","msg":"server is shutting down"}
+ {"time":"2026-01-24T00:46:17.50824603+08:00","level":"INFO","msg":"connection: Close: connection successfully closed","id":"127.0.0.1:60944"}
+ {"time":"2026-01-24T00:46:18.435088972+08:00","level":"INFO","msg":"Parent process exited, terminating service process."}
wandb/run-20260124_003511-fnfy86iu/logs/debug-internal.log ADDED
@@ -0,0 +1,14 @@
+ {"time":"2026-01-24T00:35:11.379241849+08:00","level":"INFO","msg":"using version","core version":"0.18.5"}
+ {"time":"2026-01-24T00:35:11.379275289+08:00","level":"INFO","msg":"created symlink","path":"/data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/wandb/run-20260124_003511-fnfy86iu/logs/debug-core.log"}
+ {"time":"2026-01-24T00:35:11.498908689+08:00","level":"INFO","msg":"created new stream","id":"fnfy86iu"}
+ {"time":"2026-01-24T00:35:11.498965971+08:00","level":"INFO","msg":"stream: started","id":"fnfy86iu"}
+ {"time":"2026-01-24T00:35:11.499204509+08:00","level":"INFO","msg":"handler: started","stream_id":{"value":"fnfy86iu"}}
+ {"time":"2026-01-24T00:35:11.499270739+08:00","level":"INFO","msg":"writer: Do: started","stream_id":{"value":"fnfy86iu"}}
+ {"time":"2026-01-24T00:35:11.499381171+08:00","level":"INFO","msg":"sender: started","stream_id":"fnfy86iu"}
+ {"time":"2026-01-24T00:35:12.616857928+08:00","level":"INFO","msg":"Starting system monitor"}
+ {"time":"2026-01-24T00:46:17.508040252+08:00","level":"INFO","msg":"stream: closing","id":"fnfy86iu"}
+ {"time":"2026-01-24T00:46:17.508123223+08:00","level":"INFO","msg":"Stopping system monitor"}
+ {"time":"2026-01-24T00:46:17.509233475+08:00","level":"INFO","msg":"Stopped system monitor"}
+ {"time":"2026-01-24T00:46:17.97992374+08:00","level":"WARN","msg":"No job ingredients found, not creating job artifact"}
+ {"time":"2026-01-24T00:46:17.979956234+08:00","level":"WARN","msg":"No source type found, not creating job artifact"}
+ {"time":"2026-01-24T00:46:17.979968114+08:00","level":"INFO","msg":"sender: sendDefer: no job artifact to save"}
wandb/run-20260124_003511-fnfy86iu/logs/debug.log ADDED
@@ -0,0 +1,27 @@
+ 2026-01-24 00:35:11,371 INFO MainThread:583399 [wandb_setup.py:_flush():79] Current SDK version is 0.18.5
+ 2026-01-24 00:35:11,371 INFO MainThread:583399 [wandb_setup.py:_flush():79] Configure stats pid to 583399
+ 2026-01-24 00:35:11,371 INFO MainThread:583399 [wandb_setup.py:_flush():79] Loading settings from /home/zsj/.config/wandb/settings
+ 2026-01-24 00:35:11,371 INFO MainThread:583399 [wandb_setup.py:_flush():79] Loading settings from /data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/wandb/settings
+ 2026-01-24 00:35:11,372 INFO MainThread:583399 [wandb_setup.py:_flush():79] Loading settings from environment variables: {}
+ 2026-01-24 00:35:11,372 INFO MainThread:583399 [wandb_setup.py:_flush():79] Applying setup settings: {'mode': None, '_disable_service': None}
+ 2026-01-24 00:35:11,372 INFO MainThread:583399 [wandb_setup.py:_flush():79] Inferring run settings from compute environment: {'program_relpath': 'fastvideo/train_g2rpo_sd_merge.py', 'program_abspath': '/data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/fastvideo/train_g2rpo_sd_merge.py', 'program': '/data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/fastvideo/train_g2rpo_sd_merge.py'}
+ 2026-01-24 00:35:11,372 INFO MainThread:583399 [wandb_setup.py:_flush():79] Applying login settings: {}
+ 2026-01-24 00:35:11,372 INFO MainThread:583399 [wandb_init.py:_log_setup():534] Logging user logs to /data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/wandb/run-20260124_003511-fnfy86iu/logs/debug.log
+ 2026-01-24 00:35:11,372 INFO MainThread:583399 [wandb_init.py:_log_setup():535] Logging internal logs to /data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/wandb/run-20260124_003511-fnfy86iu/logs/debug-internal.log
+ 2026-01-24 00:35:11,372 INFO MainThread:583399 [wandb_init.py:init():621] calling init triggers
+ 2026-01-24 00:35:11,372 INFO MainThread:583399 [wandb_init.py:init():628] wandb.init called with sweep_config: {}
+ config: {}
+ 2026-01-24 00:35:11,372 INFO MainThread:583399 [wandb_init.py:init():671] starting backend
+ 2026-01-24 00:35:11,372 INFO MainThread:583399 [wandb_init.py:init():675] sending inform_init request
+ 2026-01-24 00:35:11,373 INFO MainThread:583399 [backend.py:_multiprocessing_setup():104] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
+ 2026-01-24 00:35:11,374 INFO MainThread:583399 [wandb_init.py:init():688] backend started and connected
+ 2026-01-24 00:35:11,376 INFO MainThread:583399 [wandb_init.py:init():783] updated telemetry
+ 2026-01-24 00:35:11,377 INFO MainThread:583399 [wandb_init.py:init():816] communicating run to backend with 90.0 second timeout
+ 2026-01-24 00:35:12,610 INFO MainThread:583399 [wandb_init.py:init():867] starting run threads in backend
+ 2026-01-24 00:35:12,765 INFO MainThread:583399 [wandb_run.py:_console_start():2463] atexit reg
+ 2026-01-24 00:35:12,765 INFO MainThread:583399 [wandb_run.py:_redirect():2311] redirect: wrap_raw
+ 2026-01-24 00:35:12,765 INFO MainThread:583399 [wandb_run.py:_redirect():2376] Wrapping output streams.
+ 2026-01-24 00:35:12,765 INFO MainThread:583399 [wandb_run.py:_redirect():2401] Redirects installed.
+ 2026-01-24 00:35:12,767 INFO MainThread:583399 [wandb_init.py:init():911] run started, returning control to user process
+ 2026-01-24 00:35:12,767 INFO MainThread:583399 [wandb_run.py:_config_callback():1390] config_cb None None {'allow_tf32': True, 'logdir': 'logs', 'mixed_precision': 'bf16', 'num_checkpoint_limit': 5, 'num_epochs': 300, 'pretrained': {'model': './data/StableDiffusion', 'revision': 'main'}, 'prompt_fn': 'imagenet_animals', 'prompt_fn_kwargs': {}, 'resume_from': '', 'reward_fn': 'hpsv2', 'run_name': '2026.01.24_00.34.56', 'sample': {'batch_size': 1, 'eta': 1.0, 'guidance_scale': 5.0, 'num_batches_per_epoch': 2, 'num_steps': 50}, 'save_freq': 20, 'seed': 42, 'train': {'adam_beta1': 0.9, 'adam_beta2': 0.999, 'adam_epsilon': 1e-08, 'adam_weight_decay': 0.0001, 'adv_clip_max': 5, 'batch_size': 1, 'cfg': True, 'clip_range': 0.0001, 'gradient_accumulation_steps': 1, 'learning_rate': 1e-05, 'max_grad_norm': 1.0, 'num_inner_epochs': 1, 'timestep_fraction': 1.0, 'use_8bit_adam': False}, 'use_lora': False}
+ 2026-01-24 00:46:17,508 WARNING MsgRouterThr:583399 [router.py:message_loop():77] message_loop has been closed
wandb/run-20260124_022230-0y3z9z7o/files/config.yaml ADDED
@@ -0,0 +1,87 @@
+ _wandb:
+   value:
+     cli_version: 0.18.5
+     m: []
+     python_version: 3.10.19
+     t:
+       "1":
+         - 1
+         - 11
+         - 41
+         - 49
+         - 55
+         - 71
+         - 83
+         - 98
+       "2":
+         - 1
+         - 11
+         - 41
+         - 49
+         - 55
+         - 63
+         - 71
+         - 83
+         - 98
+       "3":
+         - 13
+         - 23
+         - 55
+       "4": 3.10.19
+       "5": 0.18.5
+       "6": 4.46.1
+       "8":
+         - 5
+       "12": 0.18.5
+       "13": linux-x86_64
+ allow_tf32:
+   value: true
+ logdir:
+   value: logs
+ mixed_precision:
+   value: bf16
+ num_checkpoint_limit:
+   value: 5
+ num_epochs:
+   value: 300
+ pretrained:
+   value:
+     model: ./data/StableDiffusion
+     revision: main
+ prompt_fn:
+   value: imagenet_animals
+ resume_from:
+   value: ""
+ reward_fn:
+   value: hpsv2
+ run_name:
+   value: 2026.01.24_02.22.28
+ sample:
+   value:
+     batch_size: 1
+     eta: 1
+     guidance_scale: 5
+     num_batches_per_epoch: 2
+     num_steps: 50
+ save_freq:
+   value: 20
+ seed:
+   value: 42
+ train:
+   value:
+     adam_beta1: 0.9
+     adam_beta2: 0.999
+     adam_epsilon: 1e-08
+     adam_weight_decay: 0.0001
+     adv_clip_max: 5
+     batch_size: 1
+     cfg: true
+     clip_range: 0.0001
+     gradient_accumulation_steps: 1
+     learning_rate: 1e-05
+     max_grad_norm: 1
+     num_inner_epochs: 1
+     timestep_fraction: 1
+     use_8bit_adam: false
+ use_lora:
+   value: false
wandb/run-20260124_022230-0y3z9z7o/files/output.log ADDED
@@ -0,0 +1,94 @@
+ I0124 02:22:31.450613 138092014643008 train_g2rpo_sd_merge.py:465]
+ allow_tf32: true
+ logdir: logs
+ mixed_precision: bf16
+ num_checkpoint_limit: 5
+ num_epochs: 300
+ pretrained:
+   model: ./data/StableDiffusion
+   revision: main
+ prompt_fn: imagenet_animals
+ prompt_fn_kwargs: {}
+ resume_from: ''
+ reward_fn: hpsv2
+ run_name: 2026.01.24_02.22.28
+ sample:
+   batch_size: 1
+   eta: 1.0
+   guidance_scale: 5.0
+   num_batches_per_epoch: 2
+   num_steps: 50
+ save_freq: 20
+ seed: 42
+ train:
+   adam_beta1: 0.9
+   adam_beta2: 0.999
+   adam_epsilon: 1.0e-08
+   adam_weight_decay: 0.0001
+   adv_clip_max: 5
+   batch_size: 1
+   cfg: true
+   clip_range: 0.0001
+   gradient_accumulation_steps: 1
+   learning_rate: 1.0e-05
+   max_grad_norm: 1.0
+   num_inner_epochs: 1
+   timestep_fraction: 1.0
+   use_8bit_adam: false
+ use_lora: false
+
+ Loading pipeline components...: 100%|███████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:02<00:00, 2.47it/s]
+ /home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers
+   warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning)
+ I0124 02:22:34.955836 138092014643008 factory.py:159] Loaded ViT-H-14 model config.
+ I0124 02:22:40.351596 138092014643008 factory.py:207] Loading pretrained ViT-H-14 weights (./data/hps/open_clip_pytorch_model.bin).
+ Traceback (most recent call last):
+   File "/data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/fastvideo/train_g2rpo_sd_merge.py", line 930, in <module>
+     app.run(main)
+   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/absl/app.py", line 316, in run
+     _run_main(main, args)
+   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/absl/app.py", line 261, in _run_main
+     sys.exit(main(argv))
+   File "/data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/fastvideo/train_g2rpo_sd_merge.py", line 603, in main
+     unet, optimizer = accelerator.prepare(unet, optimizer)
+   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/accelerate/accelerator.py", line 1350, in prepare
+     result = tuple(
+   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/accelerate/accelerator.py", line 1351, in <genexpr>
+     self._prepare_one(obj, first_pass=True, device_placement=d) for obj, d in zip(args, device_placement)
+   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/accelerate/accelerator.py", line 1226, in _prepare_one
+     return self.prepare_model(obj, device_placement=device_placement)
+   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/accelerate/accelerator.py", line 1477, in prepare_model
+     model = torch.nn.parallel.DistributedDataParallel(
+   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 858, in __init__
+     _verify_param_shape_across_processes(self.process_group, parameters)
+   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/torch/distributed/utils.py", line 281, in _verify_param_shape_across_processes
+     return dist._verify_params_across_processes(process_group, tensors, logger)
+ torch.distributed.DistBackendError: NCCL error in: /pytorch/torch/csrc/distributed/c10d/NCCLUtils.cpp:94, invalid usage (run with NCCL_DEBUG=WARN for details), NCCL version 2.27.5
+ ncclInvalidUsage: This usually reflects invalid usage of NCCL library.
+ Last error:
+ Duplicate GPU detected : rank 0 and rank 6 both on CUDA device 2c000
+ [rank0]: Traceback (most recent call last):
+ [rank0]:   File "/data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/fastvideo/train_g2rpo_sd_merge.py", line 930, in <module>
+ [rank0]:     app.run(main)
+ [rank0]:   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/absl/app.py", line 316, in run
+ [rank0]:     _run_main(main, args)
+ [rank0]:   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/absl/app.py", line 261, in _run_main
+ [rank0]:     sys.exit(main(argv))
+ [rank0]:   File "/data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/fastvideo/train_g2rpo_sd_merge.py", line 603, in main
+ [rank0]:     unet, optimizer = accelerator.prepare(unet, optimizer)
+ [rank0]:   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/accelerate/accelerator.py", line 1350, in prepare
+ [rank0]:     result = tuple(
+ [rank0]:   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/accelerate/accelerator.py", line 1351, in <genexpr>
+ [rank0]:     self._prepare_one(obj, first_pass=True, device_placement=d) for obj, d in zip(args, device_placement)
+ [rank0]:   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/accelerate/accelerator.py", line 1226, in _prepare_one
+ [rank0]:     return self.prepare_model(obj, device_placement=device_placement)
+ [rank0]:   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/accelerate/accelerator.py", line 1477, in prepare_model
+ [rank0]:     model = torch.nn.parallel.DistributedDataParallel(
+ [rank0]:   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 858, in __init__
+ [rank0]:     _verify_param_shape_across_processes(self.process_group, parameters)
+ [rank0]:   File "/home/zsj/anaconda3/envs/g2rpo/lib/python3.10/site-packages/torch/distributed/utils.py", line 281, in _verify_param_shape_across_processes
+ [rank0]:     return dist._verify_params_across_processes(process_group, tensors, logger)
+ [rank0]: torch.distributed.DistBackendError: NCCL error in: /pytorch/torch/csrc/distributed/c10d/NCCLUtils.cpp:94, invalid usage (run with NCCL_DEBUG=WARN for details), NCCL version 2.27.5
+ [rank0]: ncclInvalidUsage: This usually reflects invalid usage of NCCL library.
+ [rank0]: Last error:
+ [rank0]: Duplicate GPU detected : rank 0 and rank 6 both on CUDA device 2c000
wandb/run-20260124_022230-0y3z9z7o/files/requirements.txt ADDED
@@ -0,0 +1,189 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
+ scipy==1.13.0
+ regex==2024.9.11
+ sentencepiece==0.2.0
+ six==1.16.0
+ anyio==4.11.0
+ nvidia-cuda-nvrtc-cu12==12.6.77
+ scikit-video==1.1.11
+ platformdirs==4.5.0
+ mypy==1.11.1
+ ruff==0.6.5
+ charset-normalizer==3.4.4
+ torch==2.9.0+cu126
+ av==13.1.0
+ pillow==10.2.0
+ gpustat==1.1.1
+ torchvision==0.24.0+cu126
+ multidict==6.7.0
+ torchmetrics==1.5.1
+ aiohttp==3.13.1
+ transformers==4.46.1
+ decord==0.6.0
+ wcwidth==0.2.14
+ sphinx-lint==1.0.0
+ nvidia-cuda-runtime-cu12==12.6.77
+ pytz==2025.2
+ codespell==2.3.0
+ hpsv2==1.2.0
+ mypy_extensions==1.1.0
+ numpy==1.26.3
+ omegaconf==2.3.0
+ Markdown==3.9
+ tzdata==2025.2
+ pandas==2.2.3
+ pytorch-lightning==2.4.0
+ aiosignal==1.4.0
+ aiohappyeyeballs==2.6.1
+ python-dateutil==2.9.0.post0
+ seaborn==0.13.2
+ beautifulsoup4==4.12.3
+ isort==5.13.2
+ httpx==0.28.1
+ certifi==2025.10.5
+ ml_collections==1.1.0
+ nvidia-cudnn-cu12==9.10.2.21
+ hf-xet==1.2.0
+ requests==2.31.0
+ inflect==6.0.4
+ iniconfig==2.1.0
+ braceexpand==0.1.7
+ h5py==3.12.1
+ wandb==0.18.5
+ protobuf==3.20.3
+ ninja==1.13.0
+ kiwisolver==1.4.9
+ networkx==3.3
+ packaging==25.0
+ fvcore==0.1.5.post20221221
+ pyparsing==3.2.5
+ starlette==0.41.3
+ frozenlist==1.8.0
+ docker-pycreds==0.4.0
+ Werkzeug==3.1.3
+ MarkupSafe==2.1.5
+ einops==0.8.0
+ sentry-sdk==2.42.0
+ PyYAML==6.0.1
+ nvidia-nccl-cu12==2.27.5
+ datasets==4.3.0
+ polib==1.2.0
+ safetensors==0.6.2
+ async-timeout==5.0.1
+ setproctitle==1.3.7
+ clint==0.5.1
+ matplotlib==3.9.2
+ propcache==0.4.1
+ termcolor==3.1.0
+ antlr4-python3-runtime==4.9.3
+ cycler==0.12.1
+ fastvideo==1.2.0
+ toml==0.10.2
+ xxhash==3.6.0
+ wheel==0.44.0
+ albumentations==1.4.20
+ fastapi==0.115.3
+ nvidia-cufft-cu12==11.3.0.4
+ yarl==1.22.0
+ psutil==7.1.0
+ tensorboard-data-server==0.7.2
+ pydantic==2.9.2
+ nvidia-nvtx-cu12==12.6.77
+ portalocker==3.2.0
+ triton==3.5.0
+ annotated-types==0.7.0
+ proglog==0.1.12
+ nvidia-cusparselt-cu12==0.7.1
+ yapf==0.32.0
+ Jinja2==3.1.6
+ types-requests==2.32.4.20250913
+ lightning-utilities==0.15.2
+ grpcio==1.75.1
+ uvicorn==0.32.0
+ typing_extensions==4.15.0
+ nvidia-nvjitlink-cu12==12.6.85
+ watch==0.2.7
+ moviepy==1.0.3
+ timm==1.0.11
+ pytest-split==0.8.0
+ gdown==5.2.0
+ types-setuptools==80.9.0.20250822
+ nvidia-cusolver-cu12==11.7.1.2
+ types-PyYAML==6.0.12.20250915
+ pip==25.2
+ qwen-vl-utils==0.0.14
+ soupsieve==2.8
+ zipp==3.23.0
+ flash_attn==2.8.3
+ yacs==0.1.8
+ diffusers==0.32.0
+ pluggy==1.6.0
+ opencv-python-headless==4.11.0.86
+ mpmath==1.3.0
+ test_tube==0.7.5
+ stringzilla==4.2.1
+ fonttools==4.60.1
+ nvidia-ml-py==13.580.82
+ parameterized==0.9.0
+ loguru==0.7.3
+ tabulate==0.9.0
+ idna==3.6
+ iopath==0.1.10
+ decorator==4.4.2
+ nvidia-cufile-cu12==1.11.1.6
+ threadpoolctl==3.6.0
+ pyarrow==21.0.0
+ httpcore==1.0.9
+ hydra-core==1.3.2
+ multiprocess==0.70.16
+ contourpy==1.3.2
+ clip==1.0
+ tqdm==4.66.5
+ open_clip_torch==3.2.0
+ accelerate==1.0.1
+ gitdb==4.0.12
+ importlib_metadata==8.7.0
+ nvidia-cublas-cu12==12.6.4.1
+ h11==0.16.0
+ filelock==3.19.1
+ liger_kernel==0.4.1
+ click==8.3.0
+ urllib3==2.2.0
+ imageio-ffmpeg==0.5.1
+ setuptools==80.9.0
+ joblib==1.5.2
+ tensorboard==2.20.0
+ attrs==25.4.0
+ future==1.0.0
+ albucore==0.0.19
+ fsspec==2025.9.0
+ sympy==1.14.0
+ eval_type_backport==0.2.2
+ pydantic_core==2.23.4
+ sniffio==1.3.1
+ nvidia-nvshmem-cu12==3.3.20
+ exceptiongroup==1.3.0
+ smmap==5.0.2
+ tomli==2.0.2
+ ftfy==6.3.0
+ dill==0.4.0
+ pytest==7.2.0
+ PySocks==1.7.1
+ nvidia-curand-cu12==10.3.7.77
+ tokenizers==0.20.1
+ args==0.1.0
+ fairscale==0.4.13
+ peft==0.13.2
+ webdataset==1.0.2
+ huggingface-hub==0.26.1
+ GitPython==3.1.45
+ pytorchvideo==0.1.5
+ scikit-learn==1.5.2
+ bitsandbytes==0.48.1
+ nvidia-cusparse-cu12==12.5.4.2
+ nvidia-cuda-cupti-cu12==12.6.80
+ imageio==2.36.0
+ pydub==0.25.1
+ image-reward==1.5
+ absl-py==2.3.1
+ blessed==1.22.0
+ torchdiffeq==0.2.4
wandb/run-20260124_022230-0y3z9z7o/files/wandb-metadata.json ADDED
@@ -0,0 +1,96 @@
+ {
+ "os": "Linux-6.8.0-85-generic-x86_64-with-glibc2.35",
+ "python": "3.10.19",
+ "startedAt": "2026-01-23T18:22:30.277742Z",
+ "args": [
+ "--config",
+ "fastvideo/config_sd/base.py",
+ "--eta_step_list",
+ "0,1,2,3,4,5,6,7",
+ "--eta_step_merge_list",
+ "1,1,1,2,2,2,3,3",
+ "--granular_list",
+ "1",
+ "--num_generations",
+ "4",
+ "--eta",
+ "1.0",
+ "--init_same_noise"
+ ],
+ "program": "/data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/fastvideo/train_g2rpo_sd_merge.py",
+ "codePath": "fastvideo/train_g2rpo_sd_merge.py",
+ "email": "zhangemail1428@163.com",
+ "root": "/data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code",
+ "host": "abc",
+ "username": "zsj",
+ "executable": "/home/zsj/anaconda3/envs/g2rpo/bin/python",
+ "codePathLocal": "fastvideo/train_g2rpo_sd_merge.py",
+ "cpu_count": 48,
+ "cpu_count_logical": 96,
+ "gpu": "NVIDIA RTX 5880 Ada Generation",
+ "gpu_count": 8,
+ "disk": {
+ "/": {
+ "total": "1006773899264",
+ "used": "813053333504"
+ }
+ },
+ "memory": {
+ "total": "540697260032"
+ },
+ "cpu": {
+ "count": 48,
+ "countLogical": 96
+ },
+ "gpu_nvidia": [
+ {
+ "name": "NVIDIA RTX 5880 Ada Generation",
+ "memoryTotal": "51527024640",
+ "cudaCores": 14080,
+ "architecture": "Ada"
+ },
+ {
+ "name": "NVIDIA RTX 5880 Ada Generation",
+ "memoryTotal": "51527024640",
+ "cudaCores": 14080,
+ "architecture": "Ada"
+ },
+ {
+ "name": "NVIDIA RTX 5880 Ada Generation",
+ "memoryTotal": "51527024640",
+ "cudaCores": 14080,
+ "architecture": "Ada"
+ },
+ {
+ "name": "NVIDIA RTX 5880 Ada Generation",
+ "memoryTotal": "51527024640",
+ "cudaCores": 14080,
+ "architecture": "Ada"
+ },
+ {
+ "name": "NVIDIA RTX 5880 Ada Generation",
+ "memoryTotal": "51527024640",
+ "cudaCores": 14080,
+ "architecture": "Ada"
+ },
+ {
+ "name": "NVIDIA RTX 5880 Ada Generation",
+ "memoryTotal": "51527024640",
+ "cudaCores": 14080,
+ "architecture": "Ada"
+ },
+ {
+ "name": "NVIDIA RTX 5880 Ada Generation",
+ "memoryTotal": "51527024640",
+ "cudaCores": 14080,
+ "architecture": "Ada"
+ },
+ {
+ "name": "NVIDIA RTX 5880 Ada Generation",
+ "memoryTotal": "51527024640",
+ "cudaCores": 14080,
+ "architecture": "Ada"
+ }
+ ],
+ "cudaVersion": "12.9"
+ }
wandb/run-20260124_022230-0y3z9z7o/files/wandb-summary.json ADDED
@@ -0,0 +1 @@
+ {"_wandb":{"runtime":15}}
wandb/run-20260124_022230-0y3z9z7o/logs/debug-core.log ADDED
@@ -0,0 +1,12 @@
+ {"time":"2026-01-24T02:22:29.302091572+08:00","level":"INFO","msg":"started logging, with flags","port-filename":"/tmp/tmptjgtjg7f/port-608086.txt","pid":608086,"debug":false,"disable-analytics":false}
+ {"time":"2026-01-24T02:22:29.302119596+08:00","level":"INFO","msg":"FeatureState","shutdownOnParentExitEnabled":false}
+ {"time":"2026-01-24T02:22:29.30272344+08:00","level":"INFO","msg":"Will exit if parent process dies.","ppid":608086}
+ {"time":"2026-01-24T02:22:29.302734848+08:00","level":"INFO","msg":"server is running","addr":{"IP":"127.0.0.1","Port":42453,"Zone":""}}
+ {"time":"2026-01-24T02:22:29.492955085+08:00","level":"INFO","msg":"connection: ManageConnectionData: new connection created","id":"127.0.0.1:51798"}
+ {"time":"2026-01-24T02:22:30.281666201+08:00","level":"INFO","msg":"handleInformInit: received","streamId":"0y3z9z7o","id":"127.0.0.1:51798"}
+ {"time":"2026-01-24T02:22:30.394942882+08:00","level":"INFO","msg":"handleInformInit: stream started","streamId":"0y3z9z7o","id":"127.0.0.1:51798"}
+ {"time":"2026-01-24T02:22:45.992004001+08:00","level":"INFO","msg":"handleInformTeardown: server teardown initiated","id":"127.0.0.1:51798"}
+ {"time":"2026-01-24T02:22:45.992302576+08:00","level":"INFO","msg":"server is shutting down"}
+ {"time":"2026-01-24T02:22:45.992296318+08:00","level":"INFO","msg":"connection: Close: initiating connection closure","id":"127.0.0.1:51798"}
+ {"time":"2026-01-24T02:22:45.992713821+08:00","level":"INFO","msg":"connection: Close: connection successfully closed","id":"127.0.0.1:51798"}
+ {"time":"2026-01-24T02:22:48.165257156+08:00","level":"INFO","msg":"Parent process exited, terminating service process."}
wandb/run-20260124_022230-0y3z9z7o/logs/debug-internal.log ADDED
@@ -0,0 +1,15 @@
+ {"time":"2026-01-24T02:22:30.281841189+08:00","level":"INFO","msg":"using version","core version":"0.18.5"}
+ {"time":"2026-01-24T02:22:30.281861284+08:00","level":"INFO","msg":"created symlink","path":"/data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/wandb/run-20260124_022230-0y3z9z7o/logs/debug-core.log"}
+ {"time":"2026-01-24T02:22:30.394846024+08:00","level":"INFO","msg":"created new stream","id":"0y3z9z7o"}
+ {"time":"2026-01-24T02:22:30.394931982+08:00","level":"INFO","msg":"stream: started","id":"0y3z9z7o"}
+ {"time":"2026-01-24T02:22:30.395106768+08:00","level":"INFO","msg":"sender: started","stream_id":"0y3z9z7o"}
+ {"time":"2026-01-24T02:22:30.395039138+08:00","level":"INFO","msg":"handler: started","stream_id":{"value":"0y3z9z7o"}}
+ {"time":"2026-01-24T02:22:30.395033137+08:00","level":"INFO","msg":"writer: Do: started","stream_id":{"value":"0y3z9z7o"}}
+ {"time":"2026-01-24T02:22:31.287570308+08:00","level":"INFO","msg":"Starting system monitor"}
+ {"time":"2026-01-24T02:22:45.992135089+08:00","level":"INFO","msg":"stream: closing","id":"0y3z9z7o"}
+ {"time":"2026-01-24T02:22:45.992197139+08:00","level":"INFO","msg":"Stopping system monitor"}
+ {"time":"2026-01-24T02:22:45.995895301+08:00","level":"INFO","msg":"Stopped system monitor"}
+ {"time":"2026-01-24T02:22:46.363069461+08:00","level":"WARN","msg":"No job ingredients found, not creating job artifact"}
+ {"time":"2026-01-24T02:22:46.363103824+08:00","level":"WARN","msg":"No source type found, not creating job artifact"}
+ {"time":"2026-01-24T02:22:46.363114999+08:00","level":"INFO","msg":"sender: sendDefer: no job artifact to save"}
+ {"time":"2026-01-24T02:22:47.353967974+08:00","level":"INFO","msg":"fileTransfer: Close: file transfer manager closed"}
wandb/run-20260124_022230-0y3z9z7o/logs/debug.log ADDED
@@ -0,0 +1,27 @@
+ 2026-01-24 02:22:30,273 INFO MainThread:608086 [wandb_setup.py:_flush():79] Current SDK version is 0.18.5
+ 2026-01-24 02:22:30,274 INFO MainThread:608086 [wandb_setup.py:_flush():79] Configure stats pid to 608086
+ 2026-01-24 02:22:30,274 INFO MainThread:608086 [wandb_setup.py:_flush():79] Loading settings from /home/zsj/.config/wandb/settings
+ 2026-01-24 02:22:30,274 INFO MainThread:608086 [wandb_setup.py:_flush():79] Loading settings from /data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/wandb/settings
+ 2026-01-24 02:22:30,274 INFO MainThread:608086 [wandb_setup.py:_flush():79] Loading settings from environment variables: {}
+ 2026-01-24 02:22:30,274 INFO MainThread:608086 [wandb_setup.py:_flush():79] Applying setup settings: {'mode': None, '_disable_service': None}
+ 2026-01-24 02:22:30,274 INFO MainThread:608086 [wandb_setup.py:_flush():79] Inferring run settings from compute environment: {'program_relpath': 'fastvideo/train_g2rpo_sd_merge.py', 'program_abspath': '/data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/fastvideo/train_g2rpo_sd_merge.py', 'program': '/data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/fastvideo/train_g2rpo_sd_merge.py'}
+ 2026-01-24 02:22:30,274 INFO MainThread:608086 [wandb_setup.py:_flush():79] Applying login settings: {}
+ 2026-01-24 02:22:30,274 INFO MainThread:608086 [wandb_init.py:_log_setup():534] Logging user logs to /data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/wandb/run-20260124_022230-0y3z9z7o/logs/debug.log
+ 2026-01-24 02:22:30,274 INFO MainThread:608086 [wandb_init.py:_log_setup():535] Logging internal logs to /data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/wandb/run-20260124_022230-0y3z9z7o/logs/debug-internal.log
+ 2026-01-24 02:22:30,274 INFO MainThread:608086 [wandb_init.py:init():621] calling init triggers
+ 2026-01-24 02:22:30,274 INFO MainThread:608086 [wandb_init.py:init():628] wandb.init called with sweep_config: {}
+ config: {}
+ 2026-01-24 02:22:30,274 INFO MainThread:608086 [wandb_init.py:init():671] starting backend
+ 2026-01-24 02:22:30,274 INFO MainThread:608086 [wandb_init.py:init():675] sending inform_init request
+ 2026-01-24 02:22:30,276 INFO MainThread:608086 [backend.py:_multiprocessing_setup():104] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
+ 2026-01-24 02:22:30,277 INFO MainThread:608086 [wandb_init.py:init():688] backend started and connected
+ 2026-01-24 02:22:30,282 INFO MainThread:608086 [wandb_init.py:init():783] updated telemetry
+ 2026-01-24 02:22:30,283 INFO MainThread:608086 [wandb_init.py:init():816] communicating run to backend with 90.0 second timeout
+ 2026-01-24 02:22:31,277 INFO MainThread:608086 [wandb_init.py:init():867] starting run threads in backend
+ 2026-01-24 02:22:31,446 INFO MainThread:608086 [wandb_run.py:_console_start():2463] atexit reg
+ 2026-01-24 02:22:31,446 INFO MainThread:608086 [wandb_run.py:_redirect():2311] redirect: wrap_raw
+ 2026-01-24 02:22:31,447 INFO MainThread:608086 [wandb_run.py:_redirect():2376] Wrapping output streams.
+ 2026-01-24 02:22:31,447 INFO MainThread:608086 [wandb_run.py:_redirect():2401] Redirects installed.
+ 2026-01-24 02:22:31,448 INFO MainThread:608086 [wandb_init.py:init():911] run started, returning control to user process
+ 2026-01-24 02:22:31,448 INFO MainThread:608086 [wandb_run.py:_config_callback():1390] config_cb None None {'allow_tf32': True, 'logdir': 'logs', 'mixed_precision': 'bf16', 'num_checkpoint_limit': 5, 'num_epochs': 300, 'pretrained': {'model': './data/StableDiffusion', 'revision': 'main'}, 'prompt_fn': 'imagenet_animals', 'prompt_fn_kwargs': {}, 'resume_from': '', 'reward_fn': 'hpsv2', 'run_name': '2026.01.24_02.22.28', 'sample': {'batch_size': 1, 'eta': 1.0, 'guidance_scale': 5.0, 'num_batches_per_epoch': 2, 'num_steps': 50}, 'save_freq': 20, 'seed': 42, 'train': {'adam_beta1': 0.9, 'adam_beta2': 0.999, 'adam_epsilon': 1e-08, 'adam_weight_decay': 0.0001, 'adv_clip_max': 5, 'batch_size': 1, 'cfg': True, 'clip_range': 0.0001, 'gradient_accumulation_steps': 1, 'learning_rate': 1e-05, 'max_grad_norm': 1.0, 'num_inner_epochs': 1, 'timestep_fraction': 1.0, 'use_8bit_adam': False}, 'use_lora': False}
+ 2026-01-24 02:22:45,992 WARNING MsgRouterThr:608086 [router.py:message_loop():77] message_loop has been closed
wandb/run-20260124_105101-s3i4k862/files/wandb-metadata.json ADDED
@@ -0,0 +1,96 @@
+ {
+ "os": "Linux-6.8.0-85-generic-x86_64-with-glibc2.35",
+ "python": "3.10.19",
+ "startedAt": "2026-01-24T02:51:01.789219Z",
+ "args": [
+ "--config",
+ "fastvideo/config_sd/base.py",
+ "--eta_step_list",
+ "0,1,2,3,4,5,6,7",
+ "--eta_step_merge_list",
+ "1,1,1,2,2,2,3,3",
+ "--granular_list",
+ "1",
+ "--num_generations",
+ "4",
+ "--eta",
+ "1.0",
+ "--init_same_noise"
+ ],
+ "program": "/data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/fastvideo/train_g2rpo_sd_merge.py",
+ "codePath": "fastvideo/train_g2rpo_sd_merge.py",
+ "email": "zhangemail1428@163.com",
+ "root": "/data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code",
+ "host": "abc",
+ "username": "zsj",
+ "executable": "/home/zsj/anaconda3/envs/g2rpo/bin/python",
+ "codePathLocal": "fastvideo/train_g2rpo_sd_merge.py",
+ "cpu_count": 48,
+ "cpu_count_logical": 96,
+ "gpu": "NVIDIA RTX 5880 Ada Generation",
+ "gpu_count": 8,
+ "disk": {
+ "/": {
+ "total": "1006773899264",
+ "used": "811835744256"
+ }
+ },
+ "memory": {
+ "total": "540697260032"
+ },
+ "cpu": {
+ "count": 48,
+ "countLogical": 96
+ },
+ "gpu_nvidia": [
+ {
+ "name": "NVIDIA RTX 5880 Ada Generation",
+ "memoryTotal": "51527024640",
+ "cudaCores": 14080,
+ "architecture": "Ada"
+ },
+ {
+ "name": "NVIDIA RTX 5880 Ada Generation",
+ "memoryTotal": "51527024640",
+ "cudaCores": 14080,
+ "architecture": "Ada"
+ },
+ {
+ "name": "NVIDIA RTX 5880 Ada Generation",
+ "memoryTotal": "51527024640",
+ "cudaCores": 14080,
+ "architecture": "Ada"
+ },
+ {
+ "name": "NVIDIA RTX 5880 Ada Generation",
+ "memoryTotal": "51527024640",
+ "cudaCores": 14080,
+ "architecture": "Ada"
+ },
+ {
+ "name": "NVIDIA RTX 5880 Ada Generation",
+ "memoryTotal": "51527024640",
+ "cudaCores": 14080,
+ "architecture": "Ada"
+ },
+ {
+ "name": "NVIDIA RTX 5880 Ada Generation",
+ "memoryTotal": "51527024640",
+ "cudaCores": 14080,
+ "architecture": "Ada"
+ },
+ {
+ "name": "NVIDIA RTX 5880 Ada Generation",
+ "memoryTotal": "51527024640",
+ "cudaCores": 14080,
+ "architecture": "Ada"
+ },
+ {
+ "name": "NVIDIA RTX 5880 Ada Generation",
+ "memoryTotal": "51527024640",
+ "cudaCores": 14080,
+ "architecture": "Ada"
+ }
+ ],
+ "cudaVersion": "12.9"
+ }
wandb/run-20260124_105101-s3i4k862/logs/debug-core.log ADDED
@@ -0,0 +1,12 @@
+ {"time":"2026-01-24T10:51:00.7426538+08:00","level":"INFO","msg":"started logging, with flags","port-filename":"/tmp/tmpwxlv3x4b/port-694321.txt","pid":694321,"debug":false,"disable-analytics":false}
+ {"time":"2026-01-24T10:51:00.742686139+08:00","level":"INFO","msg":"FeatureState","shutdownOnParentExitEnabled":false}
+ {"time":"2026-01-24T10:51:00.743491711+08:00","level":"INFO","msg":"server is running","addr":{"IP":"127.0.0.1","Port":35647,"Zone":""}}
+ {"time":"2026-01-24T10:51:00.743589092+08:00","level":"INFO","msg":"Will exit if parent process dies.","ppid":694321}
+ {"time":"2026-01-24T10:51:00.933222669+08:00","level":"INFO","msg":"connection: ManageConnectionData: new connection created","id":"127.0.0.1:33952"}
+ {"time":"2026-01-24T10:51:01.795205328+08:00","level":"INFO","msg":"handleInformInit: received","streamId":"s3i4k862","id":"127.0.0.1:33952"}
+ {"time":"2026-01-24T10:51:01.911996105+08:00","level":"INFO","msg":"handleInformInit: stream started","streamId":"s3i4k862","id":"127.0.0.1:33952"}
+ {"time":"2026-01-24T11:02:19.490190123+08:00","level":"INFO","msg":"handleInformTeardown: server teardown initiated","id":"127.0.0.1:33952"}
+ {"time":"2026-01-24T11:02:19.490498292+08:00","level":"INFO","msg":"server is shutting down"}
+ {"time":"2026-01-24T11:02:19.490490222+08:00","level":"INFO","msg":"connection: Close: initiating connection closure","id":"127.0.0.1:33952"}
+ {"time":"2026-01-24T11:02:19.491527379+08:00","level":"INFO","msg":"connection: Close: connection successfully closed","id":"127.0.0.1:33952"}
+ {"time":"2026-01-24T11:02:20.129116951+08:00","level":"INFO","msg":"Parent process exited, terminating service process."}
wandb/run-20260124_105101-s3i4k862/logs/debug-internal.log ADDED
@@ -0,0 +1,14 @@
+ {"time":"2026-01-24T10:51:01.795371252+08:00","level":"INFO","msg":"using version","core version":"0.18.5"}
+ {"time":"2026-01-24T10:51:01.795385102+08:00","level":"INFO","msg":"created symlink","path":"/data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/wandb/run-20260124_105101-s3i4k862/logs/debug-core.log"}
+ {"time":"2026-01-24T10:51:01.911927136+08:00","level":"INFO","msg":"created new stream","id":"s3i4k862"}
+ {"time":"2026-01-24T10:51:01.911986864+08:00","level":"INFO","msg":"stream: started","id":"s3i4k862"}
+ {"time":"2026-01-24T10:51:01.912277115+08:00","level":"INFO","msg":"sender: started","stream_id":"s3i4k862"}
+ {"time":"2026-01-24T10:51:01.912165824+08:00","level":"INFO","msg":"writer: Do: started","stream_id":{"value":"s3i4k862"}}
+ {"time":"2026-01-24T10:51:01.912358876+08:00","level":"INFO","msg":"handler: started","stream_id":{"value":"s3i4k862"}}
+ {"time":"2026-01-24T10:51:03.472265752+08:00","level":"INFO","msg":"Starting system monitor"}
+ {"time":"2026-01-24T11:02:19.490516218+08:00","level":"INFO","msg":"stream: closing","id":"s3i4k862"}
+ {"time":"2026-01-24T11:02:19.490615109+08:00","level":"INFO","msg":"Stopping system monitor"}
+ {"time":"2026-01-24T11:02:19.492503467+08:00","level":"INFO","msg":"Stopped system monitor"}
+ {"time":"2026-01-24T11:02:19.786591052+08:00","level":"WARN","msg":"No job ingredients found, not creating job artifact"}
+ {"time":"2026-01-24T11:02:19.786627546+08:00","level":"WARN","msg":"No source type found, not creating job artifact"}
+ {"time":"2026-01-24T11:02:19.786641103+08:00","level":"INFO","msg":"sender: sendDefer: no job artifact to save"}
wandb/run-20260124_105101-s3i4k862/logs/debug.log ADDED
@@ -0,0 +1,27 @@
+ 2026-01-24 10:51:01,786 INFO MainThread:694321 [wandb_setup.py:_flush():79] Current SDK version is 0.18.5
+ 2026-01-24 10:51:01,786 INFO MainThread:694321 [wandb_setup.py:_flush():79] Configure stats pid to 694321
+ 2026-01-24 10:51:01,786 INFO MainThread:694321 [wandb_setup.py:_flush():79] Loading settings from /home/zsj/.config/wandb/settings
+ 2026-01-24 10:51:01,786 INFO MainThread:694321 [wandb_setup.py:_flush():79] Loading settings from /data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/wandb/settings
+ 2026-01-24 10:51:01,786 INFO MainThread:694321 [wandb_setup.py:_flush():79] Loading settings from environment variables: {}
+ 2026-01-24 10:51:01,786 INFO MainThread:694321 [wandb_setup.py:_flush():79] Applying setup settings: {'mode': None, '_disable_service': None}
+ 2026-01-24 10:51:01,786 INFO MainThread:694321 [wandb_setup.py:_flush():79] Inferring run settings from compute environment: {'program_relpath': 'fastvideo/train_g2rpo_sd_merge.py', 'program_abspath': '/data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/fastvideo/train_g2rpo_sd_merge.py', 'program': '/data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/fastvideo/train_g2rpo_sd_merge.py'}
+ 2026-01-24 10:51:01,786 INFO MainThread:694321 [wandb_setup.py:_flush():79] Applying login settings: {}
+ 2026-01-24 10:51:01,786 INFO MainThread:694321 [wandb_init.py:_log_setup():534] Logging user logs to /data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/wandb/run-20260124_105101-s3i4k862/logs/debug.log
+ 2026-01-24 10:51:01,786 INFO MainThread:694321 [wandb_init.py:_log_setup():535] Logging internal logs to /data1/zsj/SceneDPO/Rebuttal/E-GRPO/scoure_code/wandb/run-20260124_105101-s3i4k862/logs/debug-internal.log
+ 2026-01-24 10:51:01,786 INFO MainThread:694321 [wandb_init.py:init():621] calling init triggers
+ 2026-01-24 10:51:01,786 INFO MainThread:694321 [wandb_init.py:init():628] wandb.init called with sweep_config: {}
+ config: {}
+ 2026-01-24 10:51:01,786 INFO MainThread:694321 [wandb_init.py:init():671] starting backend
+ 2026-01-24 10:51:01,786 INFO MainThread:694321 [wandb_init.py:init():675] sending inform_init request
+ 2026-01-24 10:51:01,788 INFO MainThread:694321 [backend.py:_multiprocessing_setup():104] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
+ 2026-01-24 10:51:01,789 INFO MainThread:694321 [wandb_init.py:init():688] backend started and connected
+ 2026-01-24 10:51:01,791 INFO MainThread:694321 [wandb_init.py:init():783] updated telemetry
+ 2026-01-24 10:51:01,792 INFO MainThread:694321 [wandb_init.py:init():816] communicating run to backend with 90.0 second timeout
+ 2026-01-24 10:51:03,461 INFO MainThread:694321 [wandb_init.py:init():867] starting run threads in backend
+ 2026-01-24 10:51:03,631 INFO MainThread:694321 [wandb_run.py:_console_start():2463] atexit reg
+ 2026-01-24 10:51:03,631 INFO MainThread:694321 [wandb_run.py:_redirect():2311] redirect: wrap_raw
+ 2026-01-24 10:51:03,631 INFO MainThread:694321 [wandb_run.py:_redirect():2376] Wrapping output streams.
+ 2026-01-24 10:51:03,631 INFO MainThread:694321 [wandb_run.py:_redirect():2401] Redirects installed.
+ 2026-01-24 10:51:03,632 INFO MainThread:694321 [wandb_init.py:init():911] run started, returning control to user process
+ 2026-01-24 10:51:03,633 INFO MainThread:694321 [wandb_run.py:_config_callback():1390] config_cb None None {'allow_tf32': True, 'logdir': 'logs', 'mixed_precision': 'bf16', 'num_checkpoint_limit': 5, 'num_epochs': 300, 'pretrained': {'model': './data/StableDiffusion', 'revision': 'main'}, 'prompt_fn': 'imagenet_animals', 'prompt_fn_kwargs': {}, 'resume_from': '', 'reward_fn': 'hpsv2', 'run_name': '2026.01.24_10.51.00', 'sample': {'batch_size': 1, 'eta': 1.0, 'guidance_scale': 5.0, 'num_batches_per_epoch': 2, 'num_steps': 50}, 'save_freq': 20, 'seed': 42, 'train': {'adam_beta1': 0.9, 'adam_beta2': 0.999, 'adam_epsilon': 1e-08, 'adam_weight_decay': 0.0001, 'adv_clip_max': 5, 'batch_size': 1, 'cfg': True, 'clip_range': 0.0001, 'gradient_accumulation_steps': 1, 'learning_rate': 1e-05, 'max_grad_norm': 1.0, 'num_inner_epochs': 1, 'timestep_fraction': 1.0, 'use_8bit_adam': False}, 'use_lora': False}
+ 2026-01-24 11:02:19,491 WARNING MsgRouterThr:694321 [router.py:message_loop():77] message_loop has been closed