lsnu committed
Commit 1759ca7 · verified · parent: aa91438

Add files using upload-large-folder tool

Files changed (38):
  1. README.md +58 -0
  2. REPORT.md +347 -0
  3. artifacts/twin_handover_packed_parallelization_20260309/bootstrap_checkpoints/pi05_base_parallel_packed_from_single/config.json +14 -0
  4. artifacts/twin_handover_packed_parallelization_20260309/bootstrap_checkpoints/pi05_base_parallel_packed_from_single/init_parallel_metadata.json +27 -0
  5. artifacts/twin_handover_packed_parallelization_20260309/bootstrap_checkpoints/pi05_base_single_pytorch/config.json +7 -0
  6. artifacts/twin_handover_packed_parallelization_20260309/environment/gpu_info.txt +10 -0
  7. artifacts/twin_handover_packed_parallelization_20260309/environment/hf_env.txt +3 -0
  8. artifacts/twin_handover_packed_parallelization_20260309/environment/openpi_source_snapshot.txt +5 -0
  9. artifacts/twin_handover_packed_parallelization_20260309/environment/pip_freeze.txt +242 -0
  10. artifacts/twin_handover_packed_parallelization_20260309/environment/python_env.txt +11 -0
  11. artifacts/twin_handover_packed_parallelization_20260309/environment/selected_env_vars.json +1 -0
  12. artifacts/twin_handover_packed_parallelization_20260309/environment/system_info.txt +7 -0
  13. artifacts/twin_handover_packed_parallelization_20260309/environment/workspace_snapshot.txt +49 -0
  14. artifacts/twin_handover_packed_parallelization_20260309/metrics/norm_stats_verification.txt +9 -0
  15. artifacts/twin_handover_packed_parallelization_20260309/metrics/summary.json +318 -0
  16. artifacts/twin_handover_packed_parallelization_20260309/metrics/train_loss_table.csv +11 -0
  17. artifacts/twin_handover_packed_parallelization_20260309/metrics/val_loss_table.csv +5 -0
  18. artifacts/twin_handover_packed_parallelization_20260309/repro/changed_files.txt +15 -0
  19. artifacts/twin_handover_packed_parallelization_20260309/repro/checkpoint_locations.txt +6 -0
  20. artifacts/twin_handover_packed_parallelization_20260309/repro/commands_reproduce.sh +22 -0
  21. artifacts/twin_handover_packed_parallelization_20260309/run_logs/detach_test.log +2 -0
  22. artifacts/twin_handover_packed_parallelization_20260309/run_logs/handover_packed_baseline_2k.log +0 -0
  23. artifacts/twin_handover_packed_parallelization_20260309/run_logs/handover_packed_baseline_2k_val_1000.log +66 -0
  24. artifacts/twin_handover_packed_parallelization_20260309/run_logs/handover_packed_baseline_2k_val_2000.log +114 -0
  25. artifacts/twin_handover_packed_parallelization_20260309/run_logs/handover_packed_parallel_2k.log +0 -0
  26. artifacts/twin_handover_packed_parallelization_20260309/run_logs/handover_packed_parallel_2k_val_1000.log +64 -0
  27. artifacts/twin_handover_packed_parallelization_20260309/run_logs/handover_packed_parallel_2k_val_2000.log +114 -0
  28. artifacts/twin_handover_packed_parallelization_20260309/run_logs/importtime_train_pytorch.log +349 -0
  29. artifacts/twin_handover_packed_parallelization_20260309/run_logs/inspect_twin_packed_batch_handover_train.log +176 -0
  30. artifacts/twin_handover_packed_parallelization_20260309/run_logs/smoke_handover_packed_baseline_20.log +241 -0
  31. artifacts/twin_handover_packed_parallelization_20260309/run_logs/smoke_handover_packed_baseline_20b.log +0 -0
  32. artifacts/twin_handover_packed_parallelization_20260309/run_logs/smoke_handover_packed_baseline_20d.log +34 -0
  33. artifacts/twin_handover_packed_parallelization_20260309/run_logs/smoke_handover_packed_baseline_20e.log +34 -0
  34. artifacts/twin_handover_packed_parallelization_20260309/run_logs/smoke_handover_packed_baseline_20k.log +234 -0
  35. artifacts/twin_handover_packed_parallelization_20260309/run_logs/smoke_handover_packed_baseline_20l.log +141 -0
  36. artifacts/twin_handover_packed_parallelization_20260309/run_logs/smoke_handover_packed_parallel_20a.log +141 -0
  37. artifacts/twin_handover_packed_parallelization_20260309/run_logs/twin_handover_followup.log +37 -0
  38. artifacts/twin_handover_packed_parallelization_20260309/sanity_checks/inspect_twin_packed_batch_handover_train.log +176 -0
README.md ADDED
@@ -0,0 +1,58 @@

# pi0.5 Packed Multi-Arm OpenPI Artifacts

This repo packages a finished initial comparison between:

1. a packed single-head `pi0.5` baseline
2. a packed parallel-head `pi0.5` model with an exact packed warm-start from the single-head checkpoint

The study was run from the checked-out `openpi/` tree on `4x H100 80GB` with `bfloat16`, `2000` optimizer steps per model, verbose startup/debug logging, fixed validation passes, and no raw data reconversion.

## Dataset and packing

- Train repo: `lsnu/twin_handover_256_train`
- Val repo: `lsnu/twin_handover_256_val`
- Original TWIN layout: `[L8, R8]`
- Packed model layout used for both models: `[L8, 0x8, R8, 0x8]`
- Action-loss mask: active dims `[0:8]` and `[16:24]`; padded dims masked out
- Public `16`-dim norm stats were reused; they were not recomputed
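The packing and mask bullets above can be sketched in a few lines of NumPy (an illustrative sketch only; `pack_per_arm` is a hypothetical name, not the repo's `PackPerArmBlocks` transform, and the real pipeline operates on torch tensors):

```python
import numpy as np

def pack_per_arm(x16: np.ndarray) -> np.ndarray:
    """Pack a 16-dim [L8, R8] vector into the 32-dim [L8, 0x8, R8, 0x8] layout."""
    out = np.zeros(x16.shape[:-1] + (32,), dtype=x16.dtype)
    out[..., 0:8] = x16[..., 0:8]     # left arm -> dims 0:8
    out[..., 16:24] = x16[..., 8:16]  # right arm -> dims 16:24
    return out                        # dims 8:16 and 24:32 stay zero (padding)

# Matching action-loss mask: 1.0 on real arm dims, 0.0 on padding.
action_loss_mask = np.concatenate(
    [np.ones(8), np.zeros(8), np.ones(8), np.zeros(8)]
)
```

Applied to a `(batch, 16)` state batch, this yields `(batch, 32)` with exactly the zero-padding pattern the packed-batch sanity check verifies.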

## Headline results

| Model | Val @ 1000 | Val @ 2000 | Train runtime | Peak VRAM |
| --- | ---: | ---: | ---: | ---: |
| Packed baseline | `0.052885` | `0.035776` | `33:27` | `35.23 GB` |
| Packed parallel | `0.051214` | `0.035680` | `30:38` | `35.27 GB` |

The two models tracked closely. In this short run, the packed parallel head finished with a small edge on validation loss while staying within the same memory envelope.

## Repo contents

- `openpi/`
  - modified training/eval code
  - config and transform changes
  - copied norm-stats assets for the new packed configs
  - smoke and main-run checkpoints under `openpi/checkpoints/`
- `artifacts/twin_handover_packed_parallelization_20260309/`
  - `bootstrap_checkpoints/`: single-head PyTorch bootstrap and exact packed parallel warm-start
  - `metrics/`: JSON and CSV summaries
  - `run_logs/`: smoke, train, eval, and follow-up logs
  - `sanity_checks/`: packed-batch inspection output
  - `environment/`: system, GPU, package, HF-tooling, and workspace snapshots
  - `repro/`: changed-file list, checkpoint locations, and rerun commands
- `artifacts/pi05_base_params/`
  - staged base JAX parameter snapshot used for PyTorch conversion

## Key artifact paths

- Full report: `REPORT.md`
- Reproduction commands: `artifacts/twin_handover_packed_parallelization_20260309/repro/commands_reproduce.sh`
- Metrics summary: `artifacts/twin_handover_packed_parallelization_20260309/metrics/summary.json`
- Train loss table: `artifacts/twin_handover_packed_parallelization_20260309/metrics/train_loss_table.csv`
- Val loss table: `artifacts/twin_handover_packed_parallelization_20260309/metrics/val_loss_table.csv`
- Environment snapshot: `artifacts/twin_handover_packed_parallelization_20260309/environment/`

## Notes

- The packed parallel warm-start follows an exact slice/fuse mapping from the single-head checkpoint; the recorded projection diffs sit at bfloat16 rounding level (at most `1.2e-06`, per `init_parallel_metadata.json`).
- Weight loading on both main runs reported `missing=0` and `unexpected=0`.
- The packaged tree intentionally records reproducibility snapshots instead of uploading transient cache state.
REPORT.md ADDED
@@ -0,0 +1,347 @@

# Report: pi0.5 Packed Action-Head Parallelization on TWIN Handover

## Objective

Run the minimum scientifically meaningful comparison between:

1. a packed single-head `pi0.5` baseline
2. a packed parallel-head `pi0.5` model

Both models were fine-tuned on the same converted public TWIN handover dataset with the same training schedule:

- train: `lsnu/twin_handover_256_train`
- val: `lsnu/twin_handover_256_val`
- hardware: `4x H100 80GB`
- precision: `bfloat16`
- global batch size: `16`
- optimizer steps per model: `2000`
- save interval: `250`
- log interval: `10`

## Data layout and packing

The TWIN converted state/action layout is `16` dims in `[L8, R8]`, where each arm is `7` joints plus a gripper. The generic `pi0.5` path right-pads to `32` dims, which does not preserve a semantic left/right split for a naive parallel-head setup.

To keep the experiment minimal and still semantically correct:

- existing public `16`-dim norm stats were reused
- semantic packing happened after normalization in the model transforms
- both models consumed the same packed `32`-dim layout:

```text
[L8, R8] -> [L8, 0x8, R8, 0x8]
```

- the action loss was masked so only the real arm dims contributed:

```text
active dims: [0:8] and [16:24]
masked dims: [8:16] and [24:32]
```
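That masked reduction can be sketched as follows (shown in NumPy for clarity; the helper name `masked_action_loss` and the tensor shapes are illustrative assumptions, not the exact code in `train_pytorch.py`, which operates on torch tensors):

```python
import numpy as np

# 1.0 on real arm dims, 0.0 on the packed padding, matching the split above.
ACTION_LOSS_MASK = np.concatenate(
    [np.ones(8), np.zeros(8), np.ones(8), np.zeros(8)]
)

def masked_action_loss(per_dim_loss: np.ndarray,
                       mask: np.ndarray = ACTION_LOSS_MASK) -> float:
    """Mean of an elementwise loss over active action dims only.

    per_dim_loss: (batch, horizon, 32) array of elementwise loss values.
    """
    batch, horizon, _ = per_dim_loss.shape
    return float((per_dim_loss * mask).sum() / (mask.sum() * batch * horizon))
```

Dividing by `mask.sum()` rather than by `32` keeps the loss scale comparable to an unpadded `16`-dim model.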

The packed-batch sanity check confirmed exact zero padding:

- `state_padded_zero_count: 16 / 16`
- `actions_padded_zero_count: 256 / 256`
- `state_padded_exact_zero: True`
- `actions_padded_exact_zero: True`

Reference log:

- `artifacts/twin_handover_packed_parallelization_20260309/sanity_checks/inspect_twin_packed_batch_handover_train.log`

## Code changes tied to files

The experiment-specific changes are summarized below.

- `openpi/src/openpi/transforms.py`
  - added `PackPerArmBlocks` and `UnpackPerArmBlocks` for semantic TWIN packed training
- `openpi/src/openpi/training/config.py`
  - added packed TWIN model-transform path
  - added `action_loss_mask`
  - added `pi05_twin_handover_256_packed_baseline_pytorch_2k`
  - added `pi05_twin_handover_256_packed_parallel_pytorch_2k`
- `openpi/src/openpi/training/data_loader.py`
  - added `set_epoch`
  - improved local dataset mirror handling and loader startup behavior
- `openpi/src/openpi/models/model.py`
  - made the `pi0_pytorch` import lazy
- `openpi/src/openpi/models/tokenizer.py`
  - made the `AutoProcessor` import lazy
- `openpi/src/openpi/models_pytorch/pi0_pytorch.py`
  - disabled unconditional `sample_actions` `torch.compile` by default
- `openpi/scripts/train_pytorch.py`
  - added startup prints
  - added masked action-loss reduction
  - added first-steps debug prints and periodic runtime/memory logging
  - hardened DDP/checkpoint startup
- `openpi/scripts/eval_twin_val_loss_pytorch.py`
  - added masked validation-loss evaluation with fixed-batch execution
- `openpi/scripts/init_parallel_pi05_from_single_pytorch.py`
  - added exact packed parallel warm-start initialization
- `openpi/scripts/inspect_twin_packed_batch.py`
  - added packed-batch inspection and zero-padding verification
- `openpi/scripts/run_twin_handover_packed_followup.sh`
  - added detached follow-up automation for the remaining train/eval stages
- `openpi/assets/pi05_twin_handover_256_packed_baseline_pytorch_2k/lsnu/twin_handover_256_train/norm_stats.json`
  - copied the existing handover train norm stats for the packed baseline config
- `openpi/assets/pi05_twin_handover_256_packed_parallel_pytorch_2k/lsnu/twin_handover_256_train/norm_stats.json`
  - copied the existing handover train norm stats for the packed parallel config
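The lazy-import changes for `pi0_pytorch` and `AutoProcessor` follow the standard deferred-import pattern, which can be sketched generically (the helper below is illustrative, not the repo's code):

```python
import importlib

_module_cache: dict = {}

def lazy_import(module_name: str):
    """Import a module on first use and cache it, so heavy dependencies
    (e.g. torch or transformers) do not slow down unrelated startup paths."""
    if module_name not in _module_cache:
        _module_cache[module_name] = importlib.import_module(module_name)
    return _module_cache[module_name]
```

Deferring the import this way is what makes the `importtime` numbers in `importtime_train_pytorch.log` worth tracking.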

Reference file list:

- `artifacts/twin_handover_packed_parallelization_20260309/repro/changed_files.txt`

## Commands run

The exact rerun command list is saved in:

- `artifacts/twin_handover_packed_parallelization_20260309/repro/commands_reproduce.sh`

The executed flow was:

1. packed-batch inspection
2. base `pi0.5` JAX-to-PyTorch conversion
3. exact packed parallel warm-start initialization from the single-head PyTorch checkpoint
4. packed baseline training for `2000` steps
5. baseline val at step `1000`
6. baseline val at step `2000`
7. packed parallel training for `2000` steps
8. parallel val at step `1000`
9. parallel val at step `2000`

The parallel training and its validation passes were chained through a detached follow-up runner.

Reference logs:

- `artifacts/twin_handover_packed_parallelization_20260309/run_logs/twin_handover_followup.log`
- `artifacts/twin_handover_packed_parallelization_20260309/run_logs/handover_packed_baseline_2k.log`
- `artifacts/twin_handover_packed_parallelization_20260309/run_logs/handover_packed_parallel_2k.log`

## Startup sanity checks

### Norm stats

The copied norm-stats files were loaded successfully and reported:

- keys: `['actions', 'state']`
- `state_mean_len=16`
- `state_std_len=16`
- `actions_mean_len=16`
- `actions_std_len=16`

Reference:

- `artifacts/twin_handover_packed_parallelization_20260309/metrics/norm_stats_verification.txt`

### Baseline startup summary

Rank-0 startup logging for the packed baseline recorded:

```text
Resolved config name: pi05_twin_handover_256_packed_baseline_pytorch_2k
Dataset repo_id: lsnu/twin_handover_256_train
Norm-stats summary: {'keys': ['actions', 'state'], 'state_mean_len': 16, 'state_std_len': 16, 'actions_mean_len': 16, 'actions_std_len': 16}
Checkpoint source path: /workspace/checkpoints/pi05_base_single_pytorch
Model type: baseline
Packed transforms active: True
Batch size: local=4, global=16
Action-loss mask: (1.0 x8, 0.0 x8, 1.0 x8, 0.0 x8)
Weight loading missing key count: 0
Weight loading unexpected key count: 0
```

The first debug steps also showed:

- `observation.state shape=(4, 32)`
- `actions shape=(4, 16, 32)`
- `state_nonzero_counts_8d_blocks=[32, 0, 32, 0]`
- `action_nonzero_counts_8d_blocks=[512, 0, 512, 0]`
- masked padded dims stayed exactly zero in the batch
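The `*_nonzero_counts_8d_blocks` diagnostics above can be reproduced with a small helper (an illustrative sketch, not the repo's inspection code):

```python
import numpy as np

def nonzero_counts_8d_blocks(x: np.ndarray) -> list:
    """Count nonzero entries in each 8-dim block of the last (32-dim) axis."""
    flat = x.reshape(-1, x.shape[-1])
    return [int(np.count_nonzero(flat[:, i * 8:(i + 1) * 8])) for i in range(4)]
```

For a `(4, 32)` state batch whose real arm dims are all nonzero, this returns `[32, 0, 32, 0]`, matching the logged values.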

### Parallel startup summary

Rank-0 startup logging for the packed parallel run recorded:

```text
Resolved config name: pi05_twin_handover_256_packed_parallel_pytorch_2k
Dataset repo_id: lsnu/twin_handover_256_train
Norm-stats summary: {'keys': ['actions', 'state'], 'state_mean_len': 16, 'state_std_len': 16, 'actions_mean_len': 16, 'actions_std_len': 16}
Checkpoint source path: /workspace/checkpoints/pi05_base_parallel_packed_from_single
Model type: parallel
Packed transforms active: True
Batch size: local=4, global=16
Action-loss mask: (1.0 x8, 0.0 x8, 1.0 x8, 0.0 x8)
Weight loading missing key count: 0
Weight loading unexpected key count: 0
```

The first debug steps matched the expected packed layout:

- `observation.state shape=(4, 32)`
- `actions shape=(4, 16, 32)`
- `state_nonzero_counts_8d_blocks=[32, 0, 32, 0]`
- `action_nonzero_counts_8d_blocks=[512, 0, 512, 0]`

### Smoke tests

All required smoke tests passed before the main runs:

1. `debug_pi05_multiarm_pytorch_smoke`
2. packed-batch inspection on `lsnu/twin_handover_256_train`
3. packed baseline TWIN smoke on `4` GPUs for `20` steps
4. packed parallel TWIN smoke on `4` GPUs for `20` steps

Smoke logs are stored in:

- `artifacts/twin_handover_packed_parallelization_20260309/run_logs/smoke_handover_packed_baseline_20k.log`
- `artifacts/twin_handover_packed_parallelization_20260309/run_logs/smoke_handover_packed_baseline_20l.log`
- `artifacts/twin_handover_packed_parallelization_20260309/run_logs/smoke_handover_packed_parallel_20a.log`

## Warm-start note

The packed parallel warm-start was implemented as an exact slice/fuse mapping from the single-head PyTorch checkpoint:

- input side: split the single-head input projection by packed arm blocks
- fuse side: initialize `arm_token_fuse.weight` as `[I I]`
- output side: split the single-head output projection rows by packed arm blocks

The mapping is exact by construction; the recorded projection round-trip diffs sit at floating-point rounding level (`input_projection_max_abs_diff ≈ 1.19e-06`, `output_projection_max_abs_diff ≈ 9.54e-07` in `init_parallel_metadata.json`), and both the warm-start checkpoint creation and the main-run loading succeeded without missing or unexpected keys.
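The `[I I]` fuse initialization can be sketched at toy scale (a NumPy illustration of the idea only; `width`, `fuse`, and the variable names are assumptions, not the repo's module code):

```python
import numpy as np

width = 4  # toy token width for illustration; the real expert width differs

# arm_token_fuse.weight initialized as [I I]: shape (width, 2 * width).
eye = np.eye(width)
fuse_weight = np.concatenate([eye, eye], axis=1)
fuse_bias = np.zeros(width)

def fuse(left_token: np.ndarray, right_token: np.ndarray) -> np.ndarray:
    """With [I I] and zero bias, the fused token is the sum of the arm tokens."""
    return fuse_weight @ np.concatenate([left_token, right_token]) + fuse_bias
```

This is what makes the warm-start a pure re-parameterization of the single-head projections rather than a fresh random init.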
211
+ What was not done:
212
+
213
+ - no separate numerical equivalence test was run that compared step-0 forward outputs between the single-head and parallel-head models on the same batch
214
+
215
+ Bootstrap checkpoints:
216
+
217
+ - `/workspace/checkpoints/pi05_base_single_pytorch`
218
+ - `/workspace/checkpoints/pi05_base_parallel_packed_from_single`
219
+
220
+ Copies are also staged under:
221
+
222
+ - `artifacts/twin_handover_packed_parallelization_20260309/bootstrap_checkpoints/`
223
+
224
+ ## Results
225
+
226
+ ### Training loss snapshots
227
+
228
+ | Model | Step 250 | Step 500 | Step 1000 | Step 1500 | Step 2000 |
229
+ | --- | ---: | ---: | ---: | ---: | ---: |
230
+ | Baseline loss | `0.1975` | `0.0606` | `0.0245` | `0.0155` | `0.0391` |
231
+ | Baseline smoothed | `0.1166` | `0.0554` | `0.0387` | `0.0331` | `0.0278` |
232
+ | Parallel loss | `0.1894` | `0.0633` | `0.0214` | `0.0155` | `0.0326` |
233
+ | Parallel smoothed | `0.1153` | `0.0565` | `0.0392` | `0.0331` | `0.0270` |
234
+
235
+ ### Validation loss
236
+
237
+ | Model | Checkpoint | Batches | Mean val loss | Std val loss |
238
+ | --- | ---: | ---: | ---: | ---: |
239
+ | Baseline | `1000` | `50` | `0.052885` | `0.032533` |
240
+ | Baseline | `2000` | `100` | `0.035776` | `0.027648` |
241
+ | Parallel | `1000` | `50` | `0.051214` | `0.028985` |
242
+ | Parallel | `2000` | `100` | `0.035680` | `0.026077` |
243
+
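The mean/std columns are per-batch statistics over the fixed validation batches; the reduction amounts to the sketch below (whether the reported std is the population or sample variant is an assumption here, not confirmed by the logs):

```python
import statistics

def summarize_val_losses(batch_losses):
    """Reduce a list of per-batch masked losses to the (mean, std) reported above."""
    return statistics.mean(batch_losses), statistics.pstdev(batch_losses)
```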

### Runtime and memory

| Item | Value |
| --- | --- |
| Pipeline wallclock from baseline launch to final val | `01:32:29` |
| Detached follow-up runner wallclock | `01:17:47` |
| Baseline train runtime | `33:27` |
| Parallel train runtime | `30:38` |
| Baseline val @ 1000 | `00:05:14` |
| Baseline val @ 2000 | `00:05:19` |
| Parallel val @ 1000 | `00:03:23` |
| Parallel val @ 2000 | `00:03:33` |
| Peak baseline VRAM | `35.23 GB` |
| Peak parallel VRAM | `35.27 GB` |

### Interpretation

For this short `2000`-step TWIN handover run, the packed baseline and packed parallel-head models behaved very similarly. The packed parallel-head model ended slightly lower on both validation checkpoints while staying in the same memory range and training cleanly under the same schedule.

This should be treated as an initial profiling run, not a final benchmark claim.

Reference metrics:

- `artifacts/twin_handover_packed_parallelization_20260309/metrics/summary.json`
- `artifacts/twin_handover_packed_parallelization_20260309/metrics/train_loss_table.csv`
- `artifacts/twin_handover_packed_parallelization_20260309/metrics/val_loss_table.csv`

## Checkpoints and logs

### Main-run checkpoints

- Baseline step `1000`:
  - `/workspace/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/handover_packed_baseline_2k/1000`
- Baseline step `2000`:
  - `/workspace/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/handover_packed_baseline_2k/2000`
- Parallel step `1000`:
  - `/workspace/openpi/checkpoints/pi05_twin_handover_256_packed_parallel_pytorch_2k/handover_packed_parallel_2k/1000`
- Parallel step `2000`:
  - `/workspace/openpi/checkpoints/pi05_twin_handover_256_packed_parallel_pytorch_2k/handover_packed_parallel_2k/2000`

The full checkpoint trees, including smoke checkpoints and intermediate saves every `250` steps, are under:

- `openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/`
- `openpi/checkpoints/pi05_twin_handover_256_packed_parallel_pytorch_2k/`

### Bootstrap checkpoints

- `artifacts/twin_handover_packed_parallelization_20260309/bootstrap_checkpoints/pi05_base_single_pytorch/`
- `artifacts/twin_handover_packed_parallelization_20260309/bootstrap_checkpoints/pi05_base_parallel_packed_from_single/`

### Logs

- `artifacts/twin_handover_packed_parallelization_20260309/run_logs/handover_packed_baseline_2k.log`
- `artifacts/twin_handover_packed_parallelization_20260309/run_logs/handover_packed_baseline_2k_val_1000.log`
- `artifacts/twin_handover_packed_parallelization_20260309/run_logs/handover_packed_baseline_2k_val_2000.log`
- `artifacts/twin_handover_packed_parallelization_20260309/run_logs/handover_packed_parallel_2k.log`
- `artifacts/twin_handover_packed_parallelization_20260309/run_logs/handover_packed_parallel_2k_val_1000.log`
- `artifacts/twin_handover_packed_parallelization_20260309/run_logs/handover_packed_parallel_2k_val_2000.log`

## Environment and provenance snapshot

Environment snapshots are stored in:

- `artifacts/twin_handover_packed_parallelization_20260309/environment/system_info.txt`
- `artifacts/twin_handover_packed_parallelization_20260309/environment/gpu_info.txt`
- `artifacts/twin_handover_packed_parallelization_20260309/environment/python_env.txt`
- `artifacts/twin_handover_packed_parallelization_20260309/environment/pip_freeze.txt`
- `artifacts/twin_handover_packed_parallelization_20260309/environment/hf_env.txt`
- `artifacts/twin_handover_packed_parallelization_20260309/environment/selected_env_vars.json`
- `artifacts/twin_handover_packed_parallelization_20260309/environment/workspace_snapshot.txt`
- `artifacts/twin_handover_packed_parallelization_20260309/environment/openpi_source_snapshot.txt`

OpenPI source provenance:

- the packaged `openpi/` tree does not contain a live `.git` directory
- the source clone snapshot is recorded in `openpi_source_snapshot.txt`
- source commit: `aa91438c0c130dcef4ccf378a56f4cf4cffc1310`

## Acceptance criteria status

1. Packed-batch inspection showed raw `16`-dim `[L8, R8]` and packed `32`-dim `[L8, 0x8, R8, 0x8]`: `PASS`
2. Both smoke tests passed on `4` GPUs with finite loss: `PASS`
3. Baseline run started from `/workspace/checkpoints/pi05_base_single_pytorch`: `PASS`
4. Parallel run started from `/workspace/checkpoints/pi05_base_parallel_packed_from_single`: `PASS`
5. Masked loss was active and padded dims were excluded: `PASS`
6. DDP ran without shape/key mismatches: `PASS`
7. Quick val was run at step `1000` for both models: `PASS`
8. Final val was run at step `2000` for both models: `PASS`
9. Both main runs finished under the `10`-hour cap: `PASS`
10. Final bundle includes code, checkpoints, logs, metrics, and environment snapshot: `PASS`

## Final inventory

The artifact bundle at repo root contains:

- all modified training/eval code under `openpi/`
- all baseline and parallel checkpoints under `openpi/checkpoints/`
- both bootstrap checkpoints under `artifacts/.../bootstrap_checkpoints/`
- all train/eval/smoke logs under `artifacts/.../run_logs/`
- metrics tables and summary JSON under `artifacts/.../metrics/`
- reproducibility files under `artifacts/.../repro/`
- environment and provenance snapshot under `artifacts/.../environment/`

This is a complete, rerunnable package for the initial TWIN handover packed action-head parallelization study.
artifacts/twin_handover_packed_parallelization_20260309/bootstrap_checkpoints/pi05_base_parallel_packed_from_single/config.json ADDED
@@ -0,0 +1,14 @@
{
  "action_dim": 32,
  "action_expert_variant": "gemma_300m",
  "action_horizon": 16,
  "arm_action_dims": [
    16,
    16
  ],
  "discrete_state_input": true,
  "dtype": "bfloat16",
  "max_token_len": 200,
  "paligemma_variant": "gemma_2b",
  "pi05": true
}
artifacts/twin_handover_packed_parallelization_20260309/bootstrap_checkpoints/pi05_base_parallel_packed_from_single/init_parallel_metadata.json ADDED
@@ -0,0 +1,27 @@
{
  "config_name": "pi05_twin_handover_256_packed_parallel_pytorch_2k",
  "input_projection_max_abs_diff": 1.1920928955078125e-06,
  "load_state_missing_keys": [
    "paligemma_with_expert.paligemma.model.language_model.embed_tokens.weight",
    "action_in_proj_arms.0.weight",
    "action_in_proj_arms.0.bias",
    "action_in_proj_arms.1.weight",
    "action_in_proj_arms.1.bias",
    "arm_token_fuse.weight",
    "arm_token_fuse.bias",
    "action_out_proj_arms.0.weight",
    "action_out_proj_arms.0.bias",
    "action_out_proj_arms.1.weight",
    "action_out_proj_arms.1.bias"
  ],
  "load_state_unexpected_keys": [
    "action_in_proj.bias",
    "action_in_proj.weight",
    "action_out_proj.bias",
    "action_out_proj.weight"
  ],
  "output_path": "/workspace/checkpoints/pi05_base_parallel_packed_from_single",
  "output_projection_max_abs_diff": 9.5367431640625e-07,
  "single_ckpt": "/workspace/checkpoints/pi05_base_single_pytorch",
  "warm_start_exact": false
}
artifacts/twin_handover_packed_parallelization_20260309/bootstrap_checkpoints/pi05_base_single_pytorch/config.json ADDED
@@ -0,0 +1,7 @@
{
  "action_dim": 32,
  "action_horizon": 16,
  "paligemma_variant": "gemma_2b",
  "action_expert_variant": "gemma_300m",
  "precision": "bfloat16"
}
artifacts/twin_handover_packed_parallelization_20260309/environment/gpu_info.txt ADDED
@@ -0,0 +1,10 @@
timestamp_utc=2026-03-09T02:09:46Z
GPU 0: NVIDIA H100 80GB HBM3 (UUID: GPU-352e04eb-3fa2-0b3b-c24f-5c9567d275af)
GPU 1: NVIDIA H100 80GB HBM3 (UUID: GPU-09e17180-0d03-02d6-53c8-863ebf34f1a0)
GPU 2: NVIDIA H100 80GB HBM3 (UUID: GPU-323a86ac-758a-6993-c4b8-7b0c6cf94b3f)
GPU 3: NVIDIA H100 80GB HBM3 (UUID: GPU-dfccd461-1fa0-0b62-00da-e9abb74fb025)

0, NVIDIA H100 80GB HBM3, 81559 MiB, 580.126.09
1, NVIDIA H100 80GB HBM3, 81559 MiB, 580.126.09
2, NVIDIA H100 80GB HBM3, 81559 MiB, 580.126.09
3, NVIDIA H100 80GB HBM3, 81559 MiB, 580.126.09
artifacts/twin_handover_packed_parallelization_20260309/environment/hf_env.txt ADDED
@@ -0,0 +1,3 @@
timestamp_utc=2026-03-09T02:10:06Z
hf_version=1.6.0
auth_state=Not logged in
artifacts/twin_handover_packed_parallelization_20260309/environment/openpi_source_snapshot.txt ADDED
@@ -0,0 +1,5 @@
timestamp_utc=2026-03-09T02:11:23Z
packaged_openpi_has_git=no
source_clone_path=/workspace/openpi_partial_broken_1773005128
source_commit=aa91438c0c130dcef4ccf378a56f4cf4cffc1310
source_remote=https://huggingface.co/lsnu/pi05tests-openpi-multiarm
artifacts/twin_handover_packed_parallelization_20260309/environment/pip_freeze.txt ADDED
@@ -0,0 +1,242 @@
absl-py==2.3.0
aiohappyeyeballs==2.6.1
aiohttp==3.12.4
aiosignal==1.3.2
annotated-types==0.7.0
antlr4-python3-runtime==4.9.3
asttokens==3.0.0
attrs==25.3.0
augmax==0.4.1
av==14.4.0
beartype==0.19.0
beautifulsoup4==4.13.4
blinker==1.9.0
cachetools==5.5.2
certifi==2025.4.26
cffi==1.17.1
cfgv==3.4.0
charset-normalizer==3.4.2
chex==0.1.89
click==8.2.1
cloudpickle==3.1.1
cmake==4.0.2
comm==0.2.2
contourpy==1.3.2
crc32c==2.7.1
cycler==0.12.1
datasets==3.6.0
debugpy==1.8.14
decorator==5.2.1
deepdiff==8.5.0
diffusers==0.33.1
dill==0.3.8
distlib==0.3.9
dm-control==1.0.14
dm-env==1.6
dm-tree==0.1.9
docker-pycreds==0.4.0
docstring-parser==0.16
donfig==0.8.1.post1
draccus==0.10.0
einops==0.8.1
equinox==0.12.2
etils==1.12.2
evdev==1.9.2
executing==2.2.0
farama-notifications==0.0.4
filelock==3.18.0
flask==3.1.1
flatbuffers==25.2.10
flax==0.10.2
fonttools==4.58.1
frozenlist==1.6.0
fsspec==2025.3.0
gcsfs==2025.3.0
gdown==5.2.0
gitdb==4.0.12
gitpython==3.1.44
glfw==2.9.0
google-api-core==2.24.2
google-auth==2.40.2
google-auth-oauthlib==1.2.2
google-cloud-core==2.4.3
google-cloud-storage==3.1.0
google-crc32c==1.7.1
google-resumable-media==2.7.2
googleapis-common-protos==1.70.0
gym-aloha==0.1.1
gymnasium==0.29.1
h5py==3.13.0
hf-transfer==0.1.9
hf-xet==1.1.2
huggingface-hub==0.32.3
humanize==4.12.3
identify==2.6.12
idna==3.10
imageio==2.37.0
imageio-ffmpeg==0.6.0
importlib-metadata==8.7.0
importlib-resources==6.5.2
iniconfig==2.1.0
inquirerpy==0.3.4
ipykernel==6.29.5
ipython==9.2.0
ipython-pygments-lexers==1.1.1
ipywidgets==8.1.7
itsdangerous==2.2.0
jax==0.5.3
jax-cuda12-pjrt==0.5.3
jax-cuda12-plugin==0.5.3
jaxlib==0.5.3
jaxtyping==0.2.36
jedi==0.19.2
jinja2==3.1.6
jsonlines==4.0.0
jupyter-client==8.6.3
jupyter-core==5.8.1
jupyterlab-widgets==3.0.15
kiwisolver==1.4.8
labmaze==1.0.6
lerobot @ git+https://github.com/huggingface/lerobot@0cf864870cf29f4738d3ade893e6fd13fbd7cdb5
llvmlite==0.44.0
lxml==5.4.0
markdown-it-py==3.0.0
markupsafe==3.0.2
matplotlib==3.10.3
matplotlib-inline==0.1.7
mdurl==0.1.2
mergedeep==1.3.4
ml-collections==1.0.0
ml-dtypes==0.4.1
mpmath==1.3.0
msgpack==1.1.0
mujoco==2.3.7
multidict==6.4.4
multiprocess==0.70.16
mypy-extensions==1.1.0
nest-asyncio==1.6.0
networkx==3.5
nodeenv==1.9.1
numba==0.61.2
numcodecs==0.16.1
numpy==1.26.4
numpydantic==1.6.9
nvidia-cublas-cu12==12.6.4.1
nvidia-cuda-cupti-cu12==12.6.80
nvidia-cuda-nvcc-cu12==12.9.41
nvidia-cuda-nvrtc-cu12==12.6.77
nvidia-cuda-runtime-cu12==12.6.77
nvidia-cudnn-cu12==9.5.1.17
nvidia-cufft-cu12==11.3.0.4
nvidia-cufile-cu12==1.11.1.6
nvidia-curand-cu12==10.3.7.77
nvidia-cusolver-cu12==11.7.1.2
nvidia-cusparse-cu12==12.5.4.2
nvidia-cusparselt-cu12==0.6.3
nvidia-ml-py==12.575.51
nvidia-nccl-cu12==2.26.2
nvidia-nvjitlink-cu12==12.6.85
nvidia-nvtx-cu12==12.6.77
oauthlib==3.2.2
omegaconf==2.3.0
opencv-python==4.11.0.86
opencv-python-headless==4.11.0.86
-e file:///workspace/pi05tests-openpi-multiarm/openpi
-e file:///workspace/pi05tests-openpi-multiarm/openpi/packages/openpi-client
opt-einsum==3.4.0
optax==0.2.4
orbax-checkpoint==0.11.13
orderly-set==5.4.1
packaging==25.0
pandas==2.2.3
parso==0.8.4
pexpect==4.9.0
pfzy==0.3.4
pillow==11.2.1
platformdirs==4.3.8
pluggy==1.6.0
polars==1.30.0
pre-commit==4.2.0
prompt-toolkit==3.0.51
propcache==0.3.1
proto-plus==1.26.1
protobuf==4.25.8
psutil==7.0.0
ptyprocess==0.7.0
pure-eval==0.2.3
pyarrow==20.0.0
pyasn1==0.6.1
pyasn1-modules==0.4.2
pycparser==2.22
pydantic==2.11.5
pydantic-core==2.33.2
pygments==2.19.1
pymunk==7.0.0
pynput==1.8.1
pynvml==12.0.0
pyopengl==3.1.9
pyparsing==3.2.3
pysocks==1.7.1
pytest==8.3.5
python-dateutil==2.9.0.post0
python-xlib==0.33
pytz==2025.2
pyyaml==6.0.2
pyyaml-include==1.4.1
pyzmq==26.4.0
regex==2024.11.6
requests==2.32.3
requests-oauthlib==2.0.0
rerun-sdk==0.23.1
rich==14.0.0
rsa==4.9.1
ruff==0.11.12
safetensors==0.5.3
scipy==1.15.3
sentencepiece==0.2.0
sentry-sdk==2.29.1
setproctitle==1.3.6
setuptools==80.9.0
shtab==1.7.2
simplejson==3.20.1
six==1.17.0
smmap==5.0.2
soupsieve==2.7
stack-data==0.6.3
svgwrite==1.4.3
sympy==1.14.0
tensorstore==0.1.74
termcolor==3.1.0
tokenizers==0.21.1
toml==0.10.2
toolz==1.0.0
torch==2.7.1
torchcodec==0.4.0
torchvision==0.22.1
tornado==6.5.1
tqdm==4.67.1
tqdm-loggable==0.2
traitlets==5.14.3
transformers==4.53.2
tree==0.2.4
treescope==0.1.9
triton==3.3.1
typeguard==4.4.2
typing-extensions==4.13.2
typing-inspect==0.9.0
typing-inspection==0.4.1
228
+ tyro==0.9.22
229
+ tzdata==2025.2
230
+ urllib3==2.4.0
231
+ virtualenv==20.31.2
232
+ wadler-lindig==0.1.6
233
+ wandb==0.19.11
234
+ wcwidth==0.2.13
235
+ websockets==15.0.1
236
+ werkzeug==3.1.3
237
+ widgetsnbextension==4.0.14
238
+ wrapt==1.14.1
239
+ xxhash==3.5.0
240
+ yarl==1.20.0
241
+ zarr==3.0.8
242
+ zipp==3.22.0
artifacts/twin_handover_packed_parallelization_20260309/environment/python_env.txt ADDED
@@ -0,0 +1,11 @@
+ timestamp_utc=2026-03-09T02:09:46Z
+ Python 3.11.10
+ /workspace/pi05tests-openpi-multiarm/openpi/.venv/bin/python
+ /usr/local/bin/uv
+ /workspace/pi05tests-openpi-multiarm/openpi/.venv/bin/python
+ /workspace/pi05tests-openpi-multiarm/openpi/.venv
+
+ torch=2.7.1+cu126
+ cuda=12.6
+ cudnn=90501
+ huggingface_hub=0.32.3
artifacts/twin_handover_packed_parallelization_20260309/environment/selected_env_vars.json ADDED
@@ -0,0 +1 @@
+ {}
artifacts/twin_handover_packed_parallelization_20260309/environment/system_info.txt ADDED
@@ -0,0 +1,7 @@
+ timestamp_utc=2026-03-09T02:08:36Z
+ hostname=9e9e564d5d6e
+ uname=Linux 9e9e564d5d6e 6.8.0-90-generic #91-Ubuntu SMP PREEMPT_DYNAMIC Tue Nov 18 14:14:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
+ python=Python 3.11.10
+ uv=uv 0.10.9
+ torch=2.7.1+cu126
+ huggingface_hub=0.32.3
artifacts/twin_handover_packed_parallelization_20260309/environment/workspace_snapshot.txt ADDED
@@ -0,0 +1,49 @@
+ timestamp_utc=2026-03-09T02:10:07Z
+
+ Top-level /workspace contents:
+ /workspace/.codex
+ /workspace/.hf
+ /workspace/.local
+ /workspace/bin
+ /workspace/checkpoints
+ /workspace/codex-env.sh
+ /workspace/lerobot
+ /workspace/openpi
+ /workspace/openpi_partial_broken_1773005128
+ /workspace/pi05tests-openpi-multiarm
+ /workspace/run_logs
+
+ Top-level packaged repo contents:
+ /workspace/pi05tests-openpi-multiarm/.cache
+ /workspace/pi05tests-openpi-multiarm/.cache/huggingface
+ /workspace/pi05tests-openpi-multiarm/artifacts
+ /workspace/pi05tests-openpi-multiarm/artifacts/pi05_base_params
+ /workspace/pi05tests-openpi-multiarm/artifacts/twin_handover_packed_parallelization_20260309
+ /workspace/pi05tests-openpi-multiarm/openpi
+ /workspace/pi05tests-openpi-multiarm/openpi/.dockerignore
+ /workspace/pi05tests-openpi-multiarm/openpi/.github
+ /workspace/pi05tests-openpi-multiarm/openpi/.gitignore
+ /workspace/pi05tests-openpi-multiarm/openpi/.gitmodules
+ /workspace/pi05tests-openpi-multiarm/openpi/.pre-commit-config.yaml
+ /workspace/pi05tests-openpi-multiarm/openpi/.python-version
+ /workspace/pi05tests-openpi-multiarm/openpi/.venv
+ /workspace/pi05tests-openpi-multiarm/openpi/.venv_partial_1773006322
+ /workspace/pi05tests-openpi-multiarm/openpi/.vscode
+ /workspace/pi05tests-openpi-multiarm/openpi/CONTRIBUTING.md
+ /workspace/pi05tests-openpi-multiarm/openpi/LICENSE
+ /workspace/pi05tests-openpi-multiarm/openpi/LICENSE_GEMMA.txt
+ /workspace/pi05tests-openpi-multiarm/openpi/README.md
+ /workspace/pi05tests-openpi-multiarm/openpi/assets
+ /workspace/pi05tests-openpi-multiarm/openpi/checkpoints
+ /workspace/pi05tests-openpi-multiarm/openpi/docs
+ /workspace/pi05tests-openpi-multiarm/openpi/examples
+ /workspace/pi05tests-openpi-multiarm/openpi/packages
+ /workspace/pi05tests-openpi-multiarm/openpi/pyproject.toml
+ /workspace/pi05tests-openpi-multiarm/openpi/scripts
+ /workspace/pi05tests-openpi-multiarm/openpi/src
+ /workspace/pi05tests-openpi-multiarm/openpi/uv.lock
+
+ Selected sizes:
+ 410G /workspace/pi05tests-openpi-multiarm
+ 2.9M /workspace/checkpoints/pi05_base_single_pytorch
+ 2.9M /workspace/checkpoints/pi05_base_parallel_packed_from_single
artifacts/twin_handover_packed_parallelization_20260309/metrics/norm_stats_verification.txt ADDED
@@ -0,0 +1,9 @@
+ path=/workspace/openpi/assets/pi05_twin_handover_256_packed_baseline_pytorch_2k/lsnu/twin_handover_256_train/norm_stats.json
+ keys=[actions,state]
+ state_mean_len=16 state_std_len=16
+ action_mean_len=16 action_std_len=16
+ ---
+ path=/workspace/openpi/assets/pi05_twin_handover_256_packed_parallel_pytorch_2k/lsnu/twin_handover_256_train/norm_stats.json
+ keys=[actions,state]
+ state_mean_len=16 state_std_len=16
+ action_mean_len=16 action_std_len=16
artifacts/twin_handover_packed_parallelization_20260309/metrics/summary.json ADDED
@@ -0,0 +1,318 @@
+ {
+ "train": {
+ "baseline": {
+ "steps": {
+ "250": {
+ "loss": 0.1975,
+ "smoothed_loss": 0.1166,
+ "lr": "2.50e-05",
+ "grad_norm": 1.0523,
+ "max_cuda_memory": "35.23GB"
+ },
+ "500": {
+ "loss": 0.0606,
+ "smoothed_loss": 0.0554,
+ "lr": "2.35e-05",
+ "grad_norm": 1.021,
+ "max_cuda_memory": "35.23GB"
+ },
+ "1000": {
+ "loss": 0.0245,
+ "smoothed_loss": 0.0387,
+ "lr": "1.58e-05",
+ "grad_norm": 1.0163,
+ "max_cuda_memory": "35.23GB"
+ },
+ "1500": {
+ "loss": 0.0155,
+ "smoothed_loss": 0.0331,
+ "lr": "6.60e-06",
+ "grad_norm": 0.7702,
+ "max_cuda_memory": "35.23GB"
+ },
+ "2000": {
+ "loss": 0.0391,
+ "smoothed_loss": 0.0278,
+ "lr": "2.50e-06",
+ "grad_norm": 0.7445,
+ "max_cuda_memory": "35.23GB"
+ }
+ },
+ "startup": {
+ "config_name": "pi05_twin_handover_256_packed_baseline_pytorch_2k",
+ "dataset_repo_id": "lsnu/twin_handover_256_train",
+ "norm_stats_file": "/workspace/pi05tests-openpi-multiarm/openpi/assets/pi05_twin_handover_256_packed_baseline_pytorch_2k/lsnu/twin_handover_256_train/norm_stats.json",
+ "checkpoint_source": "/workspace/checkpoints/pi05_base_single_pytorch",
+ "model_type": "baseline",
+ "packed_transforms": "True",
+ "world_size": "4",
+ "batch_size": "local=4, global=16",
+ "num_workers": "8",
+ "precision": "bfloat16",
+ "lr_schedule": "warmup_steps=200, peak_lr=2.50e-05, decay_steps=2000, decay_lr=2.50e-06",
+ "save_log_intervals": "save_interval=250, log_interval=10",
+ "action_loss_mask": "(1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)",
+ "active_mask_dims": "[0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23]",
+ "masked_dims": "[8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31]",
+ "weight_missing_count": "0",
+ "weight_unexpected_count": "0",
+ "weight_missing_keys": "set()",
+ "weight_unexpected_keys": "[]"
+ },
+ "saves": {
+ "250": {
+ "timestamp": "00:25:28.986",
+ "path": "/workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/handover_packed_baseline_2k/250"
+ },
+ "500": {
+ "timestamp": "00:29:40.355",
+ "path": "/workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/handover_packed_baseline_2k/500"
+ },
+ "750": {
+ "timestamp": "00:35:01.426",
+ "path": "/workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/handover_packed_baseline_2k/750"
+ },
+ "1000": {
+ "timestamp": "00:39:27.037",
+ "path": "/workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/handover_packed_baseline_2k/1000"
+ },
+ "1250": {
+ "timestamp": "00:43:25.467",
+ "path": "/workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/handover_packed_baseline_2k/1250"
+ },
+ "1500": {
+ "timestamp": "00:47:39.593",
+ "path": "/workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/handover_packed_baseline_2k/1500"
+ },
+ "1750": {
+ "timestamp": "00:51:38.690",
+ "path": "/workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/handover_packed_baseline_2k/1750"
+ },
+ "2000": {
+ "timestamp": "00:55:30.655",
+ "path": "/workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/handover_packed_baseline_2k/2000"
+ }
+ },
+ "runtime": "33:27"
+ },
+ "parallel": {
+ "steps": {
+ "250": {
+ "loss": 0.1894,
+ "smoothed_loss": 0.1153,
+ "lr": "2.50e-05",
+ "grad_norm": 1.0751,
+ "max_cuda_memory": "35.27GB"
+ },
+ "500": {
+ "loss": 0.0633,
+ "smoothed_loss": 0.0565,
+ "lr": "2.35e-05",
+ "grad_norm": 1.001,
+ "max_cuda_memory": "35.27GB"
+ },
+ "1000": {
+ "loss": 0.0214,
+ "smoothed_loss": 0.0392,
+ "lr": "1.58e-05",
+ "grad_norm": 0.9669,
+ "max_cuda_memory": "35.27GB"
+ },
+ "1500": {
+ "loss": 0.0155,
+ "smoothed_loss": 0.0331,
+ "lr": "6.60e-06",
+ "grad_norm": 0.7305,
+ "max_cuda_memory": "35.27GB"
+ },
+ "2000": {
+ "loss": 0.0326,
+ "smoothed_loss": 0.027,
+ "lr": "2.50e-06",
+ "grad_norm": 0.735,
+ "max_cuda_memory": "35.27GB"
+ }
+ },
+ "startup": {
+ "config_name": "pi05_twin_handover_256_packed_parallel_pytorch_2k",
+ "dataset_repo_id": "lsnu/twin_handover_256_train",
+ "norm_stats_file": "/workspace/pi05tests-openpi-multiarm/openpi/assets/pi05_twin_handover_256_packed_parallel_pytorch_2k/lsnu/twin_handover_256_train/norm_stats.json",
+ "checkpoint_source": "/workspace/checkpoints/pi05_base_parallel_packed_from_single",
+ "model_type": "parallel",
+ "packed_transforms": "True",
+ "world_size": "4",
+ "batch_size": "local=4, global=16",
+ "num_workers": "8",
+ "precision": "bfloat16",
+ "lr_schedule": "warmup_steps=200, peak_lr=2.50e-05, decay_steps=2000, decay_lr=2.50e-06",
+ "save_log_intervals": "save_interval=250, log_interval=10",
+ "action_loss_mask": "(1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)",
+ "active_mask_dims": "[0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23]",
+ "masked_dims": "[8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31]",
+ "weight_missing_count": "0",
+ "weight_unexpected_count": "0",
+ "weight_missing_keys": "set()",
+ "weight_unexpected_keys": "[]"
+ },
+ "saves": {
+ "250": {
+ "timestamp": "01:14:12.456",
+ "path": "/workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_parallel_pytorch_2k/handover_packed_parallel_2k/250"
+ },
+ "500": {
+ "timestamp": "01:18:40.916",
+ "path": "/workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_parallel_pytorch_2k/handover_packed_parallel_2k/500"
+ },
+ "750": {
+ "timestamp": "01:22:49.479",
+ "path": "/workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_parallel_pytorch_2k/handover_packed_parallel_2k/750"
+ },
+ "1000": {
+ "timestamp": "01:26:47.884",
+ "path": "/workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_parallel_pytorch_2k/handover_packed_parallel_2k/1000"
+ },
+ "1250": {
+ "timestamp": "01:30:56.356",
+ "path": "/workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_parallel_pytorch_2k/handover_packed_parallel_2k/1250"
+ },
+ "1500": {
+ "timestamp": "01:34:31.362",
+ "path": "/workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_parallel_pytorch_2k/handover_packed_parallel_2k/1500"
+ },
+ "1750": {
+ "timestamp": "01:38:21.550",
+ "path": "/workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_parallel_pytorch_2k/handover_packed_parallel_2k/1750"
+ },
+ "2000": {
+ "timestamp": "01:42:18.699",
+ "path": "/workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_parallel_pytorch_2k/handover_packed_parallel_2k/2000"
+ }
+ },
+ "runtime": "30:38"
+ }
+ },
+ "val": {
+ "baseline_1000": {
+ "checkpoint_path": "/workspace/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/handover_packed_baseline_2k/1000",
+ "repo_id_used": "lsnu/twin_handover_256_val",
+ "num_batches": 50,
+ "mean_val_loss": 0.052885,
+ "std_val_loss": 0.032533,
+ "timing": "mean=0.3108 std=0.1375 min=0.2230 max=1.1986",
+ "active_mask_dims": "[0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23]",
+ "masked_dims": "[8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31]",
+ "weight_loading_missing_keys": "[]",
+ "weight_loading_unexpected_keys": "[]"
+ },
+ "baseline_2000": {
+ "checkpoint_path": "/workspace/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/handover_packed_baseline_2k/2000",
+ "repo_id_used": "lsnu/twin_handover_256_val",
+ "num_batches": 100,
+ "mean_val_loss": 0.035776,
+ "std_val_loss": 0.027648,
+ "timing": "mean=0.2587 std=0.1111 min=0.2224 max=1.2881",
+ "active_mask_dims": "[0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23]",
+ "masked_dims": "[8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31]",
+ "weight_loading_missing_keys": "[]",
+ "weight_loading_unexpected_keys": "[]"
+ },
+ "parallel_1000": {
+ "checkpoint_path": "/workspace/openpi/checkpoints/pi05_twin_handover_256_packed_parallel_pytorch_2k/handover_packed_parallel_2k/1000",
+ "repo_id_used": "lsnu/twin_handover_256_val",
+ "num_batches": 50,
+ "mean_val_loss": 0.051214,
+ "std_val_loss": 0.028985,
+ "timing": "mean=0.2468 std=0.0900 min=0.2211 max=0.8606",
+ "active_mask_dims": "[0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23]",
+ "masked_dims": "[8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31]",
+ "weight_loading_missing_keys": "[]",
+ "weight_loading_unexpected_keys": "[]"
+ },
+ "parallel_2000": {
+ "checkpoint_path": "/workspace/openpi/checkpoints/pi05_twin_handover_256_packed_parallel_pytorch_2k/handover_packed_parallel_2k/2000",
+ "repo_id_used": "lsnu/twin_handover_256_val",
+ "num_batches": 100,
+ "mean_val_loss": 0.03568,
+ "std_val_loss": 0.026077,
+ "timing": "mean=0.2366 std=0.0593 min=0.2215 max=0.8235",
+ "active_mask_dims": "[0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23]",
+ "masked_dims": "[8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31]",
+ "weight_loading_missing_keys": "[]",
+ "weight_loading_unexpected_keys": "[]"
+ }
+ },
+ "wallclock": {
+ "followup_start_utc": "2026-03-09 00:31:32 UTC",
+ "followup_end_utc": "2026-03-09 01:49:19 UTC",
+ "pipeline_wallclock_from_baseline_start_to_final_val": "01:32:29",
+ "followup_runner_wallclock": "01:17:47",
+ "baseline_train_runtime": "33:27",
+ "parallel_train_runtime": "30:38",
+ "baseline_val_1000_runtime": "00:05:14",
+ "baseline_val_2000_runtime": "00:05:19",
+ "parallel_val_1000_runtime": "00:03:23",
+ "parallel_val_2000_runtime": "00:03:33"
+ },
+ "changed_files": [
+ {
+ "path": "openpi/src/openpi/transforms.py",
+ "description": "added PackPerArmBlocks and UnpackPerArmBlocks for semantic TWIN per-arm block packing"
+ },
+ {
+ "path": "openpi/src/openpi/training/config.py",
+ "description": "added packed TWIN model transforms, action_loss_mask, and 2K baseline/parallel configs"
+ },
+ {
+ "path": "openpi/src/openpi/training/data_loader.py",
+ "description": "added set_epoch and local dataset mirror handling / loader startup fixes"
+ },
+ {
+ "path": "openpi/src/openpi/models/model.py",
+ "description": "made pi0_pytorch import lazy"
+ },
+ {
+ "path": "openpi/src/openpi/models/tokenizer.py",
+ "description": "made AutoProcessor import lazy"
+ },
+ {
+ "path": "openpi/src/openpi/models_pytorch/pi0_pytorch.py",
+ "description": "disabled unconditional sample_actions torch.compile by default"
+ },
+ {
+ "path": "openpi/scripts/train_pytorch.py",
+ "description": "added startup logging, masked action loss, debug logging, and DDP/startup fixes"
+ },
+ {
+ "path": "openpi/scripts/eval_twin_val_loss_pytorch.py",
+ "description": "added masked val loss evaluation with configurable batches/workers and startup prints"
+ },
+ {
+ "path": "openpi/scripts/init_parallel_pi05_from_single_pytorch.py",
+ "description": "added exact packed parallel warm-start initialization from single-head checkpoint"
+ },
+ {
+ "path": "openpi/scripts/inspect_twin_packed_batch.py",
+ "description": "added packed batch inspection / zero-padding verification"
+ },
+ {
+ "path": "openpi/scripts/run_twin_handover_packed_followup.sh",
+ "description": "added detached follow-up automation for val passes and parallel launch"
+ },
+ {
+ "path": "openpi/assets/pi05_twin_handover_256_packed_baseline_pytorch_2k/lsnu/twin_handover_256_train/norm_stats.json",
+ "description": "copied handover train norm stats for packed baseline config"
+ },
+ {
+ "path": "openpi/assets/pi05_twin_handover_256_packed_parallel_pytorch_2k/lsnu/twin_handover_256_train/norm_stats.json",
+ "description": "copied handover train norm stats for packed parallel config"
+ },
+ {
+ "path": "README.md",
+ "description": "new repo-level experiment summary for the uploaded artifact bundle"
+ },
+ {
+ "path": "REPORT.md",
+ "description": "new detailed experiment report tying outcomes to code and artifacts"
+ }
+ ]
+ }
artifacts/twin_handover_packed_parallelization_20260309/metrics/train_loss_table.csv ADDED
@@ -0,0 +1,11 @@
+ model,step,loss,smoothed_loss,lr,grad_norm,max_cuda_memory
+ baseline,250,0.1975,0.1166,2.50e-05,1.0523,35.23GB
+ baseline,500,0.0606,0.0554,2.35e-05,1.021,35.23GB
+ baseline,1000,0.0245,0.0387,1.58e-05,1.0163,35.23GB
+ baseline,1500,0.0155,0.0331,6.60e-06,0.7702,35.23GB
+ baseline,2000,0.0391,0.0278,2.50e-06,0.7445,35.23GB
+ parallel,250,0.1894,0.1153,2.50e-05,1.0751,35.27GB
+ parallel,500,0.0633,0.0565,2.35e-05,1.001,35.27GB
+ parallel,1000,0.0214,0.0392,1.58e-05,0.9669,35.27GB
+ parallel,1500,0.0155,0.0331,6.60e-06,0.7305,35.27GB
+ parallel,2000,0.0326,0.027,2.50e-06,0.735,35.27GB
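The baseline-vs-parallel comparison in this table reduces to per-step differences of the smoothed loss. A minimal sketch using a hard-coded subset of the rows above (an illustrative excerpt, not the full CSV):

```python
import csv
import io

# Excerpt of train_loss_table.csv (model, step, loss, smoothed_loss).
CSV_TEXT = """model,step,loss,smoothed_loss
baseline,1000,0.0245,0.0387
parallel,1000,0.0214,0.0392
baseline,2000,0.0391,0.0278
parallel,2000,0.0326,0.027
"""

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
by_model: dict[str, dict[int, float]] = {}
for r in rows:
    by_model.setdefault(r["model"], {})[int(r["step"])] = float(r["smoothed_loss"])

# Smoothed-loss gap (baseline minus parallel) at each shared step.
deltas = {
    step: round(by_model["baseline"][step] - by_model["parallel"][step], 4)
    for step in sorted(by_model["baseline"])
}
print(deltas)  # → {1000: -0.0005, 2000: 0.0008}
```

The gaps are within noise at these steps, consistent with the two runs tracking each other closely.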
artifacts/twin_handover_packed_parallelization_20260309/metrics/val_loss_table.csv ADDED
@@ -0,0 +1,5 @@
+ model,checkpoint_step,num_batches,mean_val_loss,std_val_loss,timing
+ baseline,1000,50,0.052885,0.032533,mean=0.3108 std=0.1375 min=0.2230 max=1.1986
+ baseline,2000,100,0.035776,0.027648,mean=0.2587 std=0.1111 min=0.2224 max=1.2881
+ parallel,1000,50,0.051214,0.028985,mean=0.2468 std=0.0900 min=0.2211 max=0.8606
+ parallel,2000,100,0.03568,0.026077,mean=0.2366 std=0.0593 min=0.2215 max=0.8235
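The `mean_val_loss` / `std_val_loss` columns are aggregates over per-batch eval losses like those in the run logs. A minimal sketch, assuming a population (not sample) standard deviation; the actual eval script may compute it differently:

```python
import math

def summarize_losses(losses: list[float]) -> tuple[float, float]:
    """Mean and population standard deviation over per-batch eval losses,
    mirroring the mean_val_loss / std_val_loss columns above."""
    mean = sum(losses) / len(losses)
    var = sum((x - mean) ** 2 for x in losses) / len(losses)
    return mean, math.sqrt(var)

# Illustrative per-batch losses (not the actual 50-batch run).
mean, std = summarize_losses([0.031, 0.016, 0.019, 0.059, 0.039])
print(f"mean={mean:.6f} std={std:.6f}")
```

At step 2000 over 100 batches, both models land at essentially the same mean val loss (0.035776 vs 0.03568), with the parallel run slightly tighter.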
artifacts/twin_handover_packed_parallelization_20260309/repro/changed_files.txt ADDED
@@ -0,0 +1,15 @@
+ openpi/src/openpi/transforms.py added PackPerArmBlocks and UnpackPerArmBlocks for semantic TWIN per-arm block packing
+ openpi/src/openpi/training/config.py added packed TWIN model transforms, action_loss_mask, and 2K baseline/parallel configs
+ openpi/src/openpi/training/data_loader.py added set_epoch and local dataset mirror handling / loader startup fixes
+ openpi/src/openpi/models/model.py made pi0_pytorch import lazy
+ openpi/src/openpi/models/tokenizer.py made AutoProcessor import lazy
+ openpi/src/openpi/models_pytorch/pi0_pytorch.py disabled unconditional sample_actions torch.compile by default
+ openpi/scripts/train_pytorch.py added startup logging, masked action loss, debug logging, and DDP/startup fixes
+ openpi/scripts/eval_twin_val_loss_pytorch.py added masked val loss evaluation with configurable batches/workers and startup prints
+ openpi/scripts/init_parallel_pi05_from_single_pytorch.py added exact packed parallel warm-start initialization from single-head checkpoint
+ openpi/scripts/inspect_twin_packed_batch.py added packed batch inspection / zero-padding verification
+ openpi/scripts/run_twin_handover_packed_followup.sh added detached follow-up automation for val passes and parallel launch
+ openpi/assets/pi05_twin_handover_256_packed_baseline_pytorch_2k/lsnu/twin_handover_256_train/norm_stats.json copied handover train norm stats for packed baseline config
+ openpi/assets/pi05_twin_handover_256_packed_parallel_pytorch_2k/lsnu/twin_handover_256_train/norm_stats.json copied handover train norm stats for packed parallel config
+ README.md new repo-level experiment summary for the uploaded artifact bundle
+ REPORT.md new detailed experiment report tying outcomes to code and artifacts
artifacts/twin_handover_packed_parallelization_20260309/repro/checkpoint_locations.txt ADDED
@@ -0,0 +1,6 @@
+ /workspace/checkpoints/pi05_base_single_pytorch
+ /workspace/checkpoints/pi05_base_parallel_packed_from_single
+ /workspace/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/handover_packed_baseline_2k/1000
+ /workspace/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/handover_packed_baseline_2k/2000
+ /workspace/openpi/checkpoints/pi05_twin_handover_256_packed_parallel_pytorch_2k/handover_packed_parallel_2k/1000
+ /workspace/openpi/checkpoints/pi05_twin_handover_256_packed_parallel_pytorch_2k/handover_packed_parallel_2k/2000
artifacts/twin_handover_packed_parallelization_20260309/repro/commands_reproduce.sh ADDED
@@ -0,0 +1,22 @@
+ #!/usr/bin/env bash
+ set -euo pipefail
+ export HF_HOME=/workspace/.hf
+ export HF_HUB_CACHE=/workspace/.hf/hub
+ export HF_DATASETS_CACHE=/workspace/.hf/datasets
+ export HUGGINGFACE_HUB_CACHE=/workspace/.hf/hub
+ export XDG_CACHE_HOME=/workspace/.cache
+ export OPENPI_LEROBOT_HOME=/workspace/lerobot
+ export OPENPI_TORCH_COMPILE_SAMPLE_ACTIONS=0
+ export TOKENIZERS_PARALLELISM=false
+ export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
+ cd /workspace/openpi
+ source .venv/bin/activate
+ python scripts/inspect_twin_packed_batch.py --config_name pi05_twin_handover_256_packed_baseline_pytorch_2k --repo_id lsnu/twin_handover_256_train
+ python examples/convert_jax_model_to_pytorch.py --checkpoint_dir /workspace/pi05tests-openpi-multiarm/artifacts/pi05_base_params --config_name pi05_twin_handover_256_packed_baseline_pytorch_2k --output_path /workspace/checkpoints/pi05_base_single_pytorch --precision bfloat16
+ python scripts/init_parallel_pi05_from_single_pytorch.py --single_ckpt /workspace/checkpoints/pi05_base_single_pytorch --config_name pi05_twin_handover_256_packed_parallel_pytorch_2k --output_path /workspace/checkpoints/pi05_base_parallel_packed_from_single
+ torchrun --standalone --nproc_per_node=4 scripts/train_pytorch.py pi05_twin_handover_256_packed_baseline_pytorch_2k --exp_name handover_packed_baseline_2k --overwrite
+ python scripts/eval_twin_val_loss_pytorch.py --config_name pi05_twin_handover_256_packed_baseline_pytorch_2k --checkpoint_dir /workspace/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/handover_packed_baseline_2k/1000 --repo_id lsnu/twin_handover_256_val --num_batches 50 --num_workers 0
+ python scripts/eval_twin_val_loss_pytorch.py --config_name pi05_twin_handover_256_packed_baseline_pytorch_2k --checkpoint_dir /workspace/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/handover_packed_baseline_2k/2000 --repo_id lsnu/twin_handover_256_val --num_batches 100 --num_workers 0
+ torchrun --standalone --nproc_per_node=4 scripts/train_pytorch.py pi05_twin_handover_256_packed_parallel_pytorch_2k --exp_name handover_packed_parallel_2k --overwrite
+ python scripts/eval_twin_val_loss_pytorch.py --config_name pi05_twin_handover_256_packed_parallel_pytorch_2k --checkpoint_dir /workspace/openpi/checkpoints/pi05_twin_handover_256_packed_parallel_pytorch_2k/handover_packed_parallel_2k/1000 --repo_id lsnu/twin_handover_256_val --num_batches 50 --num_workers 0
+ python scripts/eval_twin_val_loss_pytorch.py --config_name pi05_twin_handover_256_packed_parallel_pytorch_2k --checkpoint_dir /workspace/openpi/checkpoints/pi05_twin_handover_256_packed_parallel_pytorch_2k/handover_packed_parallel_2k/2000 --repo_id lsnu/twin_handover_256_val --num_batches 100 --num_workers 0
artifacts/twin_handover_packed_parallelization_20260309/run_logs/detach_test.log ADDED
@@ -0,0 +1,2 @@
+ hi
+ bye
artifacts/twin_handover_packed_parallelization_20260309/run_logs/handover_packed_baseline_2k.log ADDED
The diff for this file is too large to render. See raw diff
artifacts/twin_handover_packed_parallelization_20260309/run_logs/handover_packed_baseline_2k_val_1000.log ADDED
@@ -0,0 +1,66 @@
+ starting_eval config=pi05_twin_handover_256_packed_baseline_pytorch_2k checkpoint=/workspace/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/handover_packed_baseline_2k/1000 repo_id=lsnu/twin_handover_256_val
+ eval_loader batch_size=16 num_batches=50 num_workers=0
+
+
+ weight_loading missing=0 unexpected=0 device=cuda:0
+ eval_batch=1 loss=0.031020 batch_time_s=1.1986
+ eval_batch=2 loss=0.016421 batch_time_s=0.2400
+ eval_batch=3 loss=0.019009 batch_time_s=0.2371
+ eval_batch=4 loss=0.058900 batch_time_s=0.2230
+ eval_batch=5 loss=0.039465 batch_time_s=0.2257
+ eval_batch=6 loss=0.061871 batch_time_s=0.3408
+ eval_batch=7 loss=0.039355 batch_time_s=0.2552
+ eval_batch=8 loss=0.013108 batch_time_s=0.3001
+ eval_batch=9 loss=0.037281 batch_time_s=0.3122
+ eval_batch=10 loss=0.062332 batch_time_s=0.2296
+ eval_batch=11 loss=0.026757 batch_time_s=0.2320
+ eval_batch=12 loss=0.043025 batch_time_s=0.2359
+ eval_batch=13 loss=0.047591 batch_time_s=0.2317
+ eval_batch=14 loss=0.046923 batch_time_s=0.2352
+ eval_batch=15 loss=0.048440 batch_time_s=0.3084
+ eval_batch=16 loss=0.074316 batch_time_s=0.2294
+ eval_batch=17 loss=0.068891 batch_time_s=0.2512
+ eval_batch=18 loss=0.053325 batch_time_s=0.3206
+ eval_batch=19 loss=0.035644 batch_time_s=0.3163
+ eval_batch=20 loss=0.025946 batch_time_s=0.2289
+ eval_batch=21 loss=0.048144 batch_time_s=0.2838
+ eval_batch=22 loss=0.081570 batch_time_s=0.3150
+ eval_batch=23 loss=0.062998 batch_time_s=0.3382
+ eval_batch=24 loss=0.078956 batch_time_s=0.3765
+ eval_batch=25 loss=0.045697 batch_time_s=0.3072
+ eval_batch=26 loss=0.020523 batch_time_s=0.2988
+ eval_batch=27 loss=0.035404 batch_time_s=0.3281
+ eval_batch=28 loss=0.039222 batch_time_s=0.3669
+ eval_batch=29 loss=0.053275 batch_time_s=0.3338
+ eval_batch=30 loss=0.053682 batch_time_s=0.2773
+ eval_batch=31 loss=0.124611 batch_time_s=0.3229
+ eval_batch=32 loss=0.093004 batch_time_s=0.3327
+ eval_batch=33 loss=0.100326 batch_time_s=0.3062
+ eval_batch=34 loss=0.068221 batch_time_s=0.5203
+ eval_batch=35 loss=0.067222 batch_time_s=0.3190
+ eval_batch=36 loss=0.047065 batch_time_s=0.2393
+ eval_batch=37 loss=0.019016 batch_time_s=0.2778
+ eval_batch=38 loss=0.048523 batch_time_s=0.3234
+ eval_batch=39 loss=0.075579 batch_time_s=0.2905
+ eval_batch=40 loss=0.049607 batch_time_s=0.2612
+ eval_batch=41 loss=0.047019 batch_time_s=0.3323
+ eval_batch=42 loss=0.035811 batch_time_s=0.3344
+ eval_batch=43 loss=0.021360 batch_time_s=0.3128
+ eval_batch=44 loss=0.019255 batch_time_s=0.2885
+ eval_batch=45 loss=0.022715 batch_time_s=0.3116
+ eval_batch=46 loss=0.024246 batch_time_s=0.3442
+ eval_batch=47 loss=0.077525 batch_time_s=0.2601
+ eval_batch=48 loss=0.207067 batch_time_s=0.3068
+ eval_batch=49 loss=0.033557 batch_time_s=0.2332
+ eval_batch=50 loss=0.093434 batch_time_s=0.2469
+ config_name: pi05_twin_handover_256_packed_baseline_pytorch_2k
+ checkpoint_path: /workspace/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/handover_packed_baseline_2k/1000
+ repo_id_used: lsnu/twin_handover_256_val
+ num_batches: 50
+ mean_val_loss: 0.052885
+ std_val_loss: 0.032533
+ per_batch_timing_seconds: mean=0.3108 std=0.1375 min=0.2230 max=1.1986
+ active_mask_dims: [0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23]
+ masked_dims: [8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31]
+ weight_loading_missing_keys: []
+ weight_loading_unexpected_keys: []
artifacts/twin_handover_packed_parallelization_20260309/run_logs/handover_packed_baseline_2k_val_2000.log ADDED
@@ -0,0 +1,114 @@
1
+ starting_eval config=pi05_twin_handover_256_packed_baseline_pytorch_2k checkpoint=/workspace/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/handover_packed_baseline_2k/2000 repo_id=lsnu/twin_handover_256_val
2
+ eval_loader batch_size=16 num_batches=100 num_workers=0
3
+ weight_loading missing=0 unexpected=0 device=cuda:0
4
+ eval_batch=1 loss=0.019216 batch_time_s=1.2881
5
+ eval_batch=2 loss=0.013719 batch_time_s=0.2542
6
+ eval_batch=3 loss=0.012779 batch_time_s=0.2498
7
+ eval_batch=4 loss=0.026855 batch_time_s=0.2422
8
+ eval_batch=5 loss=0.023092 batch_time_s=0.2363
9
+ eval_batch=6 loss=0.063545 batch_time_s=0.2285
10
+ eval_batch=7 loss=0.035285 batch_time_s=0.2961
11
+ eval_batch=8 loss=0.014463 batch_time_s=0.2318
12
+ eval_batch=9 loss=0.029309 batch_time_s=0.2403
13
+ eval_batch=10 loss=0.043977 batch_time_s=0.2449
14
+ eval_batch=11 loss=0.024810 batch_time_s=0.2426
15
+ eval_batch=12 loss=0.031340 batch_time_s=0.2310
16
+ eval_batch=13 loss=0.038825 batch_time_s=0.3180
17
+ eval_batch=14 loss=0.036152 batch_time_s=0.2432
18
+ eval_batch=15 loss=0.034914 batch_time_s=0.3352
19
+ eval_batch=16 loss=0.053971 batch_time_s=0.2680
20
+ eval_batch=17 loss=0.031400 batch_time_s=0.2827
21
+ eval_batch=18 loss=0.040505 batch_time_s=0.2913
22
+ eval_batch=19 loss=0.016300 batch_time_s=0.2329
23
+ eval_batch=20 loss=0.023962 batch_time_s=0.2303
24
+ eval_batch=21 loss=0.034431 batch_time_s=0.2705
25
+ eval_batch=22 loss=0.056853 batch_time_s=0.2979
26
+ eval_batch=23 loss=0.038143 batch_time_s=0.2601
27
+ eval_batch=24 loss=0.075043 batch_time_s=0.3020
28
+ eval_batch=25 loss=0.058564 batch_time_s=0.5796
29
+ eval_batch=26 loss=0.032481 batch_time_s=0.2340
30
+ eval_batch=27 loss=0.035333 batch_time_s=0.2701
31
+ eval_batch=28 loss=0.042256 batch_time_s=0.3069
32
+ eval_batch=29 loss=0.067687 batch_time_s=0.2336
33
+ eval_batch=30 loss=0.048997 batch_time_s=0.2917
34
+ eval_batch=31 loss=0.119097 batch_time_s=0.2272
35
+ eval_batch=32 loss=0.060042 batch_time_s=0.2282
36
+ eval_batch=33 loss=0.058640 batch_time_s=0.2405
37
+ eval_batch=34 loss=0.062960 batch_time_s=0.2298
38
+ eval_batch=35 loss=0.052300 batch_time_s=0.2224
39
+ eval_batch=36 loss=0.036295 batch_time_s=0.2275
40
+ eval_batch=37 loss=0.025163 batch_time_s=0.2301
41
+ eval_batch=38 loss=0.032151 batch_time_s=0.2865
42
+ eval_batch=39 loss=0.052523 batch_time_s=0.2395
43
+ eval_batch=40 loss=0.017417 batch_time_s=0.2338
44
+ eval_batch=41 loss=0.028829 batch_time_s=0.2308
45
+ eval_batch=42 loss=0.031216 batch_time_s=0.2330
46
+ eval_batch=43 loss=0.005192 batch_time_s=0.2345
47
+ eval_batch=44 loss=0.011528 batch_time_s=0.2308
48
+ eval_batch=45 loss=0.046379 batch_time_s=0.2311
49
+ eval_batch=46 loss=0.026113 batch_time_s=0.2280
50
+ eval_batch=47 loss=0.093653 batch_time_s=0.2313
51
+ eval_batch=48 loss=0.219696 batch_time_s=0.2301
52
+ eval_batch=49 loss=0.021639 batch_time_s=0.2477
53
+ eval_batch=50 loss=0.062274 batch_time_s=0.2299
54
+ eval_batch=51 loss=0.043294 batch_time_s=0.2282
55
+ eval_batch=52 loss=0.020800 batch_time_s=0.2402
56
+ eval_batch=53 loss=0.017962 batch_time_s=0.2315
57
+ eval_batch=54 loss=0.011119 batch_time_s=0.2258
58
+ eval_batch=55 loss=0.022601 batch_time_s=0.2330
59
+ eval_batch=56 loss=0.063293 batch_time_s=0.2378
60
+ eval_batch=57 loss=0.033958 batch_time_s=0.2375
61
+ eval_batch=58 loss=0.025469 batch_time_s=0.2294
62
+ eval_batch=59 loss=0.019972 batch_time_s=0.2376
63
+ eval_batch=60 loss=0.004765 batch_time_s=0.2354
64
+ eval_batch=61 loss=0.014635 batch_time_s=0.2449
65
+ eval_batch=62 loss=0.006239 batch_time_s=0.2288
66
+ eval_batch=63 loss=0.041332 batch_time_s=0.2520
67
+ eval_batch=64 loss=0.016763 batch_time_s=0.2517
68
+ eval_batch=65 loss=0.028758 batch_time_s=0.2447
69
+ eval_batch=66 loss=0.026301 batch_time_s=0.2312
70
+ eval_batch=67 loss=0.014657 batch_time_s=0.2353
71
+ eval_batch=68 loss=0.043065 batch_time_s=0.2276
72
+ eval_batch=69 loss=0.048954 batch_time_s=0.2282
73
+ eval_batch=70 loss=0.047917 batch_time_s=0.2359
74
+ eval_batch=71 loss=0.013441 batch_time_s=0.2318
75
+ eval_batch=72 loss=0.023035 batch_time_s=0.2453
76
+ eval_batch=73 loss=0.024245 batch_time_s=0.2530
77
+ eval_batch=74 loss=0.021810 batch_time_s=0.2387
78
+ eval_batch=75 loss=0.016290 batch_time_s=0.2281
79
+ eval_batch=76 loss=0.019809 batch_time_s=0.2320
80
+ eval_batch=77 loss=0.016700 batch_time_s=0.2462
81
+ eval_batch=78 loss=0.049874 batch_time_s=0.2369
82
+ eval_batch=79 loss=0.065255 batch_time_s=0.2548
83
+ eval_batch=80 loss=0.077142 batch_time_s=0.2906
84
+ eval_batch=81 loss=0.059736 batch_time_s=0.3057
85
+ eval_batch=82 loss=0.011131 batch_time_s=0.2359
86
+ eval_batch=83 loss=0.016865 batch_time_s=0.2454
87
+ eval_batch=84 loss=0.007890 batch_time_s=0.2386
88
+ eval_batch=85 loss=0.044606 batch_time_s=0.2352
89
+ eval_batch=86 loss=0.014035 batch_time_s=0.2365
90
+ eval_batch=87 loss=0.020954 batch_time_s=0.2419
91
+ eval_batch=88 loss=0.042758 batch_time_s=0.2262
92
+ eval_batch=89 loss=0.019468 batch_time_s=0.2352
93
+ eval_batch=90 loss=0.004773 batch_time_s=0.2292
94
+ eval_batch=91 loss=0.005070 batch_time_s=0.2296
95
+ eval_batch=92 loss=0.007161 batch_time_s=0.2291
96
+ eval_batch=93 loss=0.026996 batch_time_s=0.2361
97
+ eval_batch=94 loss=0.011121 batch_time_s=0.2456
98
+ eval_batch=95 loss=0.041840 batch_time_s=0.2409
99
+ eval_batch=96 loss=0.054416 batch_time_s=0.2333
100
+ eval_batch=97 loss=0.024979 batch_time_s=0.2276
101
+ eval_batch=98 loss=0.062096 batch_time_s=0.2403
102
+ eval_batch=99 loss=0.032598 batch_time_s=0.2326
103
+ eval_batch=100 loss=0.022353 batch_time_s=0.2274
104
+ config_name: pi05_twin_handover_256_packed_baseline_pytorch_2k
105
+ checkpoint_path: /workspace/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/handover_packed_baseline_2k/2000
106
+ repo_id_used: lsnu/twin_handover_256_val
107
+ num_batches: 100
108
+ mean_val_loss: 0.035776
109
+ std_val_loss: 0.027648
110
+ per_batch_timing_seconds: mean=0.2587 std=0.1111 min=0.2224 max=1.2881
111
+ active_mask_dims: [0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23]
112
+ masked_dims: [8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31]
113
+ weight_loading_missing_keys: []
114
+ weight_loading_unexpected_keys: []
artifacts/twin_handover_packed_parallelization_20260309/run_logs/handover_packed_parallel_2k.log ADDED
The diff for this file is too large to render. See raw diff
 
artifacts/twin_handover_packed_parallelization_20260309/run_logs/handover_packed_parallel_2k_val_1000.log ADDED
@@ -0,0 +1,64 @@
+ starting_eval config=pi05_twin_handover_256_packed_parallel_pytorch_2k checkpoint=/workspace/openpi/checkpoints/pi05_twin_handover_256_packed_parallel_pytorch_2k/handover_packed_parallel_2k/1000 repo_id=lsnu/twin_handover_256_val
+ eval_loader batch_size=16 num_batches=50 num_workers=0
+ weight_loading missing=0 unexpected=0 device=cuda:0
+ eval_batch=1 loss=0.039282 batch_time_s=0.8606
+ eval_batch=2 loss=0.059935 batch_time_s=0.2233
+ eval_batch=3 loss=0.029645 batch_time_s=0.2237
+ eval_batch=4 loss=0.030436 batch_time_s=0.2312
+ eval_batch=5 loss=0.029398 batch_time_s=0.2255
+ eval_batch=6 loss=0.046098 batch_time_s=0.2291
+ eval_batch=7 loss=0.031397 batch_time_s=0.2243
+ eval_batch=8 loss=0.013987 batch_time_s=0.2256
+ eval_batch=9 loss=0.046950 batch_time_s=0.3194
+ eval_batch=10 loss=0.055185 batch_time_s=0.2211
+ eval_batch=11 loss=0.045538 batch_time_s=0.2270
+ eval_batch=12 loss=0.034314 batch_time_s=0.2221
+ eval_batch=13 loss=0.053436 batch_time_s=0.2306
+ eval_batch=14 loss=0.048917 batch_time_s=0.2322
+ eval_batch=15 loss=0.059734 batch_time_s=0.2346
+ eval_batch=16 loss=0.072608 batch_time_s=0.2275
+ eval_batch=17 loss=0.071442 batch_time_s=0.2257
+ eval_batch=18 loss=0.056916 batch_time_s=0.2247
+ eval_batch=19 loss=0.025555 batch_time_s=0.2238
+ eval_batch=20 loss=0.031001 batch_time_s=0.2557
+ eval_batch=21 loss=0.054189 batch_time_s=0.2259
+ eval_batch=22 loss=0.046724 batch_time_s=0.2544
+ eval_batch=23 loss=0.048790 batch_time_s=0.2389
+ eval_batch=24 loss=0.073533 batch_time_s=0.2283
+ eval_batch=25 loss=0.060645 batch_time_s=0.2387
+ eval_batch=26 loss=0.020740 batch_time_s=0.2323
+ eval_batch=27 loss=0.027174 batch_time_s=0.2226
+ eval_batch=28 loss=0.030402 batch_time_s=0.2211
+ eval_batch=29 loss=0.037136 batch_time_s=0.2303
+ eval_batch=30 loss=0.057298 batch_time_s=0.2221
+ eval_batch=31 loss=0.133256 batch_time_s=0.2228
+ eval_batch=32 loss=0.081425 batch_time_s=0.2285
+ eval_batch=33 loss=0.101147 batch_time_s=0.2291
+ eval_batch=34 loss=0.084155 batch_time_s=0.2763
+ eval_batch=35 loss=0.050369 batch_time_s=0.2300
+ eval_batch=36 loss=0.037849 batch_time_s=0.2228
+ eval_batch=37 loss=0.016911 batch_time_s=0.2211
+ eval_batch=38 loss=0.035706 batch_time_s=0.2215
+ eval_batch=39 loss=0.074094 batch_time_s=0.2247
+ eval_batch=40 loss=0.031583 batch_time_s=0.2256
+ eval_batch=41 loss=0.063281 batch_time_s=0.2345
+ eval_batch=42 loss=0.034781 batch_time_s=0.2247
+ eval_batch=43 loss=0.021991 batch_time_s=0.3036
+ eval_batch=44 loss=0.006788 batch_time_s=0.2310
+ eval_batch=45 loss=0.029891 batch_time_s=0.2888
+ eval_batch=46 loss=0.024711 batch_time_s=0.2320
+ eval_batch=47 loss=0.139781 batch_time_s=0.2281
+ eval_batch=48 loss=0.129609 batch_time_s=0.2421
+ eval_batch=49 loss=0.039653 batch_time_s=0.2222
+ eval_batch=50 loss=0.085291 batch_time_s=0.2304
+ config_name: pi05_twin_handover_256_packed_parallel_pytorch_2k
+ checkpoint_path: /workspace/openpi/checkpoints/pi05_twin_handover_256_packed_parallel_pytorch_2k/handover_packed_parallel_2k/1000
+ repo_id_used: lsnu/twin_handover_256_val
+ num_batches: 50
+ mean_val_loss: 0.051214
+ std_val_loss: 0.028985
+ per_batch_timing_seconds: mean=0.2468 std=0.0900 min=0.2211 max=0.8606
+ active_mask_dims: [0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23]
+ masked_dims: [8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31]
+ weight_loading_missing_keys: []
+ weight_loading_unexpected_keys: []
artifacts/twin_handover_packed_parallelization_20260309/run_logs/handover_packed_parallel_2k_val_2000.log ADDED
@@ -0,0 +1,114 @@
+ starting_eval config=pi05_twin_handover_256_packed_parallel_pytorch_2k checkpoint=/workspace/openpi/checkpoints/pi05_twin_handover_256_packed_parallel_pytorch_2k/handover_packed_parallel_2k/2000 repo_id=lsnu/twin_handover_256_val
+ eval_loader batch_size=16 num_batches=100 num_workers=0
+ weight_loading missing=0 unexpected=0 device=cuda:0
+ eval_batch=1 loss=0.019788 batch_time_s=0.8235
+ eval_batch=2 loss=0.010034 batch_time_s=0.2312
+ eval_batch=3 loss=0.006535 batch_time_s=0.2283
+ eval_batch=4 loss=0.019442 batch_time_s=0.2249
+ eval_batch=5 loss=0.023646 batch_time_s=0.2275
+ eval_batch=6 loss=0.045010 batch_time_s=0.2273
+ eval_batch=7 loss=0.021796 batch_time_s=0.2327
+ eval_batch=8 loss=0.019273 batch_time_s=0.2319
+ eval_batch=9 loss=0.021624 batch_time_s=0.2248
+ eval_batch=10 loss=0.035467 batch_time_s=0.2359
+ eval_batch=11 loss=0.034351 batch_time_s=0.2552
+ eval_batch=12 loss=0.027341 batch_time_s=0.2308
+ eval_batch=13 loss=0.047439 batch_time_s=0.2257
+ eval_batch=14 loss=0.037939 batch_time_s=0.2329
+ eval_batch=15 loss=0.043057 batch_time_s=0.2215
+ eval_batch=16 loss=0.038503 batch_time_s=0.2317
+ eval_batch=17 loss=0.043592 batch_time_s=0.2290
+ eval_batch=18 loss=0.037270 batch_time_s=0.2265
+ eval_batch=19 loss=0.020304 batch_time_s=0.2329
+ eval_batch=20 loss=0.030268 batch_time_s=0.2234
+ eval_batch=21 loss=0.041346 batch_time_s=0.2263
+ eval_batch=22 loss=0.028159 batch_time_s=0.2268
+ eval_batch=23 loss=0.065991 batch_time_s=0.2251
+ eval_batch=24 loss=0.064603 batch_time_s=0.2268
+ eval_batch=25 loss=0.068628 batch_time_s=0.2282
+ eval_batch=26 loss=0.023403 batch_time_s=0.2302
+ eval_batch=27 loss=0.031110 batch_time_s=0.2274
+ eval_batch=28 loss=0.022352 batch_time_s=0.2289
+ eval_batch=29 loss=0.046446 batch_time_s=0.2292
+ eval_batch=30 loss=0.043246 batch_time_s=0.2321
+ eval_batch=31 loss=0.101922 batch_time_s=0.2274
+ eval_batch=32 loss=0.072581 batch_time_s=0.2300
+ eval_batch=33 loss=0.056358 batch_time_s=0.2252
+ eval_batch=34 loss=0.065017 batch_time_s=0.2306
+ eval_batch=35 loss=0.048672 batch_time_s=0.2388
+ eval_batch=36 loss=0.022249 batch_time_s=0.2322
+ eval_batch=37 loss=0.014201 batch_time_s=0.2266
+ eval_batch=38 loss=0.039009 batch_time_s=0.2261
+ eval_batch=39 loss=0.033967 batch_time_s=0.2303
+ eval_batch=40 loss=0.021915 batch_time_s=0.2462
+ eval_batch=41 loss=0.024328 batch_time_s=0.2613
+ eval_batch=42 loss=0.050496 batch_time_s=0.2354
+ eval_batch=43 loss=0.010375 batch_time_s=0.2300
+ eval_batch=44 loss=0.016967 batch_time_s=0.2276
+ eval_batch=45 loss=0.026333 batch_time_s=0.2552
+ eval_batch=46 loss=0.019980 batch_time_s=0.2267
+ eval_batch=47 loss=0.089578 batch_time_s=0.2327
+ eval_batch=48 loss=0.209416 batch_time_s=0.2445
+ eval_batch=49 loss=0.011339 batch_time_s=0.2359
+ eval_batch=50 loss=0.066028 batch_time_s=0.2251
+ eval_batch=51 loss=0.035093 batch_time_s=0.2288
+ eval_batch=52 loss=0.020534 batch_time_s=0.2276
+ eval_batch=53 loss=0.006331 batch_time_s=0.2313
+ eval_batch=54 loss=0.012782 batch_time_s=0.2247
+ eval_batch=55 loss=0.022509 batch_time_s=0.2299
+ eval_batch=56 loss=0.047079 batch_time_s=0.2317
+ eval_batch=57 loss=0.023989 batch_time_s=0.2302
+ eval_batch=58 loss=0.019615 batch_time_s=0.2322
+ eval_batch=59 loss=0.026347 batch_time_s=0.2346
+ eval_batch=60 loss=0.004678 batch_time_s=0.2323
+ eval_batch=61 loss=0.007068 batch_time_s=0.2324
+ eval_batch=62 loss=0.013162 batch_time_s=0.2336
+ eval_batch=63 loss=0.047115 batch_time_s=0.2236
+ eval_batch=64 loss=0.017077 batch_time_s=0.2299
+ eval_batch=65 loss=0.047049 batch_time_s=0.2288
+ eval_batch=66 loss=0.035518 batch_time_s=0.2257
+ eval_batch=67 loss=0.016819 batch_time_s=0.2306
+ eval_batch=68 loss=0.051586 batch_time_s=0.2215
+ eval_batch=69 loss=0.043497 batch_time_s=0.2312
+ eval_batch=70 loss=0.072536 batch_time_s=0.2301
+ eval_batch=71 loss=0.018621 batch_time_s=0.2365
+ eval_batch=72 loss=0.043862 batch_time_s=0.2305
+ eval_batch=73 loss=0.034882 batch_time_s=0.2314
+ eval_batch=74 loss=0.028771 batch_time_s=0.2286
+ eval_batch=75 loss=0.012547 batch_time_s=0.2269
+ eval_batch=76 loss=0.023966 batch_time_s=0.2317
+ eval_batch=77 loss=0.023444 batch_time_s=0.2290
+ eval_batch=78 loss=0.048585 batch_time_s=0.2343
+ eval_batch=79 loss=0.065904 batch_time_s=0.2264
+ eval_batch=80 loss=0.072660 batch_time_s=0.2255
+ eval_batch=81 loss=0.038694 batch_time_s=0.2281
+ eval_batch=82 loss=0.013027 batch_time_s=0.2302
+ eval_batch=83 loss=0.022540 batch_time_s=0.2336
+ eval_batch=84 loss=0.010291 batch_time_s=0.2216
+ eval_batch=85 loss=0.054119 batch_time_s=0.2286
+ eval_batch=86 loss=0.021808 batch_time_s=0.2305
+ eval_batch=87 loss=0.018521 batch_time_s=0.2330
+ eval_batch=88 loss=0.042638 batch_time_s=0.2329
+ eval_batch=89 loss=0.023391 batch_time_s=0.2352
+ eval_batch=90 loss=0.004995 batch_time_s=0.2289
+ eval_batch=91 loss=0.006358 batch_time_s=0.2311
+ eval_batch=92 loss=0.024077 batch_time_s=0.2306
+ eval_batch=93 loss=0.039791 batch_time_s=0.2334
+ eval_batch=94 loss=0.046554 batch_time_s=0.2327
+ eval_batch=95 loss=0.038985 batch_time_s=0.2279
+ eval_batch=96 loss=0.034484 batch_time_s=0.2243
+ eval_batch=97 loss=0.037144 batch_time_s=0.2285
+ eval_batch=98 loss=0.069108 batch_time_s=0.2318
+ eval_batch=99 loss=0.035033 batch_time_s=0.2335
+ eval_batch=100 loss=0.024118 batch_time_s=0.2258
+ config_name: pi05_twin_handover_256_packed_parallel_pytorch_2k
+ checkpoint_path: /workspace/openpi/checkpoints/pi05_twin_handover_256_packed_parallel_pytorch_2k/handover_packed_parallel_2k/2000
+ repo_id_used: lsnu/twin_handover_256_val
+ num_batches: 100
+ mean_val_loss: 0.035680
+ std_val_loss: 0.026077
+ per_batch_timing_seconds: mean=0.2366 std=0.0593 min=0.2215 max=0.8235
+ active_mask_dims: [0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23]
+ masked_dims: [8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31]
+ weight_loading_missing_keys: []
+ weight_loading_unexpected_keys: []
artifacts/twin_handover_packed_parallelization_20260309/run_logs/importtime_train_pytorch.log ADDED
@@ -0,0 +1,349 @@
+ import time: self [us] | cumulative | imported package
+ import time: 459 | 459 | _io
+ import time: 100 | 100 | marshal
+ import time: 1005 | 1005 | posix
+ import time: 2124 | 3687 | _frozen_importlib_external
+ import time: 521 | 521 | time
+ import time: 542 | 1062 | zipimport
+ import time: 126 | 126 | _codecs
+ import time: 1309 | 1435 | codecs
+ import time: 1112 | 1112 | encodings.aliases
+ import time: 2172 | 4718 | encodings
+ import time: 579 | 579 | encodings.utf_8
+ import time: 307 | 307 | _signal
+ import time: 100 | 100 | _abc
+ import time: 561 | 660 | abc
+ import time: 825 | 1484 | io
+ import time: 115 | 115 | _stat
+ import time: 578 | 693 | stat
+ import time: 2208 | 2208 | _collections_abc
+ import time: 104 | 104 | genericpath
+ import time: 550 | 653 | posixpath
+ import time: 1911 | 5463 | os
+ import time: 177 | 177 | _sitebuiltins
+ import time: 27305 | 27305 | _virtualenv
+ import time: 29406 | 29406 | _distutils_hack
+ import time: 427 | 427 | sitecustomize
+ import time: 150941 | 213717 | site
+ import time: 44311 | 44311 | scripts
+ import time: 2030 | 2030 | types
+ import time: 257 | 257 | _operator
+ import time: 2479 | 2735 | operator
+ import time: 339 | 339 | itertools
+ import time: 1371 | 1371 | keyword
+ import time: 1450 | 1450 | reprlib
+ import time: 191 | 191 | _collections
+ import time: 5447 | 8797 | collections
+ import time: 173 | 173 | _functools
+ import time: 2662 | 11631 | functools
+ import time: 8546 | 24941 | enum
+ import time: 219 | 219 | _sre
+ import time: 791 | 791 | re._constants
+ import time: 1361 | 2151 | re._parser
+ import time: 365 | 365 | re._casefix
+ import time: 2178 | 4912 | re._compiler
+ import time: 1533 | 1533 | copyreg
+ import time: 3330 | 34714 | re
+ import time: 1611 | 1611 | _weakrefset
+ import time: 3100 | 4710 | weakref
+ import time: 9576 | 9576 | org
+ import time: 255 | 9830 | org.python
+ import time: 224 | 10053 | org.python.core
+ import time: 2268 | 17030 | copy
+ import time: 3210 | 3210 | _ast
+ import time: 4811 | 4811 | contextlib
+ import time: 7411 | 15431 | ast
+ import time: 164 | 164 | _opcode
+ import time: 4583 | 4747 | opcode
+ import time: 5950 | 10696 | dis
+ import time: 612 | 612 | collections.abc
+ import time: 2739 | 2739 | warnings
+ import time: 2281 | 5019 | importlib
+ import time: 368 | 5387 | importlib.machinery
+ import time: 3055 | 3055 | token
+ import time: 6195 | 9250 | tokenize
+ import time: 2342 | 11591 | linecache
+ import time: 8114 | 51829 | inspect
+ import time: 4396 | 107967 | dataclasses
+ import time: 186 | 186 | gc
+ import time: 4839 | 4839 | textwrap
+ import time: 3120 | 7958 | traceback
+ import time: 130 | 130 | _string
+ import time: 3121 | 3250 | string
+ import time: 4201 | 4201 | threading
+ import time: 127 | 127 | atexit
+ import time: 7952 | 23486 | logging
+ import time: 6543 | 6543 | platform
+ import time: 2295 | 2295 | fnmatch
+ import time: 287 | 287 | errno
+ import time: 336 | 336 | zlib
+ import time: 4029 | 4029 | _compression
+ import time: 2862 | 2862 | _bz2
+ import time: 4040 | 10930 | bz2
+ import time: 4869 | 4869 | _lzma
+ import time: 4931 | 9800 | lzma
+ import time: 6201 | 29847 | shutil
+ import time: 4466 | 4466 | __future__
+ import time: 222 | 222 | math
+ import time: 341 | 341 | _datetime
+ import time: 9607 | 10169 | datetime
+ import time: 8169 | 8169 | _winapi
+ import time: 11261 | 11261 | nt
+ import time: 9784 | 9784 | nt
+ import time: 8028 | 8028 | nt
+ import time: 10836 | 10836 | nt
+ import time: 8338 | 8338 | nt
+ import time: 3115 | 59529 | ntpath
+ import time: 3503 | 3503 | urllib
+ import time: 7606 | 7606 | ipaddress
+ import time: 3508 | 14616 | urllib.parse
+ import time: 7032 | 81177 | pathlib
+ import time: 279 | 279 | _locale
+ import time: 7805 | 8083 | locale
+ import time: 5367 | 5367 | signal
+ import time: 213 | 213 | fcntl
+ import time: 9064 | 9064 | msvcrt
+ import time: 169 | 169 | _posixsubprocess
+ import time: 236 | 236 | select
+ import time: 6301 | 6301 | selectors
+ import time: 11615 | 41045 | subprocess
+ import time: 42021 | 178875 | jax.version
+ import time: 56236 | 56236 | jax._src
+ import time: 8786 | 8786 | _typing
+ import time: 13264 | 22049 | typing
+ import time: 58851 | 58851 | jaxlib.version
+ import time: 95721 | 154572 | jaxlib
+ import time: 21162 | 21162 | jaxlib.cpu_feature_guard
+ import time: 17965 | 17965 | jaxlib.utils
+ import time: 357 | 357 | _struct
+ import time: 1673 | 2029 | struct
+ import time: 12406 | 14434 | gzip
+ import time: 72050 | 72050 | numpy._utils._convertions
+ import time: 90873 | 162922 | numpy._utils
+ import time: 58590 | 221512 | numpy._globals
+ import time: 53224 | 53224 | numpy.exceptions
+ import time: 52905 | 52905 | numpy.version
+ import time: 609 | 609 | numpy._distributor_init_local
+ import time: 55586 | 56194 | numpy._distributor_init
+ import time: 35135 | 35135 | numpy._utils._inspect
+ import time: 12046 | 12046 | numpy.core._exceptions
+ import time: 8271 | 8271 | numpy.dtypes
+ import time: 202114 | 222430 | numpy.core._multiarray_umath
+ import time: 37402 | 294966 | numpy.core.overrides
+ import time: 53534 | 348500 | numpy.core.multiarray
+ import time: 5565 | 5565 | numpy.core.umath
+ import time: 1567 | 1567 | numbers
+ import time: 8550 | 8550 | numpy.core._string_helpers
+ import time: 8226 | 8226 | pickle5
+ import time: 1179 | 1179 | _compat_pickle
+ import time: 512 | 512 | _pickle
+ import time: 1864 | 1864 | org
+ import time: 329 | 2192 | org.python
+ import time: 776 | 2968 | org.python.core
+ import time: 4888 | 9545 | pickle
+ import time: 10531 | 28301 | numpy.compat.py3k
+ import time: 25123 | 53423 | numpy.compat
+ import time: 7502 | 7502 | numpy.core._dtype
+ import time: 8090 | 69014 | numpy.core._type_aliases
+ import time: 6044 | 85174 | numpy.core.numerictypes
+ import time: 646 | 646 | _contextvars
+ import time: 805 | 1451 | contextvars
+ import time: 7510 | 8961 | numpy.core._ufunc_config
+ import time: 21816 | 30776 | numpy.core._methods
+ import time: 10769 | 41545 | numpy.core.fromnumeric
+ import time: 7787 | 49331 | numpy.core.shape_base
+ import time: 6325 | 6325 | numpy.core.arrayprint
+ import time: 4369 | 4369 | numpy.core._asarray
+ import time: 10436 | 70460 | numpy.core.numeric
+ import time: 5458 | 5458 | numpy.core.defchararray
+ import time: 6653 | 6653 | numpy.core.records
+ import time: 2659 | 2659 | numpy.core.memmap
+ import time: 3430 | 3430 | numpy.core.function_base
+ import time: 3739 | 3739 | numpy.core._machar
+ import time: 4821 | 4821 | numpy.core.getlimits
+ import time: 5141 | 5141 | numpy.core.einsumfunc
+ import time: 2892 | 2892 | numpy.core._multiarray_tests
+ import time: 7349 | 10241 | numpy.core._add_newdocs
+ import time: 10209 | 10209 | numpy.core._add_newdocs_scalars
+ import time: 4958 | 4958 | numpy.core._dtype_ctypes
+ import time: 1331 | 1331 | _ctypes
+ import time: 1038 | 1038 | ctypes._endian
+ import time: 3302 | 5670 | ctypes
+ import time: 8903 | 14573 | numpy.core._internal
+ import time: 7543 | 7543 | numpy._pytesttester
+ import time: 81885 | 671000 | numpy.core
+ import time: 153 | 671152 | numpy.core._multiarray_umath
+ import time: 56994 | 728146 | numpy.__config__
+ import time: 7653 | 7653 | numpy.lib.mixins
+ import time: 9676 | 9676 | numpy.lib.ufunclike
+ import time: 7766 | 17441 | numpy.lib.type_check
+ import time: 10010 | 27450 | numpy.lib.scimath
+ import time: 22351 | 22351 | numpy.lib.stride_tricks
+ import time: 11303 | 33654 | numpy.lib.twodim_base
+ import time: 8761 | 8761 | numpy.linalg._umath_linalg
+ import time: 16569 | 16569 | numpy._typing._nested_sequence
+ import time: 13982 | 13982 | numpy._typing._nbit
+ import time: 20263 | 20263 | numpy._typing._char_codes
+ import time: 11700 | 11700 | numpy._typing._scalars
+ import time: 8982 | 8982 | numpy._typing._shape
+ import time: 24532 | 24532 | numpy._typing._dtype_like
+ import time: 44660 | 44660 | numpy._typing._array_like
+ import time: 29866 | 170550 | numpy._typing
+ import time: 17677 | 230640 | numpy.linalg.linalg
+ import time: 237805 | 468444 | numpy.linalg
+ import time: 9029 | 477473 | numpy.matrixlib.defmatrix
+ import time: 10944 | 488417 | numpy.matrixlib
+ import time: 8745 | 8745 | numpy.lib.histograms
+ import time: 27873 | 36617 | numpy.lib.function_base
+ import time: 17216 | 542249 | numpy.lib.index_tricks
+ import time: 16518 | 16518 | numpy.lib.nanfunctions
+ import time: 14925 | 14925 | numpy.lib.shape_base
+ import time: 8883 | 8883 | numpy.lib.polynomial
+ import time: 13341 | 13341 | numpy.lib.utils
+ import time: 13347 | 13347 | numpy.lib.arraysetops
+ import time: 18662 | 18662 | numpy.lib.format
+ import time: 9834 | 9834 | numpy.lib._datasource
+ import time: 10465 | 10465 | numpy.lib._iotools
+ import time: 26974 | 65935 | numpy.lib.npyio
+ import time: 14808 | 14808 | numpy.lib.arrayterator
+ import time: 28751 | 28751 | numpy.lib.arraypad
+ import time: 31641 | 31641 | numpy.lib._version
+ import time: 16718 | 802213 | numpy.lib
+ import time: 13764 | 13764 | numpy.fft._pocketfft_internal
+ import time: 47189 | 60952 | numpy.fft._pocketfft
+ import time: 34176 | 34176 | numpy.fft.helper
+ import time: 57859 | 152987 | numpy.fft
+ import time: 32723 | 32723 | numpy.polynomial.polyutils
+ import time: 20810 | 20810 | numpy.polynomial._polybase
+ import time: 47703 | 101235 | numpy.polynomial.polynomial
+ import time: 22597 | 22597 | numpy.polynomial.chebyshev
+ import time: 15190 | 15190 | numpy.polynomial.legendre
+ import time: 12249 | 12249 | numpy.polynomial.hermite
+ import time: 15883 | 15883 | numpy.polynomial.hermite_e
+ import time: 20997 | 20997 | numpy.polynomial.laguerre
+ import time: 57756 | 245905 | numpy.polynomial
+ import time: 11659 | 11659 | backports_abc
+ import time: 8899 | 20558 | numpy.random._common
+ import time: 609 | 609 | binascii
+ import time: 1895 | 2503 | base64
+ import time: 6404 | 6404 | _hashlib
+ import time: 184 | 184 | _blake2
+ import time: 1554 | 1737 | hashlib
+ import time: 2119 | 10260 | hmac
+ import time: 96 | 96 | _bisect
+ import time: 1252 | 1347 | bisect
+ import time: 164 | 164 | _random
+ import time: 176 | 176 | _sha512
+ import time: 2855 | 4541 | random
+ import time: 1966 | 19268 | secrets
+ import time: 8364 | 48189 | numpy.random.bit_generator
+ import time: 5773 | 5773 | numpy.random._bounded_integers
+ import time: 6014 | 6014 | numpy.random._mt19937
+ import time: 9760 | 69734 | numpy.random.mtrand
+ import time: 7331 | 7331 | numpy.random._philox
+ import time: 5862 | 5862 | numpy.random._pcg64
+ import time: 5462 | 5462 | numpy.random._sfc64
+ import time: 8031 | 8031 | numpy.random._generator
+ import time: 22729 | 119147 | numpy.random._pickle
+ import time: 23124 | 142271 | numpy.random
+ import time: 20592 | 20592 | numpy.ctypeslib
+ import time: 40900 | 40900 | numpy.ma.core
+ import time: 26643 | 26643 | numpy.ma.extras
+ import time: 31513 | 99055 | numpy.ma
+ import time: 75854 | 2650852 | numpy
+ import time: 22335 | 2673187 | numpy._core
+ import time: 29059 | 2702245 | numpy._core._multiarray_umath
+ import time: 22101 | 2724346 | ml_dtypes._ml_dtypes_ext
+ import time: 53315 | 2777661 | ml_dtypes._finfo
+ import time: 15604 | 15604 | ml_dtypes._iinfo
+ import time: 93641 | 2886905 | ml_dtypes
+ import time: 62057 | 62057 | jaxlib.xla_extension
+ import time: 46707 | 3010102 | jaxlib.xla_client
+ import time: 31965 | 31965 | jaxlib.cpu
+ import time: 45309 | 45309 | jaxlib.cpu._lapack
+ import time: 28009 | 105282 | jaxlib.lapack
+ import time: 378 | 378 | jaxlib.cuda
+ import time: 468 | 846 | jaxlib.cuda._versions
+ import time: 40192 | 40192 | jax_cuda12_plugin
+ import time: 39973 | 80164 | jax_cuda12_plugin._versions
+ import time: 5772 | 5772 | jaxlib.plugin_support
+ import time: 1036132 | 1041903 | jaxlib.gpu_solver
+ import time: 7872 | 7872 | jaxlib.mlir
+ import time: 167478 | 167478 | jaxlib.mlir._mlir_libs._mlir
+ import time: 22966 | 190443 | jaxlib.mlir._mlir_libs
+ import time: 599 | 191041 | jaxlib.mlir._mlir_libs._mlir
+ import time: 311 | 191352 | jaxlib.mlir._mlir_libs._mlir.ir
+ import time: 15499 | 214723 | jaxlib.mlir.ir
+ import time: 4822 | 4822 | jaxlib.mlir.dialects
+ import time: 9680 | 9680 | jaxlib.mlir.dialects._ods_common
+ import time: 20956 | 30636 | jaxlib.mlir.dialects._stablehlo_ops_gen
+ import time: 6744 | 6744 | jaxlib.mlir._mlir_libs._stablehlo
+ import time: 15548 | 57749 | jaxlib.mlir.dialects.stablehlo
+ import time: 8550 | 66299 | jaxlib.hlo_helpers
+ import time: 20744 | 301764 | jaxlib.gpu_sparse
+ import time: 26892 | 26892 | jaxlib.gpu_prng
+ import time: 14945 | 14945 | jaxlib.gpu_linalg
+ import time: 8409 | 8409 | jaxlib.gpu_common_utils
+ import time: 19412 | 27821 | jaxlib.gpu_rnn
+ import time: 18330 | 18330 | jaxlib.gpu_triton
+ import time: 7304 | 7304 | jaxlib.mosaic
+ import time: 17191 | 24495 | jaxlib.mosaic.python
+ import time: 8072 | 8072 | jaxlib.mosaic.dialect
+ import time: 12479 | 20550 | jaxlib.mosaic.dialect.gpu
+ import time: 39792 | 60342 | jaxlib.mosaic.dialect.gpu._mosaic_gpu_gen_ops
+ import time: 33098 | 33098 | jaxlib.mosaic.dialect.gpu._mosaic_gpu_gen_enums
+ import time: 17182 | 17182 | jaxlib.mlir._mlir_libs._mosaic_gpu_ext
+ import time: 44715 | 179830 | jaxlib.mosaic.python.mosaic_gpu
+ import time: 38772 | 38772 | jaxlib.mosaic.python._tpu_gen
+ import time: 19006 | 19006 | jaxlib.mlir._mlir_libs._tpu_ext
+ import time: 35539 | 93316 | jaxlib.mosaic.python.tpu
+ import time: 52437 | 52437 | nvidia
+ import time: 33733 | 33733 | nvidia.cuda_nvcc
+ import time: 94038 | 5275093 | jax._src.lib
+ import time: 43893 | 43893 | jax._src.logging_config
+ import time: 58138 | 5399172 | jax._src.config
+ import time: 5142 | 5142 | glob
+ import time: 32810 | 37951 | jax._src.hardware_utils
+ import time: 89410 | 5582767 | jax._src.cloud_tpu_init
+ import time: 1474 | 1474 | libtpu
+ import time: 16083 | 16083 | jax._src.basearray
+ import time: 10237 | 26320 | jax._src.typing
+ import time: 14374 | 14374 | jax._src.util
+ import time: 40141 | 40141 | jax._src.traceback_util
+ import time: 19514 | 100348 | jax._src.dtypes
+ import time: 44949 | 44949 | jax._src.effects
+ import time: 45640 | 45640 | jax._src.compute_on
+ import time: 827 | 827 | _json
+ import time: 1678 | 2504 | json.scanner
+ import time: 1613 | 4117 | json.decoder
+ import time: 1159 | 1159 | json.encoder
+ import time: 1849 | 7124 | json
+ import time: 867 | 867 | importlib._abc
+ import time: 675 | 1541 | importlib.util
+ import time: 1466 | 3007 | pkgutil
+ import time: 74826 | 74826 | jax._src.clusters.cluster
+ import time: 44029 | 44029 | jax._src.clusters.ompi_cluster
+ import time: 45845 | 45845 | jax._src.clusters.slurm_cluster
+ import time: 664 | 664 | _socket
+ import time: 283 | 283 | array
329
+ import time: 6869 | 7815 | socket
330
+ import time: 46749 | 54564 | jax._src.clusters.mpi4py_cluster
331
+ import time: 52800 | 52800 | jax._src.clusters.cloud_tpu_cluster
332
+ import time: 45723 | 45723 | jax._src.clusters.k8s_cluster
333
+ import time: 98001 | 415785 | jax._src.clusters
334
+ import time: 53385 | 469170 | jax._src.distributed
335
+ import time: 83403 | 83403 | jax_plugins
336
+ import time: 60154 | 622856 | jax._src.xla_bridge
337
+ import time: 54548 | 677403 | jax._src.mesh
338
+ import time: 72161 | 72161 | jax._src.partition_spec
339
+ import time: 85965 | 85965 | jax._src.errors
340
+ import time: 208 | 208 | _heapq
341
+ import time: 9637 | 9845 | heapq
342
+ import time: 9570 | 19414 | difflib
343
+ import time: 60664 | 80078 | jax._src.tree_util
344
+ import time: 58352 | 138430 | jax._src.linear_util
345
+ import time: 49579 | 49579 | sysconfig
346
+ import time: 10829 | 10829 | _sysconfigdata__x86_64-linux-gnu
347
+ import time: 61018 | 121425 | jax._src.source_info_util
348
+ import time: 9007 | 9007 | colorama
349
+ import time: 55120 | 64126 | jax._src.pretty_printer
artifacts/twin_handover_packed_parallelization_20260309/run_logs/inspect_twin_packed_batch_handover_train.log ADDED
@@ -0,0 +1,176 @@
+ config_name: pi05_twin_handover_256_packed_baseline_pytorch_2k
+ repo_id: lsnu/twin_handover_256_train
+ sample_index: 0
+ norm_stats_path: /workspace/pi05tests-openpi-multiarm/openpi/assets/pi05_twin_handover_256_packed_baseline_pytorch_2k/lsnu/twin_handover_256_train/norm_stats.json
+ norm_stats_keys: ['actions', 'state']
+ norm_stats_lengths: state_mean=16 state_std=16 action_mean=16 action_std=16
+ block_boundaries: [0:8] [8:16] [16:24] [24:32]
+ raw_state_16d_shape: (16,)
+ raw_state_16d:
+ [ 7.1883e-07 1.7515e-01 -5.6890e-06 -8.7299e-01 -6.3130e-06 1.2216e+00
+ 7.8540e-01 1.0000e+00 1.1957e-06 1.7514e-01 -9.2062e-07 -8.7312e-01
+ 1.6098e-05 1.2216e+00 7.8539e-01 1.0000e+00]
+ raw_actions_16d_shape: (16, 16)
+ raw_actions_16d:
+ [[ 2.3842e-05 -8.2493e-04 -5.7220e-05 3.9577e-04 2.8610e-05 7.8201e-04
+ -1.2398e-04 1.0000e+00 9.5367e-05 4.0293e-03 9.5367e-06 7.2479e-04
+ 1.8120e-04 -1.4305e-05 -2.2411e-04 1.0000e+00]
+ [ 5.0068e-04 -1.5645e-02 2.6083e-03 -5.5575e-02 1.8883e-03 2.5430e-02
+ -1.9326e-02 1.0000e+00 2.7800e-02 2.4877e-02 -2.7924e-02 -2.7843e-02
+ -1.6832e-02 1.0629e-02 3.8543e-02 1.0000e+00]
+ [ 1.7738e-03 -7.6041e-02 8.9645e-03 -1.7257e-01 6.0558e-03 8.7943e-02
+ -6.4831e-02 1.0000e+00 9.2287e-02 5.8761e-02 -9.3136e-02 -7.6413e-02
+ -5.3630e-02 4.2353e-02 1.2606e-01 1.0000e+00]
+ [ 3.2425e-03 -1.3747e-01 1.5845e-02 -3.1527e-01 1.0653e-02 1.6477e-01
+ -1.1840e-01 1.0000e+00 1.7036e-01 1.0629e-01 -1.7153e-01 -1.4015e-01
+ -9.7461e-02 7.8468e-02 2.3009e-01 1.0000e+00]
+ [ 5.5885e-03 -2.1545e-01 2.4767e-02 -4.6663e-01 1.6103e-02 2.4452e-01
+ -1.7446e-01 1.0000e+00 2.5305e-01 1.5107e-01 -2.5392e-01 -2.1260e-01
+ -1.4490e-01 1.1766e-01 3.4122e-01 1.0000e+00]
+ [ 6.1035e-03 -2.8390e-01 3.3288e-02 -6.1909e-01 2.1739e-02 3.2683e-01
+ -2.3199e-01 1.0000e+00 3.3677e-01 1.9970e-01 -3.3804e-01 -2.8173e-01
+ -1.9161e-01 1.5831e-01 4.5282e-01 1.0000e+00]
+ [ 9.3937e-03 -3.1736e-01 3.8815e-02 -7.2264e-01 2.9097e-02 3.8407e-01
+ -2.9788e-01 1.0000e+00 3.9431e-01 2.3764e-01 -3.9650e-01 -3.2045e-01
+ -2.2884e-01 1.8487e-01 5.3961e-01 1.0000e+00]
+ [ 1.1177e-02 -3.3051e-01 4.2367e-02 -7.4072e-01 3.5295e-02 4.0234e-01
+ -3.4810e-01 1.0000e+00 4.1353e-01 2.4687e-01 -4.1600e-01 -3.4033e-01
+ -2.4390e-01 1.9067e-01 5.7513e-01 1.0000e+00]
+ [ 1.2674e-02 -3.1841e-01 4.3559e-02 -7.5366e-01 3.7665e-02 4.1035e-01
+ -3.7488e-01 1.0000e+00 4.2095e-01 2.5672e-01 -4.2238e-01 -3.4335e-01
+ -2.4950e-01 1.9567e-01 5.8634e-01 1.0000e+00]
+ [ 1.5645e-02 -3.0324e-01 4.3592e-02 -7.4167e-01 4.2624e-02 4.1367e-01
+ -4.1199e-01 1.0000e+00 4.2353e-01 2.6254e-01 -4.2444e-01 -3.4899e-01
+ -2.5064e-01 1.9762e-01 5.8977e-01 1.0000e+00]
+ [ 1.6398e-02 -2.9560e-01 4.2553e-02 -7.3503e-01 4.5595e-02 4.1383e-01
+ -4.3354e-01 1.0000e+00 4.2382e-01 2.5776e-01 -4.2612e-01 -3.5491e-01
+ -2.5177e-01 1.9462e-01 5.9134e-01 1.0000e+00]
+ [ 2.0757e-02 -2.9058e-01 4.2739e-02 -7.3133e-01 4.6840e-02 4.1339e-01
+ -4.5310e-01 1.0000e+00 4.2468e-01 2.5057e-01 -4.2498e-01 -3.4835e-01
+ -2.5149e-01 2.0029e-01 5.9138e-01 1.0000e+00]
+ [ 2.3303e-02 -2.7753e-01 4.1437e-02 -7.2254e-01 4.8075e-02 4.1380e-01
+ -4.7155e-01 1.0000e+00 4.2468e-01 2.5254e-01 -4.2522e-01 -3.4195e-01
+ -2.5130e-01 1.9623e-01 5.9127e-01 1.0000e+00]
+ [ 2.7924e-02 -2.5505e-01 4.0684e-02 -7.0069e-01 5.3768e-02 4.1076e-01
+ -5.1048e-01 1.0000e+00 4.2446e-01 2.5574e-01 -4.2656e-01 -3.5101e-01
+ -2.5181e-01 1.9645e-01 5.9101e-01 1.0000e+00]
+ [ 3.2401e-02 -2.4053e-01 4.1451e-02 -6.8364e-01 5.6882e-02 4.1132e-01
+ -5.4158e-01 1.0000e+00 4.2435e-01 2.5109e-01 -4.2632e-01 -3.5082e-01
+ -2.5095e-01 1.9805e-01 5.9107e-01 1.0000e+00]
+ [ 3.4809e-02 -2.2431e-01 4.0565e-02 -6.7288e-01 5.6076e-02 4.0839e-01
+ -5.6400e-01 1.0000e+00 4.2504e-01 2.5486e-01 -4.2588e-01 -3.4874e-01
+ -2.5139e-01 1.9783e-01 5.9183e-01 1.0000e+00]]
+ normalized_state_16d_shape: (16,)
+ normalized_state_16d:
+ [-0.174 0.1055 -0.0061 1.0124 0.086 -0.4741 0.2016 1.0004 0.0951
+ 0.0668 0.0549 1.0086 -0.053 -0.3299 -1.0068 1.0004]
+ normalized_actions_16d_shape: (16, 16)
+ normalized_actions_16d:
+ [[-0.2378 0.0147 0.1124 0.1989 0.1562 0.1251 0.0182 1.0004 0.1108
+ 0.0624 0.0823 0.9208 0.055 -0.5935 -0.7448 1.0004]
+ [-0.2367 -0.0063 0.1178 0.1174 0.1593 0.1567 -0.0046 1.0004 0.1686
+ 0.107 0.02 0.7676 0.0127 -0.5697 -0.6371 1.0004]
+ [-0.2338 -0.092 0.1305 -0.0529 0.1664 0.2368 -0.0585 1.0004 0.303
+ 0.1794 -0.1254 0.5072 -0.0788 -0.499 -0.3941 1.0004]
+ [-0.2306 -0.1792 0.1444 -0.2606 0.1742 0.3352 -0.1219 1.0004 0.4658
+ 0.2811 -0.3003 0.1655 -0.1877 -0.4185 -0.1052 1.0004]
+ [-0.2253 -0.2898 0.1623 -0.4809 0.1834 0.4374 -0.1883 1.0004 0.6382
+ 0.3768 -0.484 -0.223 -0.3056 -0.3311 0.2034 1.0004]
+ [-0.2242 -0.3869 0.1795 -0.7028 0.193 0.5429 -0.2564 1.0004 0.8128
+ 0.4808 -0.6717 -0.5936 -0.4217 -0.2404 0.5133 1.0004]
+ [-0.2168 -0.4344 0.1906 -0.8535 0.2055 0.6163 -0.3344 1.0004 0.9328
+ 0.5619 -0.8021 -0.8012 -0.5143 -0.1812 0.7543 1.0004]
+ [-0.2129 -0.4531 0.1977 -0.8798 0.216 0.6397 -0.3939 1.0004 0.9729
+ 0.5816 -0.8455 -0.9078 -0.5517 -0.1682 0.8529 1.0004]
+ [-0.2095 -0.4359 0.2001 -0.8986 0.2201 0.6499 -0.4256 1.0004 0.9883
+ 0.6027 -0.8598 -0.924 -0.5656 -0.1571 0.8841 1.0004]
+ [-0.2029 -0.4144 0.2002 -0.8812 0.2285 0.6542 -0.4695 1.0004 0.9937
+ 0.6151 -0.8644 -0.9542 -0.5684 -0.1527 0.8936 1.0004]
+ [-0.2012 -0.4035 0.1981 -0.8715 0.2335 0.6544 -0.495 1.0004 0.9943
+ 0.6049 -0.8681 -0.986 -0.5713 -0.1594 0.8979 1.0004]
+ [-0.1915 -0.3964 0.1985 -0.8661 0.2356 0.6538 -0.5182 1.0004 0.9961
+ 0.5895 -0.8656 -0.9508 -0.5705 -0.1468 0.8981 1.0004]
+ [-0.1858 -0.3779 0.1959 -0.8533 0.2377 0.6544 -0.54 1.0004 0.9961
+ 0.5937 -0.8661 -0.9165 -0.5701 -0.1558 0.8978 1.0004]
+ [-0.1755 -0.346 0.1944 -0.8215 0.2474 0.6505 -0.5861 1.0004 0.9956
+ 0.6006 -0.8691 -0.9651 -0.5713 -0.1554 0.897 1.0004]
+ [-0.1655 -0.3254 0.1959 -0.7967 0.2527 0.6512 -0.623 1.0004 0.9954
+ 0.5907 -0.8686 -0.9641 -0.5692 -0.1518 0.8972 1.0004]
+ [-0.1601 -0.3024 0.1941 -0.7811 0.2513 0.6474 -0.6495 1.0004 0.9969
+ 0.5987 -0.8676 -0.9529 -0.5703 -0.1523 0.8993 1.0004]]
+ packed_state_32d_shape: (32,)
+ packed_state_32d:
+ [-0.174 0.1055 -0.0061 1.0124 0.086 -0.4741 0.2016 1.0004 0.
+ 0. 0. 0. 0. 0. 0. 0. 0.0951 0.0668
+ 0.0549 1.0086 -0.053 -0.3299 -1.0068 1.0004 0. 0. 0.
+ 0. 0. 0. 0. 0. ]
+ packed_actions_32d_shape: (16, 32)
+ packed_actions_32d:
+ [[-0.2378 0.0147 0.1124 0.1989 0.1562 0.1251 0.0182 1.0004 0.
+ 0. 0. 0. 0. 0. 0. 0. 0.1108 0.0624
+ 0.0823 0.9208 0.055 -0.5935 -0.7448 1.0004 0. 0. 0.
+ 0. 0. 0. 0. 0. ]
+ [-0.2367 -0.0063 0.1178 0.1174 0.1593 0.1567 -0.0046 1.0004 0.
+ 0. 0. 0. 0. 0. 0. 0. 0.1686 0.107
+ 0.02 0.7676 0.0127 -0.5697 -0.6371 1.0004 0. 0. 0.
+ 0. 0. 0. 0. 0. ]
+ [-0.2338 -0.092 0.1305 -0.0529 0.1664 0.2368 -0.0585 1.0004 0.
+ 0. 0. 0. 0. 0. 0. 0. 0.303 0.1794
+ -0.1254 0.5072 -0.0788 -0.499 -0.3941 1.0004 0. 0. 0.
+ 0. 0. 0. 0. 0. ]
+ [-0.2306 -0.1792 0.1444 -0.2606 0.1742 0.3352 -0.1219 1.0004 0.
+ 0. 0. 0. 0. 0. 0. 0. 0.4658 0.2811
+ -0.3003 0.1655 -0.1877 -0.4185 -0.1052 1.0004 0. 0. 0.
+ 0. 0. 0. 0. 0. ]
+ [-0.2253 -0.2898 0.1623 -0.4809 0.1834 0.4374 -0.1883 1.0004 0.
+ 0. 0. 0. 0. 0. 0. 0. 0.6382 0.3768
+ -0.484 -0.223 -0.3056 -0.3311 0.2034 1.0004 0. 0. 0.
+ 0. 0. 0. 0. 0. ]
+ [-0.2242 -0.3869 0.1795 -0.7028 0.193 0.5429 -0.2564 1.0004 0.
+ 0. 0. 0. 0. 0. 0. 0. 0.8128 0.4808
+ -0.6717 -0.5936 -0.4217 -0.2404 0.5133 1.0004 0. 0. 0.
+ 0. 0. 0. 0. 0. ]
+ [-0.2168 -0.4344 0.1906 -0.8535 0.2055 0.6163 -0.3344 1.0004 0.
+ 0. 0. 0. 0. 0. 0. 0. 0.9328 0.5619
+ -0.8021 -0.8012 -0.5143 -0.1812 0.7543 1.0004 0. 0. 0.
+ 0. 0. 0. 0. 0. ]
+ [-0.2129 -0.4531 0.1977 -0.8798 0.216 0.6397 -0.3939 1.0004 0.
+ 0. 0. 0. 0. 0. 0. 0. 0.9729 0.5816
+ -0.8455 -0.9078 -0.5517 -0.1682 0.8529 1.0004 0. 0. 0.
+ 0. 0. 0. 0. 0. ]
+ [-0.2095 -0.4359 0.2001 -0.8986 0.2201 0.6499 -0.4256 1.0004 0.
+ 0. 0. 0. 0. 0. 0. 0. 0.9883 0.6027
+ -0.8598 -0.924 -0.5656 -0.1571 0.8841 1.0004 0. 0. 0.
+ 0. 0. 0. 0. 0. ]
+ [-0.2029 -0.4144 0.2002 -0.8812 0.2285 0.6542 -0.4695 1.0004 0.
+ 0. 0. 0. 0. 0. 0. 0. 0.9937 0.6151
+ -0.8644 -0.9542 -0.5684 -0.1527 0.8936 1.0004 0. 0. 0.
+ 0. 0. 0. 0. 0. ]
+ [-0.2012 -0.4035 0.1981 -0.8715 0.2335 0.6544 -0.495 1.0004 0.
+ 0. 0. 0. 0. 0. 0. 0. 0.9943 0.6049
+ -0.8681 -0.986 -0.5713 -0.1594 0.8979 1.0004 0. 0. 0.
+ 0. 0. 0. 0. 0. ]
+ [-0.1915 -0.3964 0.1985 -0.8661 0.2356 0.6538 -0.5182 1.0004 0.
+ 0. 0. 0. 0. 0. 0. 0. 0.9961 0.5895
+ -0.8656 -0.9508 -0.5705 -0.1468 0.8981 1.0004 0. 0. 0.
+ 0. 0. 0. 0. 0. ]
+ [-0.1858 -0.3779 0.1959 -0.8533 0.2377 0.6544 -0.54 1.0004 0.
+ 0. 0. 0. 0. 0. 0. 0. 0.9961 0.5937
+ -0.8661 -0.9165 -0.5701 -0.1558 0.8978 1.0004 0. 0. 0.
+ 0. 0. 0. 0. 0. ]
+ [-0.1755 -0.346 0.1944 -0.8215 0.2474 0.6505 -0.5861 1.0004 0.
+ 0. 0. 0. 0. 0. 0. 0. 0.9956 0.6006
+ -0.8691 -0.9651 -0.5713 -0.1554 0.897 1.0004 0. 0. 0.
+ 0. 0. 0. 0. 0. ]
+ [-0.1655 -0.3254 0.1959 -0.7967 0.2527 0.6512 -0.623 1.0004 0.
+ 0. 0. 0. 0. 0. 0. 0. 0.9954 0.5907
+ -0.8686 -0.9641 -0.5692 -0.1518 0.8972 1.0004 0. 0. 0.
+ 0. 0. 0. 0. 0. ]
+ [-0.1601 -0.3024 0.1941 -0.7811 0.2513 0.6474 -0.6495 1.0004 0.
+ 0. 0. 0. 0. 0. 0. 0. 0.9969 0.5987
+ -0.8676 -0.9529 -0.5703 -0.1523 0.8993 1.0004 0. 0. 0.
+ 0. 0. 0. 0. 0. ]]
+ state_padded_zero_count: 16 / 16
+ actions_padded_zero_count: 256 / 256
+ state_padded_exact_zero: True
+ actions_padded_exact_zero: True
artifacts/twin_handover_packed_parallelization_20260309/run_logs/smoke_handover_packed_baseline_20.log ADDED
@@ -0,0 +1,241 @@
+ W0308 22:58:43.681000 16356 torch/distributed/run.py:766]
+ W0308 22:58:43.681000 16356 torch/distributed/run.py:766] *****************************************
+ W0308 22:58:43.681000 16356 torch/distributed/run.py:766] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
+ W0308 22:58:43.681000 16356 torch/distributed/run.py:766] *****************************************
+ 23:00:43.715 [I] Overwriting checkpoint directory: /workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/smoke_handover_packed_baseline_20 (16451:train_pytorch.py:451)
+ 23:00:43.718 [I] Created experiment checkpoint directory: /workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/smoke_handover_packed_baseline_20 (16451:train_pytorch.py:458)
+ 23:00:43.762 [I] Using batch size per GPU: 4 (total batch size across 4 GPUs: 16) (16451:train_pytorch.py:474)
+ 23:00:43.844 [I] Loaded norm stats from /workspace/pi05tests-openpi-multiarm/openpi/assets/pi05_twin_handover_256_packed_baseline_pytorch_2k/lsnu/twin_handover_256_train (16451:config.py:234)
+ 23:00:43.846 [I] data_config: DataConfig(repo_id='lsnu/twin_handover_256_train', asset_id='lsnu/twin_handover_256_train', norm_stats={'state': NormStats(mean=array([ 0.40321857, 0.17899239, -0.07588876, -2.06326795, -0.46418607,
+ 1.79356563, 0.70229131, 0.48194093, 0.93952829, 0.86693275,
+ -1.03168762, -1.9056077 , -0.53421056, 1.87584054, 2.36738205,
+ 0.91249251]), std=array([0.73344636, 0.47653052, 0.72710407, 0.42399687, 0.63613892,
+ 0.61144608, 1.11724186, 0.49967375, 0.86981195, 0.75071597,
+ 0.90787333, 0.35008711, 0.51183224, 0.36600712, 0.56947577,
+ 0.28257725]), q01=array([-1.52408956, -1.32446341, -1.91092197, -2.89885788, -1.66315554,
+ 0.59010215, -2.27611645, 0. , -1.77352981, -1.62131719,
+ -1.77092851, -2.19172778, -2.03159353, 0.55409113, 0.79255736,
+ 0. ]), q99=array([ 2.16638614, 1.38857444, 1.93436338, -0.88548369, 1.39976143,
+ 2.99162304, 2.8194857 , 0.9998 , 1.46557211, 1.74660106,
+ 1.58644652, -0.87876934, 2.25910752, 2.54628449, 2.89347284,
+ 0.9998 ])), 'actions': NormStats(mean=array([ 0.05879939, -0.00704042, -0.02719213, -0.07685276, -0.07520971,
+ -0.00498583, 0.03577602, 0.48164892, 0.06564316, 0.06023132,
+ -0.10068271, -0.09547432, -0.0526481 , 0.08205888, 0.13954687,
+ 0.88333535]), std=array([0.18337056, 0.28128958, 0.18525195, 0.29767084, 0.22944973,
+ 0.40312037, 0.3896611 , 0.49966311, 0.21938531, 0.16883859,
+ 0.20206179, 0.14864719, 0.12629333, 0.15546791, 0.23423795,
+ 0.32102022]), q01=array([-0.34140511, -0.71597991, -0.55301429, -0.8233152 , -0.68097536,
+ -0.87723451, -0.86000918, 0. , -0.53261366, -0.49289397,
+ -0.48524564, -0.35752607, -0.42426748, -0.18230745, -0.09212705,
+ 0. ]), q99=array([0.55444025, 0.69361174, 0.44115428, 0.550829 , 0.49707318,
+ 0.68353445, 0.82907713, 0.9998 , 0.42654409, 0.44255511,
+ 0.4114292 , 0.01550327, 0.38038206, 0.71452535, 0.62808441,
+ 0.9998 ]))}, repack_transforms=Group(inputs=[RepackTransform(structure={'images': {'cam_high': 'front_image', 'cam_left_wrist': 'wrist_left_image', 'cam_right_wrist': 'wrist_right_image'}, 'state': 'state', 'actions': 'action', 'prompt': 'task'})], outputs=()), data_transforms=Group(inputs=[AlohaInputs(adapt_to_pi=False)], outputs=[]), model_transforms=Group(inputs=[InjectDefaultPrompt(prompt=None), ResizeImages(height=224, width=224), TokenizePrompt(tokenizer=<openpi.models.tokenizer.PaligemmaTokenizer object at 0x702ed02c29d0>, discrete_state_input=True), PackPerArmBlocks(real_arm_dims=(8, 8), block_dims=(16, 16))], outputs=[UnpackPerArmBlocks(real_arm_dims=(8, 8), block_dims=(16, 16))]), use_quantile_norm=True, action_sequence_keys=('action',), prompt_from_task=False, rlds_data_dir=None, action_space=None, datasets=()) (16451:data_loader.py:282)
+ 23:00:43.849 [I] Using existing local LeRobot dataset mirror for lsnu/twin_handover_256_train: /workspace/lerobot/lsnu/twin_handover_256_train (16451:data_loader.py:148)
+ 23:00:43.958 [I] Overwriting checkpoint directory: /workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/smoke_handover_packed_baseline_20 (16454:train_pytorch.py:451)
+ 23:00:43.959 [I] Created experiment checkpoint directory: /workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/smoke_handover_packed_baseline_20 (16454:train_pytorch.py:458)
+ 23:00:43.959 [I] Using batch size per GPU: 4 (total batch size across 4 GPUs: 16) (16454:train_pytorch.py:474)
+ 23:00:44.046 [I] Loaded norm stats from /workspace/pi05tests-openpi-multiarm/openpi/assets/pi05_twin_handover_256_packed_baseline_pytorch_2k/lsnu/twin_handover_256_train (16454:config.py:234)
+ 23:00:44.048 [I] data_config: DataConfig(repo_id='lsnu/twin_handover_256_train', asset_id='lsnu/twin_handover_256_train', norm_stats={'state': NormStats(mean=array([ 0.40321857, 0.17899239, -0.07588876, -2.06326795, -0.46418607,
+ 1.79356563, 0.70229131, 0.48194093, 0.93952829, 0.86693275,
+ -1.03168762, -1.9056077 , -0.53421056, 1.87584054, 2.36738205,
+ 0.91249251]), std=array([0.73344636, 0.47653052, 0.72710407, 0.42399687, 0.63613892,
+ 0.61144608, 1.11724186, 0.49967375, 0.86981195, 0.75071597,
+ 0.90787333, 0.35008711, 0.51183224, 0.36600712, 0.56947577,
+ 0.28257725]), q01=array([-1.52408956, -1.32446341, -1.91092197, -2.89885788, -1.66315554,
+ 0.59010215, -2.27611645, 0. , -1.77352981, -1.62131719,
+ -1.77092851, -2.19172778, -2.03159353, 0.55409113, 0.79255736,
+ 0. ]), q99=array([ 2.16638614, 1.38857444, 1.93436338, -0.88548369, 1.39976143,
+ 2.99162304, 2.8194857 , 0.9998 , 1.46557211, 1.74660106,
+ 1.58644652, -0.87876934, 2.25910752, 2.54628449, 2.89347284,
+ 0.9998 ])), 'actions': NormStats(mean=array([ 0.05879939, -0.00704042, -0.02719213, -0.07685276, -0.07520971,
+ -0.00498583, 0.03577602, 0.48164892, 0.06564316, 0.06023132,
+ -0.10068271, -0.09547432, -0.0526481 , 0.08205888, 0.13954687,
+ 0.88333535]), std=array([0.18337056, 0.28128958, 0.18525195, 0.29767084, 0.22944973,
+ 0.40312037, 0.3896611 , 0.49966311, 0.21938531, 0.16883859,
+ 0.20206179, 0.14864719, 0.12629333, 0.15546791, 0.23423795,
+ 0.32102022]), q01=array([-0.34140511, -0.71597991, -0.55301429, -0.8233152 , -0.68097536,
+ -0.87723451, -0.86000918, 0. , -0.53261366, -0.49289397,
+ -0.48524564, -0.35752607, -0.42426748, -0.18230745, -0.09212705,
+ 0. ]), q99=array([0.55444025, 0.69361174, 0.44115428, 0.550829 , 0.49707318,
+ 0.68353445, 0.82907713, 0.9998 , 0.42654409, 0.44255511,
+ 0.4114292 , 0.01550327, 0.38038206, 0.71452535, 0.62808441,
+ 0.9998 ]))}, repack_transforms=Group(inputs=[RepackTransform(structure={'images': {'cam_high': 'front_image', 'cam_left_wrist': 'wrist_left_image', 'cam_right_wrist': 'wrist_right_image'}, 'state': 'state', 'actions': 'action', 'prompt': 'task'})], outputs=()), data_transforms=Group(inputs=[AlohaInputs(adapt_to_pi=False)], outputs=[]), model_transforms=Group(inputs=[InjectDefaultPrompt(prompt=None), ResizeImages(height=224, width=224), TokenizePrompt(tokenizer=<openpi.models.tokenizer.PaligemmaTokenizer object at 0x79acff7466d0>, discrete_state_input=True), PackPerArmBlocks(real_arm_dims=(8, 8), block_dims=(16, 16))], outputs=[UnpackPerArmBlocks(real_arm_dims=(8, 8), block_dims=(16, 16))]), use_quantile_norm=True, action_sequence_keys=('action',), prompt_from_task=False, rlds_data_dir=None, action_space=None, datasets=()) (16454:data_loader.py:282)
+ 23:00:44.049 [I] Using existing local LeRobot dataset mirror for lsnu/twin_handover_256_train: /workspace/lerobot/lsnu/twin_handover_256_train (16454:data_loader.py:148)
+ 23:00:45.456 [I] Overwriting checkpoint directory: /workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/smoke_handover_packed_baseline_20 (16452:train_pytorch.py:451)
+ 23:00:45.458 [I] Created experiment checkpoint directory: /workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/smoke_handover_packed_baseline_20 (16452:train_pytorch.py:458)
+ 23:00:45.458 [I] Using batch size per GPU: 4 (total batch size across 4 GPUs: 16) (16452:train_pytorch.py:474)
+ 23:00:45.548 [I] Loaded norm stats from /workspace/pi05tests-openpi-multiarm/openpi/assets/pi05_twin_handover_256_packed_baseline_pytorch_2k/lsnu/twin_handover_256_train (16452:config.py:234)
+ 23:00:45.549 [I] data_config: DataConfig(repo_id='lsnu/twin_handover_256_train', asset_id='lsnu/twin_handover_256_train', norm_stats={'state': NormStats(mean=array([ 0.40321857, 0.17899239, -0.07588876, -2.06326795, -0.46418607,
+ 1.79356563, 0.70229131, 0.48194093, 0.93952829, 0.86693275,
+ -1.03168762, -1.9056077 , -0.53421056, 1.87584054, 2.36738205,
+ 0.91249251]), std=array([0.73344636, 0.47653052, 0.72710407, 0.42399687, 0.63613892,
+ 0.61144608, 1.11724186, 0.49967375, 0.86981195, 0.75071597,
+ 0.90787333, 0.35008711, 0.51183224, 0.36600712, 0.56947577,
+ 0.28257725]), q01=array([-1.52408956, -1.32446341, -1.91092197, -2.89885788, -1.66315554,
+ 0.59010215, -2.27611645, 0. , -1.77352981, -1.62131719,
+ -1.77092851, -2.19172778, -2.03159353, 0.55409113, 0.79255736,
+ 0. ]), q99=array([ 2.16638614, 1.38857444, 1.93436338, -0.88548369, 1.39976143,
+ 2.99162304, 2.8194857 , 0.9998 , 1.46557211, 1.74660106,
+ 1.58644652, -0.87876934, 2.25910752, 2.54628449, 2.89347284,
+ 0.9998 ])), 'actions': NormStats(mean=array([ 0.05879939, -0.00704042, -0.02719213, -0.07685276, -0.07520971,
+ -0.00498583, 0.03577602, 0.48164892, 0.06564316, 0.06023132,
+ -0.10068271, -0.09547432, -0.0526481 , 0.08205888, 0.13954687,
+ 0.88333535]), std=array([0.18337056, 0.28128958, 0.18525195, 0.29767084, 0.22944973,
+ 0.40312037, 0.3896611 , 0.49966311, 0.21938531, 0.16883859,
+ 0.20206179, 0.14864719, 0.12629333, 0.15546791, 0.23423795,
+ 0.32102022]), q01=array([-0.34140511, -0.71597991, -0.55301429, -0.8233152 , -0.68097536,
+ -0.87723451, -0.86000918, 0. , -0.53261366, -0.49289397,
+ -0.48524564, -0.35752607, -0.42426748, -0.18230745, -0.09212705,
+ 0. ]), q99=array([0.55444025, 0.69361174, 0.44115428, 0.550829 , 0.49707318,
+ 0.68353445, 0.82907713, 0.9998 , 0.42654409, 0.44255511,
+ 0.4114292 , 0.01550327, 0.38038206, 0.71452535, 0.62808441,
+ 0.9998 ]))}, repack_transforms=Group(inputs=[RepackTransform(structure={'images': {'cam_high': 'front_image', 'cam_left_wrist': 'wrist_left_image', 'cam_right_wrist': 'wrist_right_image'}, 'state': 'state', 'actions': 'action', 'prompt': 'task'})], outputs=()), data_transforms=Group(inputs=[AlohaInputs(adapt_to_pi=False)], outputs=[]), model_transforms=Group(inputs=[InjectDefaultPrompt(prompt=None), ResizeImages(height=224, width=224), TokenizePrompt(tokenizer=<openpi.models.tokenizer.PaligemmaTokenizer object at 0x7736f700ba90>, discrete_state_input=True), PackPerArmBlocks(real_arm_dims=(8, 8), block_dims=(16, 16))], outputs=[UnpackPerArmBlocks(real_arm_dims=(8, 8), block_dims=(16, 16))]), use_quantile_norm=True, action_sequence_keys=('action',), prompt_from_task=False, rlds_data_dir=None, action_space=None, datasets=()) (16452:data_loader.py:282)
+ 23:00:45.551 [I] Using existing local LeRobot dataset mirror for lsnu/twin_handover_256_train: /workspace/lerobot/lsnu/twin_handover_256_train (16452:data_loader.py:148)
+ 23:00:45.562 [I] local_batch_size: 4 (16451:data_loader.py:363)
+ 23:00:45.861 [I] local_batch_size: 4 (16454:data_loader.py:363)
+ 23:00:47.007 [I] local_batch_size: 4 (16452:data_loader.py:363)
+ 23:00:47.287 [I] Overwriting checkpoint directory: /workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/smoke_handover_packed_baseline_20 (16453:train_pytorch.py:451)
+ 23:00:47.290 [I] Created experiment checkpoint directory: /workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/smoke_handover_packed_baseline_20 (16453:train_pytorch.py:458)
+ 23:00:47.291 [I] Using batch size per GPU: 4 (total batch size across 4 GPUs: 16) (16453:train_pytorch.py:474)
+ INFO:2026-03-08 23:00:47,419:jax._src.xla_bridge:925: Unable to initialize backend 'rocm': module 'jaxlib.xla_extension' has no attribute 'GpuAllocatorConfig'
+ 23:00:47.419 [I] Unable to initialize backend 'rocm': module 'jaxlib.xla_extension' has no attribute 'GpuAllocatorConfig' (16454:xla_bridge.py:925)
+ INFO:2026-03-08 23:00:47,435:jax._src.xla_bridge:925: Unable to initialize backend 'tpu': INTERNAL: Failed to open libtpu.so: libtpu.so: cannot open shared object file: No such file or directory
+ 23:00:47.435 [I] Unable to initialize backend 'tpu': INTERNAL: Failed to open libtpu.so: libtpu.so: cannot open shared object file: No such file or directory (16454:xla_bridge.py:925)
+ 23:00:47.437 [I] Loaded norm stats from /workspace/pi05tests-openpi-multiarm/openpi/assets/pi05_twin_handover_256_packed_baseline_pytorch_2k/lsnu/twin_handover_256_train (16453:config.py:234)
+ 23:00:47.440 [I] data_config: DataConfig(repo_id='lsnu/twin_handover_256_train', asset_id='lsnu/twin_handover_256_train', norm_stats={'state': NormStats(mean=array([ 0.40321857, 0.17899239, -0.07588876, -2.06326795, -0.46418607,
+ 1.79356563, 0.70229131, 0.48194093, 0.93952829, 0.86693275,
+ -1.03168762, -1.9056077 , -0.53421056, 1.87584054, 2.36738205,
+ 0.91249251]), std=array([0.73344636, 0.47653052, 0.72710407, 0.42399687, 0.63613892,
+ 0.61144608, 1.11724186, 0.49967375, 0.86981195, 0.75071597,
+ 0.90787333, 0.35008711, 0.51183224, 0.36600712, 0.56947577,
+ 0.28257725]), q01=array([-1.52408956, -1.32446341, -1.91092197, -2.89885788, -1.66315554,
+ 0.59010215, -2.27611645, 0. , -1.77352981, -1.62131719,
+ -1.77092851, -2.19172778, -2.03159353, 0.55409113, 0.79255736,
+ 0. ]), q99=array([ 2.16638614, 1.38857444, 1.93436338, -0.88548369, 1.39976143,
+ 2.99162304, 2.8194857 , 0.9998 , 1.46557211, 1.74660106,
+ 1.58644652, -0.87876934, 2.25910752, 2.54628449, 2.89347284,
+ 0.9998 ])), 'actions': NormStats(mean=array([ 0.05879939, -0.00704042, -0.02719213, -0.07685276, -0.07520971,
+ -0.00498583, 0.03577602, 0.48164892, 0.06564316, 0.06023132,
+ -0.10068271, -0.09547432, -0.0526481 , 0.08205888, 0.13954687,
+ 0.88333535]), std=array([0.18337056, 0.28128958, 0.18525195, 0.29767084, 0.22944973,
+ 0.40312037, 0.3896611 , 0.49966311, 0.21938531, 0.16883859,
+ 0.20206179, 0.14864719, 0.12629333, 0.15546791, 0.23423795,
+ 0.32102022]), q01=array([-0.34140511, -0.71597991, -0.55301429, -0.8233152 , -0.68097536,
+ -0.87723451, -0.86000918, 0. , -0.53261366, -0.49289397,
+ -0.48524564, -0.35752607, -0.42426748, -0.18230745, -0.09212705,
+ 0. ]), q99=array([0.55444025, 0.69361174, 0.44115428, 0.550829 , 0.49707318,
+ 0.68353445, 0.82907713, 0.9998 , 0.42654409, 0.44255511,
+ 0.4114292 , 0.01550327, 0.38038206, 0.71452535, 0.62808441,
+ 0.9998 ]))}, repack_transforms=Group(inputs=[RepackTransform(structure={'images': {'cam_high': 'front_image', 'cam_left_wrist': 'wrist_left_image', 'cam_right_wrist': 'wrist_right_image'}, 'state': 'state', 'actions': 'action', 'prompt': 'task'})], outputs=()), data_transforms=Group(inputs=[AlohaInputs(adapt_to_pi=False)], outputs=[]), model_transforms=Group(inputs=[InjectDefaultPrompt(prompt=None), ResizeImages(height=224, width=224), TokenizePrompt(tokenizer=<openpi.models.tokenizer.PaligemmaTokenizer object at 0x728778855290>, discrete_state_input=True), PackPerArmBlocks(real_arm_dims=(8, 8), block_dims=(16, 16))], outputs=[UnpackPerArmBlocks(real_arm_dims=(8, 8), block_dims=(16, 16))]), use_quantile_norm=True, action_sequence_keys=('action',), prompt_from_task=False, rlds_data_dir=None, action_space=None, datasets=()) (16453:data_loader.py:282)
+ 23:00:47.459 [I] Using existing local LeRobot dataset mirror for lsnu/twin_handover_256_train: /workspace/lerobot/lsnu/twin_handover_256_train (16453:data_loader.py:148)
+ INFO:2026-03-08 23:00:47,514:jax._src.xla_bridge:925: Unable to initialize backend 'rocm': module 'jaxlib.xla_extension' has no attribute 'GpuAllocatorConfig'
+ 23:00:47.514 [I] Unable to initialize backend 'rocm': module 'jaxlib.xla_extension' has no attribute 'GpuAllocatorConfig' (16451:xla_bridge.py:925)
+ INFO:2026-03-08 23:00:47,530:jax._src.xla_bridge:925: Unable to initialize backend 'tpu': INTERNAL: Failed to open libtpu.so: libtpu.so: cannot open shared object file: No such file or directory
+ 23:00:47.530 [I] Unable to initialize backend 'tpu': INTERNAL: Failed to open libtpu.so: libtpu.so: cannot open shared object file: No such file or directory (16451:xla_bridge.py:925)
+ INFO:2026-03-08 23:00:48,755:jax._src.xla_bridge:925: Unable to initialize backend 'rocm': module 'jaxlib.xla_extension' has no attribute 'GpuAllocatorConfig'
+ 23:00:48.755 [I] Unable to initialize backend 'rocm': module 'jaxlib.xla_extension' has no attribute 'GpuAllocatorConfig' (16452:xla_bridge.py:925)
+ INFO:2026-03-08 23:00:48,768:jax._src.xla_bridge:925: Unable to initialize backend 'tpu': INTERNAL: Failed to open libtpu.so: libtpu.so: cannot open shared object file: No such file or directory
+ 23:00:48.768 [I] Unable to initialize backend 'tpu': INTERNAL: Failed to open libtpu.so: libtpu.so: cannot open shared object file: No such file or directory (16452:xla_bridge.py:925)
+ 23:00:49.029 [I] local_batch_size: 4 (16453:data_loader.py:363)
+ INFO:2026-03-08 23:00:49,834:jax._src.xla_bridge:925: Unable to initialize backend 'rocm': module 'jaxlib.xla_extension' has no attribute 'GpuAllocatorConfig'
+ 23:00:49.834 [I] Unable to initialize backend 'rocm': module 'jaxlib.xla_extension' has no attribute 'GpuAllocatorConfig' (16453:xla_bridge.py:925)
+ INFO:2026-03-08 23:00:49,836:jax._src.xla_bridge:925: Unable to initialize backend 'tpu': INTERNAL: Failed to open libtpu.so: libtpu.so: cannot open shared object file: No such file or directory
+ 23:00:49.836 [I] Unable to initialize backend 'tpu': INTERNAL: Failed to open libtpu.so: libtpu.so: cannot open shared object file: No such file or directory (16453:xla_bridge.py:925)
+ 23:01:43.138 [I] Enabled gradient checkpointing for PI0Pytorch model (16451:pi0_pytorch.py:148)
+ 23:01:43.139 [I] Enabled gradient checkpointing for memory optimization (16451:train_pytorch.py:535)
+ 23:01:43.139 [I] Step 0 (after_model_creation): GPU memory - allocated: 7.47GB, reserved: 7.48GB, free: 0.01GB, peak_allocated: 7.47GB, peak_reserved: 7.48GB | DDP: rank=0, world_size=4 (16451:train_pytorch.py:422)
+ 23:01:43.801 [I] Enabled gradient checkpointing for PI0Pytorch model (16454:pi0_pytorch.py:148)
+ 23:01:43.802 [I] Enabled gradient checkpointing for memory optimization (16454:train_pytorch.py:535)
+ 23:01:44.623 [I] Enabled gradient checkpointing for PI0Pytorch model (16452:pi0_pytorch.py:148)
+ 23:01:44.623 [I] Enabled gradient checkpointing for memory optimization (16452:train_pytorch.py:535)
+ 23:01:45.354 [I] Enabled gradient checkpointing for PI0Pytorch model (16453:pi0_pytorch.py:148)
+ 23:01:45.354 [I] Enabled gradient checkpointing for memory optimization (16453:train_pytorch.py:535)
+ 23:01:46.643 [I] Loading weights from: /workspace/checkpoints/pi05_base_single_pytorch (16451:train_pytorch.py:564)
+ 23:01:46.648 [I] Loading weights from: /workspace/checkpoints/pi05_base_single_pytorch (16454:train_pytorch.py:564)
156
+ 23:01:46.648 [I] Loading weights from: /workspace/checkpoints/pi05_base_single_pytorch (16453:train_pytorch.py:564)
157
+ 23:01:46.648 [I] Loading weights from: /workspace/checkpoints/pi05_base_single_pytorch (16452:train_pytorch.py:564)
158
+ 23:01:48.714 [I] Weight loading missing key count: 0 (16451:train_pytorch.py:572)
159
+ 23:01:48.714 [I] Weight loading missing keys: set() (16451:train_pytorch.py:573)
160
+ 23:01:48.715 [I] Weight loading unexpected key count: 0 (16451:train_pytorch.py:574)
161
+ 23:01:48.715 [I] Weight loading unexpected keys: [] (16451:train_pytorch.py:575)
162
+ 23:01:48.715 [I] Loaded PyTorch weights from /workspace/checkpoints/pi05_base_single_pytorch (16451:train_pytorch.py:576)
163
+ 23:01:48.722 [I] Running on: 9e9e564d5d6e | world_size=4 (16451:train_pytorch.py:616)
164
+ 23:01:48.722 [I] Training config: batch_size=16, effective_batch_size=4, num_train_steps=20 (16451:train_pytorch.py:617)
165
+ 23:01:48.723 [I] Memory optimizations: gradient_checkpointing=True (16451:train_pytorch.py:620)
166
+ 23:01:48.724 [I] LR schedule: warmup=200, peak_lr=2.50e-05, decay_steps=2000, end_lr=2.50e-06 (16451:train_pytorch.py:621)
167
+ 23:01:48.724 [I] Optimizer: AdamW, weight_decay=1e-10, clip_norm=1.0 (16451:train_pytorch.py:624)
168
+ 23:01:48.724 [I] EMA is not supported for PyTorch training (16451:train_pytorch.py:627)
169
+ 23:01:48.725 [I] Training precision: bfloat16 (16451:train_pytorch.py:628)
170
+ 23:01:48.733 [I] Resolved config name: pi05_twin_handover_256_packed_baseline_pytorch_2k (16451:train_pytorch.py:234)
171
+ 23:01:48.733 [I] Dataset repo_id: lsnu/twin_handover_256_train (16451:train_pytorch.py:235)
172
+ 23:01:48.733 [I] Norm-stats file path: /workspace/pi05tests-openpi-multiarm/openpi/assets/pi05_twin_handover_256_packed_baseline_pytorch_2k/lsnu/twin_handover_256_train/norm_stats.json (16451:train_pytorch.py:236)
173
+ 23:01:48.734 [I] Norm-stats summary: {'keys': ['actions', 'state'], 'state_mean_len': 16, 'state_std_len': 16, 'actions_mean_len': 16, 'actions_std_len': 16} (16451:train_pytorch.py:237)
174
+ 23:01:48.734 [I] Checkpoint source path: /workspace/checkpoints/pi05_base_single_pytorch (16451:train_pytorch.py:238)
175
+ 23:01:48.734 [I] Model type: baseline (16451:train_pytorch.py:239)
176
+ 23:01:48.734 [I] Packed transforms active: True (16451:train_pytorch.py:240)
177
+ 23:01:48.734 [I] World size: 4 (16451:train_pytorch.py:241)
178
+ 23:01:48.735 [I] Batch size: local=4, global=16 (16451:train_pytorch.py:242)
179
+ 23:01:48.735 [I] num_workers: 8 (16451:train_pytorch.py:243)
180
+ 23:01:48.735 [I] Precision: bfloat16 (16451:train_pytorch.py:244)
181
+ 23:01:48.735 [I] LR schedule summary: warmup_steps=200, peak_lr=2.50e-05, decay_steps=2000, decay_lr=2.50e-06 (16451:train_pytorch.py:245)
182
+ 23:01:48.736 [I] Save/log intervals: save_interval=250, log_interval=10 (16451:train_pytorch.py:252)
183
+ 23:01:48.736 [I] Action-loss mask: (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0) (16451:train_pytorch.py:253)
184
+ 23:01:48.736 [I] Active mask dims: [0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23] (16451:train_pytorch.py:254)
185
+ 23:01:48.736 [I] Masked dims: [8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31] (16451:train_pytorch.py:255)
186
+
187
+ 23:01:48.822 [I] Weight loading missing key count: 0 (16453:train_pytorch.py:572)
188
+ 23:01:48.822 [I] Weight loading missing keys: set() (16454:train_pytorch.py:573)
189
+ 23:01:48.823 [I] Weight loading missing keys: set() (16453:train_pytorch.py:573)
190
+ 23:01:48.823 [I] Weight loading unexpected key count: 0 (16454:train_pytorch.py:574)
191
+ 23:01:48.823 [I] Weight loading missing key count: 0 (16452:train_pytorch.py:572)
192
+ 23:01:48.823 [I] Weight loading unexpected key count: 0 (16453:train_pytorch.py:574)
193
+ 23:01:48.823 [I] Weight loading unexpected keys: [] (16454:train_pytorch.py:575)
194
+ 23:01:48.823 [I] Weight loading missing keys: set() (16452:train_pytorch.py:573)
195
+ 23:01:48.824 [I] Weight loading unexpected keys: [] (16453:train_pytorch.py:575)
196
+ 23:01:48.824 [I] Loaded PyTorch weights from /workspace/checkpoints/pi05_base_single_pytorch (16454:train_pytorch.py:576)
197
+ 23:01:48.824 [I] Weight loading unexpected key count: 0 (16452:train_pytorch.py:574)
198
+ 23:01:48.824 [I] Loaded PyTorch weights from /workspace/checkpoints/pi05_base_single_pytorch (16453:train_pytorch.py:576)
199
+ 23:01:48.825 [I] Weight loading unexpected keys: [] (16452:train_pytorch.py:575)
200
+ 23:01:48.825 [I] Loaded PyTorch weights from /workspace/checkpoints/pi05_base_single_pytorch (16452:train_pytorch.py:576)
201
+ W0308 23:06:44.622000 16356 torch/distributed/elastic/agent/server/api.py:719] Received 15 death signal, shutting down workers
202
+ W0308 23:06:44.645000 16356 torch/distributed/elastic/multiprocessing/api.py:900] Sending process 16451 closing signal SIGTERM
203
+ W0308 23:06:44.659000 16356 torch/distributed/elastic/multiprocessing/api.py:900] Sending process 16452 closing signal SIGTERM
204
+ W0308 23:06:44.679000 16356 torch/distributed/elastic/multiprocessing/api.py:900] Sending process 16453 closing signal SIGTERM
205
+ W0308 23:06:44.728000 16356 torch/distributed/elastic/multiprocessing/api.py:900] Sending process 16454 closing signal SIGTERM
206
+ /usr/lib/python3.11/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 14 leaked semaphore objects to clean up at shutdown
207
+ warnings.warn('resource_tracker: There appear to be %d '
208
+ /usr/lib/python3.11/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 14 leaked semaphore objects to clean up at shutdown
209
+ warnings.warn('resource_tracker: There appear to be %d '
210
+ /usr/lib/python3.11/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 14 leaked semaphore objects to clean up at shutdown
211
+ warnings.warn('resource_tracker: There appear to be %d '
212
+ /usr/lib/python3.11/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 14 leaked semaphore objects to clean up at shutdown
213
+ warnings.warn('resource_tracker: There appear to be %d '
214
+ Traceback (most recent call last):
215
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/bin/torchrun", line 10, in <module>
216
+ sys.exit(main())
217
+ ^^^^^^
218
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
219
+ return f(*args, **kwargs)
220
+ ^^^^^^^^^^^^^^^^^^
221
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/run.py", line 892, in main
222
+ run(args)
223
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/run.py", line 883, in run
224
+ elastic_launch(
225
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 139, in __call__
226
+ return launch_agent(self._config, self._entrypoint, list(args))
227
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
228
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 261, in launch_agent
229
+ result = agent.run()
230
+ ^^^^^^^^^^^
231
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/elastic/metrics/api.py", line 138, in wrapper
232
+ result = f(*args, **kwargs)
233
+ ^^^^^^^^^^^^^^^^^^
234
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/elastic/agent/server/api.py", line 711, in run
235
+ result = self._invoke_run(role)
236
+ ^^^^^^^^^^^^^^^^^^^^^^
237
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/elastic/agent/server/api.py", line 870, in _invoke_run
238
+ time.sleep(monitor_interval)
239
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 84, in _terminate_process_handler
240
+ raise SignalException(f"Process {os.getpid()} got signal: {sigval}", sigval=sigval)
241
+ torch.distributed.elastic.multiprocessing.api.SignalException: Process 16356 got signal: 15
artifacts/twin_handover_packed_parallelization_20260309/run_logs/smoke_handover_packed_baseline_20b.log ADDED
File without changes
artifacts/twin_handover_packed_parallelization_20260309/run_logs/smoke_handover_packed_baseline_20d.log ADDED
@@ -0,0 +1,34 @@
1
+ W0308 23:09:45.070000 19958 torch/distributed/run.py:766]
2
+ W0308 23:09:45.070000 19958 torch/distributed/run.py:766] *****************************************
3
+ W0308 23:09:45.070000 19958 torch/distributed/run.py:766] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
4
+ W0308 23:09:45.070000 19958 torch/distributed/run.py:766] *****************************************
5
+ W0308 23:12:25.090000 19958 torch/distributed/elastic/agent/server/api.py:719] Received 15 death signal, shutting down workers
6
+ W0308 23:12:25.147000 19958 torch/distributed/elastic/multiprocessing/api.py:900] Sending process 20051 closing signal SIGTERM
7
+ Traceback (most recent call last):
8
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/bin/torchrun", line 10, in <module>
9
+ sys.exit(main())
10
+ ^^^^^^
11
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
12
+ return f(*args, **kwargs)
13
+ ^^^^^^^^^^^^^^^^^^
14
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/run.py", line 892, in main
15
+ run(args)
16
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/run.py", line 883, in run
17
+ elastic_launch(
18
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 139, in __call__
19
+ return launch_agent(self._config, self._entrypoint, list(args))
20
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 261, in launch_agent
22
+ result = agent.run()
23
+ ^^^^^^^^^^^
24
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/elastic/metrics/api.py", line 138, in wrapper
25
+ result = f(*args, **kwargs)
26
+ ^^^^^^^^^^^^^^^^^^
27
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/elastic/agent/server/api.py", line 711, in run
28
+ result = self._invoke_run(role)
29
+ ^^^^^^^^^^^^^^^^^^^^^^
30
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/elastic/agent/server/api.py", line 870, in _invoke_run
31
+ time.sleep(monitor_interval)
32
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 84, in _terminate_process_handler
33
+ raise SignalException(f"Process {os.getpid()} got signal: {sigval}", sigval=sigval)
34
+ torch.distributed.elastic.multiprocessing.api.SignalException: Process 19958 got signal: 15
artifacts/twin_handover_packed_parallelization_20260309/run_logs/smoke_handover_packed_baseline_20e.log ADDED
@@ -0,0 +1,34 @@
1
+ W0308 23:13:16.278000 20146 torch/distributed/run.py:766]
2
+ W0308 23:13:16.278000 20146 torch/distributed/run.py:766] *****************************************
3
+ W0308 23:13:16.278000 20146 torch/distributed/run.py:766] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
4
+ W0308 23:13:16.278000 20146 torch/distributed/run.py:766] *****************************************
5
+ W0308 23:15:58.203000 20146 torch/distributed/elastic/agent/server/api.py:719] Received 15 death signal, shutting down workers
6
+ W0308 23:15:58.263000 20146 torch/distributed/elastic/multiprocessing/api.py:900] Sending process 20244 closing signal SIGTERM
7
+ Traceback (most recent call last):
8
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/bin/torchrun", line 10, in <module>
9
+ sys.exit(main())
10
+ ^^^^^^
11
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
12
+ return f(*args, **kwargs)
13
+ ^^^^^^^^^^^^^^^^^^
14
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/run.py", line 892, in main
15
+ run(args)
16
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/run.py", line 883, in run
17
+ elastic_launch(
18
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 139, in __call__
19
+ return launch_agent(self._config, self._entrypoint, list(args))
20
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 261, in launch_agent
22
+ result = agent.run()
23
+ ^^^^^^^^^^^
24
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/elastic/metrics/api.py", line 138, in wrapper
25
+ result = f(*args, **kwargs)
26
+ ^^^^^^^^^^^^^^^^^^
27
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/elastic/agent/server/api.py", line 711, in run
28
+ result = self._invoke_run(role)
29
+ ^^^^^^^^^^^^^^^^^^^^^^
30
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/elastic/agent/server/api.py", line 870, in _invoke_run
31
+ time.sleep(monitor_interval)
32
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 84, in _terminate_process_handler
33
+ raise SignalException(f"Process {os.getpid()} got signal: {sigval}", sigval=sigval)
34
+ torch.distributed.elastic.multiprocessing.api.SignalException: Process 20146 got signal: 15
artifacts/twin_handover_packed_parallelization_20260309/run_logs/smoke_handover_packed_baseline_20k.log ADDED
@@ -0,0 +1,234 @@
1
+ W0308 23:45:59.171000 25558 torch/distributed/run.py:766]
2
+ W0308 23:45:59.171000 25558 torch/distributed/run.py:766] *****************************************
3
+ W0308 23:45:59.171000 25558 torch/distributed/run.py:766] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
4
+ W0308 23:45:59.171000 25558 torch/distributed/run.py:766] *****************************************
5
+ /workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:4631: UserWarning: No device id is provided via `init_process_group` or `barrier `. Using the current device set by the user.
6
+ warnings.warn( # warn only once
7
+ [rank1]:[W308 23:48:06.218806836 ProcessGroupNCCL.cpp:4718] [PG ID 0 PG GUID 0 Rank 1] using GPU 1 as device used by this process is currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. You can specify device_id in init_process_group() to force use of a particular device.
8
+ /workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:4631: UserWarning: No device id is provided via `init_process_group` or `barrier `. Using the current device set by the user.
9
+ warnings.warn( # warn only once
10
+ [rank3]:[W308 23:48:09.583585113 ProcessGroupNCCL.cpp:4718] [PG ID 0 PG GUID 0 Rank 3] using GPU 3 as device used by this process is currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. You can specify device_id in init_process_group() to force use of a particular device.
11
+ 23:48:18.157 [I] Created experiment checkpoint directory: /workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/smoke_handover_packed_baseline_20k (25643:train_pytorch.py:478)
12
+ /workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:4631: UserWarning: No device id is provided via `init_process_group` or `barrier `. Using the current device set by the user.
13
+ warnings.warn( # warn only once
14
+ [rank0]:[W308 23:48:18.631390841 ProcessGroupNCCL.cpp:4718] [PG ID 0 PG GUID 0 Rank 0] using GPU 0 as device used by this process is currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. You can specify device_id in init_process_group() to force use of a particular device.
15
+ /workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:4631: UserWarning: No device id is provided via `init_process_group` or `barrier `. Using the current device set by the user.
16
+ warnings.warn( # warn only once
17
+ [rank2]:[W308 23:48:20.490054230 ProcessGroupNCCL.cpp:4718] [PG ID 0 PG GUID 0 Rank 2] using GPU 2 as device used by this process is currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. You can specify device_id in init_process_group() to force use of a particular device.
18
+ 23:48:21.532 [I] Using batch size per GPU: 4 (total batch size across 4 GPUs: 16) (25643:train_pytorch.py:497)
19
+ 23:48:21.656 [I] Loaded norm stats from /workspace/pi05tests-openpi-multiarm/openpi/assets/pi05_twin_handover_256_packed_baseline_pytorch_2k/lsnu/twin_handover_256_train (25643:config.py:234)
20
+ 23:48:21.658 [I] data_config: DataConfig(repo_id='lsnu/twin_handover_256_train', asset_id='lsnu/twin_handover_256_train', norm_stats={'state': NormStats(mean=array([ 0.40321857, 0.17899239, -0.07588876, -2.06326795, -0.46418607,
21
+ 1.79356563, 0.70229131, 0.48194093, 0.93952829, 0.86693275,
22
+ -1.03168762, -1.9056077 , -0.53421056, 1.87584054, 2.36738205,
23
+ 0.91249251]), std=array([0.73344636, 0.47653052, 0.72710407, 0.42399687, 0.63613892,
24
+ 0.61144608, 1.11724186, 0.49967375, 0.86981195, 0.75071597,
25
+ 0.90787333, 0.35008711, 0.51183224, 0.36600712, 0.56947577,
26
+ 0.28257725]), q01=array([-1.52408956, -1.32446341, -1.91092197, -2.89885788, -1.66315554,
27
+ 0.59010215, -2.27611645, 0. , -1.77352981, -1.62131719,
28
+ -1.77092851, -2.19172778, -2.03159353, 0.55409113, 0.79255736,
29
+ 0. ]), q99=array([ 2.16638614, 1.38857444, 1.93436338, -0.88548369, 1.39976143,
30
+ 2.99162304, 2.8194857 , 0.9998 , 1.46557211, 1.74660106,
31
+ 1.58644652, -0.87876934, 2.25910752, 2.54628449, 2.89347284,
32
+ 0.9998 ])), 'actions': NormStats(mean=array([ 0.05879939, -0.00704042, -0.02719213, -0.07685276, -0.07520971,
33
+ -0.00498583, 0.03577602, 0.48164892, 0.06564316, 0.06023132,
34
+ -0.10068271, -0.09547432, -0.0526481 , 0.08205888, 0.13954687,
35
+ 0.88333535]), std=array([0.18337056, 0.28128958, 0.18525195, 0.29767084, 0.22944973,
36
+ 0.40312037, 0.3896611 , 0.49966311, 0.21938531, 0.16883859,
37
+ 0.20206179, 0.14864719, 0.12629333, 0.15546791, 0.23423795,
38
+ 0.32102022]), q01=array([-0.34140511, -0.71597991, -0.55301429, -0.8233152 , -0.68097536,
39
+ -0.87723451, -0.86000918, 0. , -0.53261366, -0.49289397,
40
+ -0.48524564, -0.35752607, -0.42426748, -0.18230745, -0.09212705,
41
+ 0. ]), q99=array([0.55444025, 0.69361174, 0.44115428, 0.550829 , 0.49707318,
42
+ 0.68353445, 0.82907713, 0.9998 , 0.42654409, 0.44255511,
43
+ 0.4114292 , 0.01550327, 0.38038206, 0.71452535, 0.62808441,
44
+ 0.9998 ]))}, repack_transforms=Group(inputs=[RepackTransform(structure={'images': {'cam_high': 'front_image', 'cam_left_wrist': 'wrist_left_image', 'cam_right_wrist': 'wrist_right_image'}, 'state': 'state', 'actions': 'action', 'prompt': 'task'})], outputs=()), data_transforms=Group(inputs=[AlohaInputs(adapt_to_pi=False)], outputs=[]), model_transforms=Group(inputs=[InjectDefaultPrompt(prompt=None), ResizeImages(height=224, width=224), TokenizePrompt(tokenizer=<openpi.models.tokenizer.PaligemmaTokenizer object at 0x7ded44f10710>, discrete_state_input=True), PackPerArmBlocks(real_arm_dims=(8, 8), block_dims=(16, 16))], outputs=[UnpackPerArmBlocks(real_arm_dims=(8, 8), block_dims=(16, 16))]), use_quantile_norm=True, action_sequence_keys=('action',), prompt_from_task=False, rlds_data_dir=None, action_space=None, datasets=()) (25643:data_loader.py:283)
45
+ 23:48:21.665 [I] Using existing local LeRobot dataset mirror for lsnu/twin_handover_256_train: /workspace/lerobot/lsnu/twin_handover_256_train (25643:data_loader.py:149)
46
+ 23:48:27.988 [I] local_batch_size: 4 (25643:data_loader.py:364)
47
+ /workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:4631: UserWarning: No device id is provided via `init_process_group` or `barrier `. Using the current device set by the user.
48
+ warnings.warn( # warn only once
49
+ 23:50:52.339 [I] Enabled gradient checkpointing for PI0Pytorch model (25643:pi0_pytorch.py:150)
50
+ 23:50:52.344 [I] Enabled gradient checkpointing for memory optimization (25643:train_pytorch.py:569)
51
+ 23:50:52.345 [I] Step 0 (after_model_creation): GPU memory - allocated: 7.47GB, reserved: 7.48GB, free: 0.01GB, peak_allocated: 7.47GB, peak_reserved: 7.48GB | DDP: rank=0, world_size=4 (25643:train_pytorch.py:438)
52
+ 23:51:03.555 [I] Loading weights from: /workspace/checkpoints/pi05_base_single_pytorch (25643:train_pytorch.py:598)
53
+ 23:51:05.643 [I] Weight loading missing key count: 0 (25643:train_pytorch.py:606)
54
+ 23:51:05.643 [I] Weight loading missing keys: set() (25643:train_pytorch.py:607)
55
+ 23:51:05.643 [I] Weight loading unexpected key count: 0 (25643:train_pytorch.py:608)
56
+ 23:51:05.644 [I] Weight loading unexpected keys: [] (25643:train_pytorch.py:609)
57
+ 23:51:05.644 [I] Loaded PyTorch weights from /workspace/checkpoints/pi05_base_single_pytorch (25643:train_pytorch.py:610)
58
+ 23:51:05.647 [I] Running on: 9e9e564d5d6e | world_size=4 (25643:train_pytorch.py:650)
59
+ 23:51:05.648 [I] Training config: batch_size=16, effective_batch_size=4, num_train_steps=20 (25643:train_pytorch.py:651)
60
+ 23:51:05.648 [I] Memory optimizations: gradient_checkpointing=True (25643:train_pytorch.py:654)
61
+ 23:51:05.648 [I] LR schedule: warmup=200, peak_lr=2.50e-05, decay_steps=2000, end_lr=2.50e-06 (25643:train_pytorch.py:655)
62
+ 23:51:05.649 [I] Optimizer: AdamW, weight_decay=1e-10, clip_norm=1.0 (25643:train_pytorch.py:658)
63
+ 23:51:05.649 [I] EMA is not supported for PyTorch training (25643:train_pytorch.py:661)
64
+ 23:51:05.650 [I] Training precision: bfloat16 (25643:train_pytorch.py:662)
65
+ 23:51:05.671 [I] Resolved config name: pi05_twin_handover_256_packed_baseline_pytorch_2k (25643:train_pytorch.py:249)
66
+ 23:51:05.671 [I] Dataset repo_id: lsnu/twin_handover_256_train (25643:train_pytorch.py:250)
67
+ 23:51:05.672 [I] Norm-stats file path: /workspace/pi05tests-openpi-multiarm/openpi/assets/pi05_twin_handover_256_packed_baseline_pytorch_2k/lsnu/twin_handover_256_train/norm_stats.json (25643:train_pytorch.py:251)
68
+ 23:51:05.672 [I] Norm-stats summary: {'keys': ['actions', 'state'], 'state_mean_len': 16, 'state_std_len': 16, 'actions_mean_len': 16, 'actions_std_len': 16} (25643:train_pytorch.py:252)
69
+ 23:51:05.673 [I] Checkpoint source path: /workspace/checkpoints/pi05_base_single_pytorch (25643:train_pytorch.py:253)
70
+ 23:51:05.673 [I] Model type: baseline (25643:train_pytorch.py:254)
71
+ 23:51:05.674 [I] Packed transforms active: True (25643:train_pytorch.py:255)
72
+ 23:51:05.674 [I] World size: 4 (25643:train_pytorch.py:256)
73
+ 23:51:05.674 [I] Batch size: local=4, global=16 (25643:train_pytorch.py:257)
74
+ 23:51:05.674 [I] num_workers: 8 (25643:train_pytorch.py:258)
75
+ 23:51:05.675 [I] Precision: bfloat16 (25643:train_pytorch.py:259)
76
+ 23:51:05.675 [I] LR schedule summary: warmup_steps=200, peak_lr=2.50e-05, decay_steps=2000, decay_lr=2.50e-06 (25643:train_pytorch.py:260)
77
+ 23:51:05.676 [I] Save/log intervals: save_interval=250, log_interval=10 (25643:train_pytorch.py:267)
78
+ 23:51:05.676 [I] Action-loss mask: (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0) (25643:train_pytorch.py:268)
79
+ 23:51:05.676 [I] Active mask dims: [0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23] (25643:train_pytorch.py:269)
80
+ 23:51:05.677 [I] Masked dims: [8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31] (25643:train_pytorch.py:270)
81
+
82
+ self.pid = os.fork()
83
+ /usr/lib/python3.11/multiprocessing/popen_fork.py:66: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.
84
+ self.pid = os.fork()
85
+ /usr/lib/python3.11/multiprocessing/popen_fork.py:66: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.
86
+ self.pid = os.fork()
87
+ /usr/lib/python3.11/multiprocessing/popen_fork.py:66: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.
88
+ self.pid = os.fork()
89
+ 23:51:12.079 [I] debug_step=1 observation.state shape=(4, 32) dtype=torch.float64 actions shape=(4, 16, 32) dtype=torch.float32 (25643:train_pytorch.py:762)
90
+ 23:51:12.080 [I] debug_step=1 image_keys=['base_0_rgb', 'left_wrist_0_rgb', 'right_wrist_0_rgb'] image_shapes={'base_0_rgb': (4, 3, 224, 224), 'left_wrist_0_rgb': (4, 3, 224, 224), 'right_wrist_0_rgb': (4, 3, 224, 224)} (25643:train_pytorch.py:766)
91
+ 23:51:12.080 [I] debug_step=1 prompt_token_lengths=[74, 72, 76, 78] (25643:train_pytorch.py:769)
92
+ 23:51:12.080 [I] debug_step=1 state_stats min=-1.0000 max=1.0004 mean=0.0715 std=0.4362 (25643:train_pytorch.py:770)
93
+ 23:51:12.080 [I] debug_step=1 action_stats min=-1.0000 max=1.0947 mean=0.0331 std=0.4134 (25643:train_pytorch.py:773)
94
+ 23:51:12.092 [I] debug_step=1 state_nonzero_counts_8d_blocks=[32, 0, 32, 0] action_nonzero_counts_8d_blocks=[512, 0, 512, 0] (25643:train_pytorch.py:776)
95
+ 23:51:12.221 [I] debug_step=1 masked_dims=[8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31] active_dims=[0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23] masked_zero_counts state=64 actions=1024 (25643:train_pytorch.py:780)
96
+ 23:51:12.222 [I] debug_step=1 lr=1.24e-07 grad_norm=6.6952 data_time=2.5702s step_time=3.8197s gpu_mem_allocated=28.49GB gpu_mem_reserved=35.24GB gpu_mem_max_allocated=35.23GB gpu_mem_max_reserved=35.24GB (25643:train_pytorch.py:785)
+ 
+ [rank3]: File "/workspace/pi05tests-openpi-multiarm/openpi/scripts/train_pytorch.py", line 862, in <module>
+ [rank3]: main()
+ [rank3]: File "/workspace/pi05tests-openpi-multiarm/openpi/scripts/train_pytorch.py", line 858, in main
+ [rank3]: train_loop(config)
+ [rank3]: File "/workspace/pi05tests-openpi-multiarm/openpi/scripts/train_pytorch.py", line 703, in train_loop
+ [rank3]: losses = model(observation, actions)
+ [rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ [rank3]: File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
+ [rank3]: return self._call_impl(*args, **kwargs)
+ [rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ [rank3]: File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
+ [rank3]: return forward_call(*args, **kwargs)
+ [rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ [rank3]: File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/nn/parallel/distributed.py", line 1633, in forward
+ [rank3]: inputs, kwargs = self._pre_forward(*inputs, **kwargs)
+ [rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ [rank3]: File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/nn/parallel/distributed.py", line 1522, in _pre_forward
+ [rank3]: if torch.is_grad_enabled() and self.reducer._rebuild_buckets():
+ [rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ [rank3]: RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by passing the keyword argument `find_unused_parameters=True` to `torch.nn.parallel.DistributedDataParallel`, and by
+ [rank3]: making sure all `forward` function outputs participate in calculating loss.
+ [rank3]: If you already have done the above, then the distributed data parallel module wasn't able to locate the output tensors in the return value of your module's `forward` function. Please include the loss function and the structure of the return value of `forward` of your module when reporting this issue (e.g. list, dict, iterable).
+ [rank3]: Parameter indices which did not receive grad for rank 3: 596 597 598 599 601 602 803
+ [rank3]: In addition, you can set the environment variable TORCH_DISTRIBUTED_DEBUG to either INFO or DETAIL to print out information about which particular parameters did not receive gradient on this rank as part of this error
+ [rank1]: Traceback (most recent call last):
+ [rank1]: File "/workspace/pi05tests-openpi-multiarm/openpi/scripts/train_pytorch.py", line 862, in <module>
+ [rank1]: main()
+ [rank1]: File "/workspace/pi05tests-openpi-multiarm/openpi/scripts/train_pytorch.py", line 858, in main
+ [rank1]: train_loop(config)
+ [rank1]: File "/workspace/pi05tests-openpi-multiarm/openpi/scripts/train_pytorch.py", line 703, in train_loop
+ [rank1]: losses = model(observation, actions)
+ [rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ [rank1]: File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
+ [rank1]: return self._call_impl(*args, **kwargs)
+ [rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ [rank1]: File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
+ [rank1]: return forward_call(*args, **kwargs)
+ [rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ [rank1]: File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/nn/parallel/distributed.py", line 1633, in forward
+ [rank1]: inputs, kwargs = self._pre_forward(*inputs, **kwargs)
+ [rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ [rank1]: File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/nn/parallel/distributed.py", line 1522, in _pre_forward
+ [rank1]: if torch.is_grad_enabled() and self.reducer._rebuild_buckets():
+ [rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ [rank1]: RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by passing the keyword argument `find_unused_parameters=True` to `torch.nn.parallel.DistributedDataParallel`, and by
+ [rank1]: making sure all `forward` function outputs participate in calculating loss.
+ [rank1]: If you already have done the above, then the distributed data parallel module wasn't able to locate the output tensors in the return value of your module's `forward` function. Please include the loss function and the structure of the return value of `forward` of your module when reporting this issue (e.g. list, dict, iterable).
+ [rank1]: Parameter indices which did not receive grad for rank 1: 596 597 598 599 601 602 803
+ [rank1]: In addition, you can set the environment variable TORCH_DISTRIBUTED_DEBUG to either INFO or DETAIL to print out information about which particular parameters did not receive gradient on this rank as part of this error
+ [rank2]: Traceback (most recent call last):
+ [rank2]: File "/workspace/pi05tests-openpi-multiarm/openpi/scripts/train_pytorch.py", line 862, in <module>
+ [rank2]: main()
+ [rank2]: File "/workspace/pi05tests-openpi-multiarm/openpi/scripts/train_pytorch.py", line 858, in main
+ [rank2]: train_loop(config)
+ [rank2]: File "/workspace/pi05tests-openpi-multiarm/openpi/scripts/train_pytorch.py", line 703, in train_loop
+ [rank2]: losses = model(observation, actions)
+ [rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ [rank2]: File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
+ [rank2]: return self._call_impl(*args, **kwargs)
+ [rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ [rank2]: File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
+ [rank2]: return forward_call(*args, **kwargs)
+ [rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ [rank2]: File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/nn/parallel/distributed.py", line 1633, in forward
+ [rank2]: inputs, kwargs = self._pre_forward(*inputs, **kwargs)
+ [rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ [rank2]: File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/nn/parallel/distributed.py", line 1522, in _pre_forward
+ [rank2]: if torch.is_grad_enabled() and self.reducer._rebuild_buckets():
+ [rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ [rank2]: RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by passing the keyword argument `find_unused_parameters=True` to `torch.nn.parallel.DistributedDataParallel`, and by
+ [rank2]: making sure all `forward` function outputs participate in calculating loss.
+ [rank2]: If you already have done the above, then the distributed data parallel module wasn't able to locate the output tensors in the return value of your module's `forward` function. Please include the loss function and the structure of the return value of `forward` of your module when reporting this issue (e.g. list, dict, iterable).
+ [rank2]: Parameter indices which did not receive grad for rank 2: 596 597 598 599 601 602 803
+ [rank2]: In addition, you can set the environment variable TORCH_DISTRIBUTED_DEBUG to either INFO or DETAIL to print out information about which particular parameters did not receive gradient on this rank as part of this error
+ [rank0]: Traceback (most recent call last):
+ [rank0]: File "/workspace/pi05tests-openpi-multiarm/openpi/scripts/train_pytorch.py", line 862, in <module>
+ [rank0]: main()
+ [rank0]: File "/workspace/pi05tests-openpi-multiarm/openpi/scripts/train_pytorch.py", line 858, in main
+ [rank0]: train_loop(config)
+ [rank0]: File "/workspace/pi05tests-openpi-multiarm/openpi/scripts/train_pytorch.py", line 703, in train_loop
+ [rank0]: losses = model(observation, actions)
+ [rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ [rank0]: File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
+ [rank0]: return self._call_impl(*args, **kwargs)
+ [rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ [rank0]: File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
+ [rank0]: return forward_call(*args, **kwargs)
+ [rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ [rank0]: File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/nn/parallel/distributed.py", line 1633, in forward
+ [rank0]: inputs, kwargs = self._pre_forward(*inputs, **kwargs)
+ [rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ [rank0]: File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/nn/parallel/distributed.py", line 1522, in _pre_forward
+ [rank0]: if torch.is_grad_enabled() and self.reducer._rebuild_buckets():
+ [rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ [rank0]: RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by passing the keyword argument `find_unused_parameters=True` to `torch.nn.parallel.DistributedDataParallel`, and by
+ [rank0]: making sure all `forward` function outputs participate in calculating loss.
+ [rank0]: If you already have done the above, then the distributed data parallel module wasn't able to locate the output tensors in the return value of your module's `forward` function. Please include the loss function and the structure of the return value of `forward` of your module when reporting this issue (e.g. list, dict, iterable).
+ [rank0]: Parameter indices which did not receive grad for rank 0: 596 597 598 599 601 602 803
+ [rank0]: In addition, you can set the environment variable TORCH_DISTRIBUTED_DEBUG to either INFO or DETAIL to print out information about which particular parameters did not receive gradient on this rank as part of this error
+ 
+ [rank0]:[W308 23:51:13.598698202 ProcessGroupNCCL.cpp:1479] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
+ W0308 23:51:15.249000 25558 torch/distributed/elastic/multiprocessing/api.py:900] Sending process 25644 closing signal SIGTERM
+ W0308 23:51:15.305000 25558 torch/distributed/elastic/multiprocessing/api.py:900] Sending process 25645 closing signal SIGTERM
+ W0308 23:51:15.328000 25558 torch/distributed/elastic/multiprocessing/api.py:900] Sending process 25646 closing signal SIGTERM
+ E0308 23:51:16.314000 25558 torch/distributed/elastic/multiprocessing/api.py:874] failed (exitcode: 1) local_rank: 0 (pid: 25643) of binary: /workspace/pi05tests-openpi-multiarm/openpi/.venv/bin/python
+ Traceback (most recent call last):
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/bin/torchrun", line 10, in <module>
+ sys.exit(main())
+ ^^^^^^
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
+ return f(*args, **kwargs)
+ ^^^^^^^^^^^^^^^^^^
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/run.py", line 892, in main
+ run(args)
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/run.py", line 883, in run
+ elastic_launch(
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 139, in __call__
+ return launch_agent(self._config, self._entrypoint, list(args))
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ File "/workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 270, in launch_agent
+ raise ChildFailedError(
+ torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
+ ============================================================
+ scripts/train_pytorch.py FAILED
+ ------------------------------------------------------------
+ Failures:
+ <NO_OTHER_FAILURES>
+ ------------------------------------------------------------
+ Root Cause (first observed failure):
+ [0]:
+ time : 2026-03-08_23:51:15
+ host : 9e9e564d5d6e
+ rank : 0 (local_rank: 0)
+ exitcode : 1 (pid: 25643)
+ error_file: <N/A>
+ traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
+ ============================================================
artifacts/twin_handover_packed_parallelization_20260309/run_logs/smoke_handover_packed_baseline_20l.log ADDED
@@ -0,0 +1,141 @@
+ W0308 23:57:51.073000 28870 torch/distributed/run.py:766]
+ W0308 23:57:51.073000 28870 torch/distributed/run.py:766] *****************************************
+ W0308 23:57:51.073000 28870 torch/distributed/run.py:766] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
+ W0308 23:57:51.073000 28870 torch/distributed/run.py:766] *****************************************
+ /workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:4631: UserWarning: No device id is provided via `init_process_group` or `barrier `. Using the current device set by the user.
+ warnings.warn( # warn only once
+ [rank1]:[W309 00:00:38.424269437 ProcessGroupNCCL.cpp:4718] [PG ID 0 PG GUID 0 Rank 1] using GPU 1 as device used by this process is currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. You can pecify device_id in init_process_group() to force use of a particular device.
+ /workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:4631: UserWarning: No device id is provided via `init_process_group` or `barrier `. Using the current device set by the user.
+ warnings.warn( # warn only once
+ [rank2]:[W309 00:00:39.886552746 ProcessGroupNCCL.cpp:4718] [PG ID 0 PG GUID 0 Rank 2] using GPU 2 as device used by this process is currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. You can pecify device_id in init_process_group() to force use of a particular device.
+ /workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:4631: UserWarning: No device id is provided via `init_process_group` or `barrier `. Using the current device set by the user.
+ warnings.warn( # warn only once
+ [rank3]:[W309 00:00:48.235773018 ProcessGroupNCCL.cpp:4718] [PG ID 0 PG GUID 0 Rank 3] using GPU 3 as device used by this process is currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. You can pecify device_id in init_process_group() to force use of a particular device.
+ 00:00:50.394 [I] Created experiment checkpoint directory: /workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/smoke_handover_packed_baseline_20l (28954:train_pytorch.py:478)
+ /workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:4631: UserWarning: No device id is provided via `init_process_group` or `barrier `. Using the current device set by the user.
+ warnings.warn( # warn only once
+ [rank0]:[W309 00:00:50.868996725 ProcessGroupNCCL.cpp:4718] [PG ID 0 PG GUID 0 Rank 0] using GPU 0 as device used by this process is currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. You can pecify device_id in init_process_group() to force use of a particular device.
+ 00:00:52.168 [I] Using batch size per GPU: 4 (total batch size across 4 GPUs: 16) (28954:train_pytorch.py:497)
+ 00:00:52.345 [I] Loaded norm stats from /workspace/pi05tests-openpi-multiarm/openpi/assets/pi05_twin_handover_256_packed_baseline_pytorch_2k/lsnu/twin_handover_256_train (28954:config.py:234)
+ 00:00:52.350 [I] data_config: DataConfig(repo_id='lsnu/twin_handover_256_train', asset_id='lsnu/twin_handover_256_train', norm_stats={'state': NormStats(mean=array([ 0.40321857, 0.17899239, -0.07588876, -2.06326795, -0.46418607,
+ 1.79356563, 0.70229131, 0.48194093, 0.93952829, 0.86693275,
+ -1.03168762, -1.9056077 , -0.53421056, 1.87584054, 2.36738205,
+ 0.91249251]), std=array([0.73344636, 0.47653052, 0.72710407, 0.42399687, 0.63613892,
+ 0.61144608, 1.11724186, 0.49967375, 0.86981195, 0.75071597,
+ 0.90787333, 0.35008711, 0.51183224, 0.36600712, 0.56947577,
+ 0.28257725]), q01=array([-1.52408956, -1.32446341, -1.91092197, -2.89885788, -1.66315554,
+ 0.59010215, -2.27611645, 0. , -1.77352981, -1.62131719,
+ -1.77092851, -2.19172778, -2.03159353, 0.55409113, 0.79255736,
+ 0. ]), q99=array([ 2.16638614, 1.38857444, 1.93436338, -0.88548369, 1.39976143,
+ 2.99162304, 2.8194857 , 0.9998 , 1.46557211, 1.74660106,
+ 1.58644652, -0.87876934, 2.25910752, 2.54628449, 2.89347284,
+ 0.9998 ])), 'actions': NormStats(mean=array([ 0.05879939, -0.00704042, -0.02719213, -0.07685276, -0.07520971,
+ -0.00498583, 0.03577602, 0.48164892, 0.06564316, 0.06023132,
+ -0.10068271, -0.09547432, -0.0526481 , 0.08205888, 0.13954687,
+ 0.88333535]), std=array([0.18337056, 0.28128958, 0.18525195, 0.29767084, 0.22944973,
+ 0.40312037, 0.3896611 , 0.49966311, 0.21938531, 0.16883859,
+ 0.20206179, 0.14864719, 0.12629333, 0.15546791, 0.23423795,
+ 0.32102022]), q01=array([-0.34140511, -0.71597991, -0.55301429, -0.8233152 , -0.68097536,
+ -0.87723451, -0.86000918, 0. , -0.53261366, -0.49289397,
+ -0.48524564, -0.35752607, -0.42426748, -0.18230745, -0.09212705,
+ 0. ]), q99=array([0.55444025, 0.69361174, 0.44115428, 0.550829 , 0.49707318,
+ 0.68353445, 0.82907713, 0.9998 , 0.42654409, 0.44255511,
+ 0.4114292 , 0.01550327, 0.38038206, 0.71452535, 0.62808441,
+ 0.9998 ]))}, repack_transforms=Group(inputs=[RepackTransform(structure={'images': {'cam_high': 'front_image', 'cam_left_wrist': 'wrist_left_image', 'cam_right_wrist': 'wrist_right_image'}, 'state': 'state', 'actions': 'action', 'prompt': 'task'})], outputs=()), data_transforms=Group(inputs=[AlohaInputs(adapt_to_pi=False)], outputs=[]), model_transforms=Group(inputs=[InjectDefaultPrompt(prompt=None), ResizeImages(height=224, width=224), TokenizePrompt(tokenizer=<openpi.models.tokenizer.PaligemmaTokenizer object at 0x7fceff4c7710>, discrete_state_input=True), PackPerArmBlocks(real_arm_dims=(8, 8), block_dims=(16, 16))], outputs=[UnpackPerArmBlocks(real_arm_dims=(8, 8), block_dims=(16, 16))]), use_quantile_norm=True, action_sequence_keys=('action',), prompt_from_task=False, rlds_data_dir=None, action_space=None, datasets=()) (28954:data_loader.py:283)
+ 00:00:52.360 [I] Using existing local LeRobot dataset mirror for lsnu/twin_handover_256_train: /workspace/lerobot/lsnu/twin_handover_256_train (28954:data_loader.py:149)
+ 00:00:59.307 [I] local_batch_size: 4 (28954:data_loader.py:364)
+ /workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:4631: UserWarning: No device id is provided via `init_process_group` or `barrier `. Using the current device set by the user.
+ warnings.warn( # warn only once
+ 00:02:31.673 [I] Enabled gradient checkpointing for PI0Pytorch model (28954:pi0_pytorch.py:150)
+ 00:02:31.680 [I] Enabled gradient checkpointing for memory optimization (28954:train_pytorch.py:569)
+ 00:02:31.681 [I] Step 0 (after_model_creation): GPU memory - allocated: 7.47GB, reserved: 7.48GB, free: 0.01GB, peak_allocated: 7.47GB, peak_reserved: 7.48GB | DDP: rank=0, world_size=4 (28954:train_pytorch.py:438)
+ 00:02:46.133 [I] Loading weights from: /workspace/checkpoints/pi05_base_single_pytorch (28954:train_pytorch.py:598)
+ 00:02:48.254 [I] Weight loading missing key count: 0 (28954:train_pytorch.py:606)
+ 00:02:48.254 [I] Weight loading missing keys: set() (28954:train_pytorch.py:607)
+ 00:02:48.255 [I] Weight loading unexpected key count: 0 (28954:train_pytorch.py:608)
+ 00:02:48.255 [I] Weight loading unexpected keys: [] (28954:train_pytorch.py:609)
+ 00:02:48.255 [I] Loaded PyTorch weights from /workspace/checkpoints/pi05_base_single_pytorch (28954:train_pytorch.py:610)
+ 00:02:48.259 [I] Running on: 9e9e564d5d6e | world_size=4 (28954:train_pytorch.py:650)
+ 00:02:48.259 [I] Training config: batch_size=16, effective_batch_size=4, num_train_steps=20 (28954:train_pytorch.py:651)
+ /usr/lib/python3.11/multiprocessing/popen_fork.py:66: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.
+ self.pid = os.fork()
+ /usr/lib/python3.11/multiprocessing/popen_fork.py:66: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.
+ self.pid = os.fork()
+ 00:02:48.260 [I] Memory optimizations: gradient_checkpointing=True (28954:train_pytorch.py:654)
+ 00:02:48.261 [I] DDP settings: find_unused_parameters=False, gradient_as_bucket_view=True, static_graph=True (28954:train_pytorch.py:655)
+ 00:02:48.261 [I] LR schedule: warmup=200, peak_lr=2.50e-05, decay_steps=2000, end_lr=2.50e-06 (28954:train_pytorch.py:656)
+ /usr/lib/python3.11/multiprocessing/popen_fork.py:66: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.
+ self.pid = os.fork()
+ 00:02:48.261 [I] Optimizer: AdamW, weight_decay=1e-10, clip_norm=1.0 (28954:train_pytorch.py:659)
+ 00:02:48.262 [I] EMA is not supported for PyTorch training (28954:train_pytorch.py:662)
+ 00:02:48.262 [I] Training precision: bfloat16 (28954:train_pytorch.py:663)
+ 00:02:48.266 [I] Resolved config name: pi05_twin_handover_256_packed_baseline_pytorch_2k (28954:train_pytorch.py:249)
+ 00:02:48.266 [I] Dataset repo_id: lsnu/twin_handover_256_train (28954:train_pytorch.py:250)
+ 00:02:48.266 [I] Norm-stats file path: /workspace/pi05tests-openpi-multiarm/openpi/assets/pi05_twin_handover_256_packed_baseline_pytorch_2k/lsnu/twin_handover_256_train/norm_stats.json (28954:train_pytorch.py:251)
+ 00:02:48.266 [I] Norm-stats summary: {'keys': ['actions', 'state'], 'state_mean_len': 16, 'state_std_len': 16, 'actions_mean_len': 16, 'actions_std_len': 16} (28954:train_pytorch.py:252)
+ 00:02:48.266 [I] Checkpoint source path: /workspace/checkpoints/pi05_base_single_pytorch (28954:train_pytorch.py:253)
+ 00:02:48.267 [I] Model type: baseline (28954:train_pytorch.py:254)
+ 00:02:48.267 [I] Packed transforms active: True (28954:train_pytorch.py:255)
+ 00:02:48.267 [I] World size: 4 (28954:train_pytorch.py:256)
+ 00:02:48.267 [I] Batch size: local=4, global=16 (28954:train_pytorch.py:257)
+ 00:02:48.267 [I] num_workers: 8 (28954:train_pytorch.py:258)
+ 00:02:48.267 [I] Precision: bfloat16 (28954:train_pytorch.py:259)
+ 00:02:48.268 [I] LR schedule summary: warmup_steps=200, peak_lr=2.50e-05, decay_steps=2000, decay_lr=2.50e-06 (28954:train_pytorch.py:260)
+ 00:02:48.268 [I] Save/log intervals: save_interval=250, log_interval=10 (28954:train_pytorch.py:267)
+ 00:02:48.268 [I] Action-loss mask: (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0) (28954:train_pytorch.py:268)
+ 00:02:48.268 [I] Active mask dims: [0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23] (28954:train_pytorch.py:269)
+ 00:02:48.268 [I] Masked dims: [8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31] (28954:train_pytorch.py:270)
+ 
+ self.pid = os.fork()
+ 00:02:51.626 [I] debug_step=1 observation.state shape=(4, 32) dtype=torch.float64 actions shape=(4, 16, 32) dtype=torch.float32 (28954:train_pytorch.py:763)
+ 00:02:51.627 [I] debug_step=1 image_keys=['base_0_rgb', 'left_wrist_0_rgb', 'right_wrist_0_rgb'] image_shapes={'base_0_rgb': (4, 3, 224, 224), 'left_wrist_0_rgb': (4, 3, 224, 224), 'right_wrist_0_rgb': (4, 3, 224, 224)} (28954:train_pytorch.py:767)
+ 00:02:51.627 [I] debug_step=1 prompt_token_lengths=[74, 72, 76, 78] (28954:train_pytorch.py:770)
+ 00:02:51.627 [I] debug_step=1 state_stats min=-1.0000 max=1.0004 mean=0.0715 std=0.4362 (28954:train_pytorch.py:771)
+ 00:02:51.627 [I] debug_step=1 action_stats min=-1.0000 max=1.0947 mean=0.0331 std=0.4134 (28954:train_pytorch.py:774)
+ 00:02:51.628 [I] debug_step=1 state_nonzero_counts_8d_blocks=[32, 0, 32, 0] action_nonzero_counts_8d_blocks=[512, 0, 512, 0] (28954:train_pytorch.py:777)
+ 00:02:51.645 [I] debug_step=1 masked_dims=[8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31] active_dims=[0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23] masked_zero_counts state=64 actions=1024 (28954:train_pytorch.py:781)
+ 00:02:51.645 [I] debug_step=1 lr=1.24e-07 grad_norm=15.9656 data_time=1.1114s step_time=2.2178s gpu_mem_allocated=28.49GB gpu_mem_reserved=35.24GB gpu_mem_max_allocated=35.23GB gpu_mem_max_reserved=35.24GB (28954:train_pytorch.py:786)
+ 
+ 00:02:52.155 [I] debug_step=2 image_keys=['base_0_rgb', 'left_wrist_0_rgb', 'right_wrist_0_rgb'] image_shapes={'base_0_rgb': (4, 3, 224, 224), 'left_wrist_0_rgb': (4, 3, 224, 224), 'right_wrist_0_rgb': (4, 3, 224, 224)} (28954:train_pytorch.py:767)
+ 00:02:52.156 [I] debug_step=2 prompt_token_lengths=[79, 76, 69, 69] (28954:train_pytorch.py:770)
+ 00:02:52.157 [I] debug_step=2 state_stats min=-1.0000 max=1.0004 mean=0.0430 std=0.4223 (28954:train_pytorch.py:771)
+ 00:02:52.157 [I] debug_step=2 action_stats min=-1.0000 max=1.0071 mean=0.0532 std=0.4394 (28954:train_pytorch.py:774)
+ 00:02:52.158 [I] debug_step=2 state_nonzero_counts_8d_blocks=[32, 0, 32, 0] action_nonzero_counts_8d_blocks=[512, 0, 512, 0] (28954:train_pytorch.py:777)
+ 00:02:52.159 [I] debug_step=2 masked_dims=[8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31] active_dims=[0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23] masked_zero_counts state=64 actions=1024 (28954:train_pytorch.py:781)
+ 00:02:52.159 [I] debug_step=2 lr=2.49e-07 grad_norm=7.5785 data_time=0.0858s step_time=0.4435s gpu_mem_allocated=28.49GB gpu_mem_reserved=35.24GB gpu_mem_max_allocated=35.23GB gpu_mem_max_reserved=35.24GB (28954:train_pytorch.py:786)
+ 
+ 00:02:52.947 [I] debug_step=3 image_keys=['base_0_rgb', 'left_wrist_0_rgb', 'right_wrist_0_rgb'] image_shapes={'base_0_rgb': (4, 3, 224, 224), 'left_wrist_0_rgb': (4, 3, 224, 224), 'right_wrist_0_rgb': (4, 3, 224, 224)} (28954:train_pytorch.py:767)
+ 00:02:52.948 [I] debug_step=3 prompt_token_lengths=[74, 68, 72, 73] (28954:train_pytorch.py:770)
+ 00:02:52.949 [I] debug_step=3 state_stats min=-1.1677 max=1.0004 mean=0.0099 std=0.5093 (28954:train_pytorch.py:771)
+ 00:02:52.949 [I] debug_step=3 action_stats min=-1.1487 max=1.1439 mean=0.0173 std=0.4079 (28954:train_pytorch.py:774)
+ 00:02:52.950 [I] debug_step=3 state_nonzero_counts_8d_blocks=[32, 0, 32, 0] action_nonzero_counts_8d_blocks=[512, 0, 512, 0] (28954:train_pytorch.py:777)
+ 00:02:52.951 [I] debug_step=3 masked_dims=[8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31] active_dims=[0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23] masked_zero_counts state=64 actions=1024 (28954:train_pytorch.py:781)
+ 00:02:52.951 [I] debug_step=3 lr=3.73e-07 grad_norm=10.5944 data_time=0.1892s step_time=0.6031s gpu_mem_allocated=28.49GB gpu_mem_reserved=35.24GB gpu_mem_max_allocated=35.23GB gpu_mem_max_reserved=35.24GB (28954:train_pytorch.py:786)
+ 
+ 00:02:53.749 [I] debug_step=4 image_keys=['base_0_rgb', 'left_wrist_0_rgb', 'right_wrist_0_rgb'] image_shapes={'base_0_rgb': (4, 3, 224, 224), 'left_wrist_0_rgb': (4, 3, 224, 224), 'right_wrist_0_rgb': (4, 3, 224, 224)} (28954:train_pytorch.py:767)
+ 00:02:53.750 [I] debug_step=4 prompt_token_lengths=[75, 73, 76, 71] (28954:train_pytorch.py:770)
+ 00:02:53.750 [I] debug_step=4 state_stats min=-1.0000 max=1.0708 mean=0.0711 std=0.4551 (28954:train_pytorch.py:771)
+ 00:02:53.750 [I] debug_step=4 action_stats min=-1.0000 max=1.4460 mean=0.0674 std=0.4311 (28954:train_pytorch.py:774)
+ 00:02:53.751 [I] debug_step=4 state_nonzero_counts_8d_blocks=[32, 0, 32, 0] action_nonzero_counts_8d_blocks=[512, 0, 512, 0] (28954:train_pytorch.py:777)
+ 00:02:53.752 [I] debug_step=4 masked_dims=[8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31] active_dims=[0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23] masked_zero_counts state=64 actions=1024 (28954:train_pytorch.py:781)
+ 00:02:53.752 [I] debug_step=4 lr=4.98e-07 grad_norm=13.1086 data_time=0.1977s step_time=0.6039s gpu_mem_allocated=28.49GB gpu_mem_reserved=35.24GB gpu_mem_max_allocated=35.23GB gpu_mem_max_reserved=35.24GB (28954:train_pytorch.py:786)
+ 
+ 00:02:54.234 [I] debug_step=5 image_keys=['base_0_rgb', 'left_wrist_0_rgb', 'right_wrist_0_rgb'] image_shapes={'base_0_rgb': (4, 3, 224, 224), 'left_wrist_0_rgb': (4, 3, 224, 224), 'right_wrist_0_rgb': (4, 3, 224, 224)} (28954:train_pytorch.py:767)
+ 00:02:54.234 [I] debug_step=5 prompt_token_lengths=[73, 75, 70, 73] (28954:train_pytorch.py:770)
+ 00:02:54.234 [I] debug_step=5 state_stats min=-1.0000 max=1.0004 mean=0.0188 std=0.4734 (28954:train_pytorch.py:771)
+ 00:02:54.235 [I] debug_step=5 action_stats min=-1.0000 max=1.0647 mean=0.0147 std=0.3985 (28954:train_pytorch.py:774)
+ 00:02:54.235 [I] debug_step=5 state_nonzero_counts_8d_blocks=[32, 0, 32, 0] action_nonzero_counts_8d_blocks=[512, 0, 512, 0] (28954:train_pytorch.py:777)
+ 00:02:54.235 [I] debug_step=5 masked_dims=[8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31] active_dims=[0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23] masked_zero_counts state=64 actions=1024 (28954:train_pytorch.py:781)
+ 00:02:54.236 [I] debug_step=5 lr=6.22e-07 grad_norm=21.4053 data_time=0.0611s step_time=0.4238s gpu_mem_allocated=28.49GB gpu_mem_reserved=35.24GB gpu_mem_max_allocated=35.23GB gpu_mem_max_reserved=35.24GB (28954:train_pytorch.py:786)
+ 
+ 
+ /workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:4631: UserWarning: No device id is provided via `init_process_group` or `barrier `. Using the current device set by the user.
+ warnings.warn( # warn only once
+ /workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:4631: UserWarning: No device id is provided via `init_process_group` or `barrier `. Using the current device set by the user.
+ warnings.warn( # warn only once
+ /workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:4631: UserWarning: No device id is provided via `init_process_group` or `barrier `. Using the current device set by the user.
+ warnings.warn( # warn only once
+ 00:04:31.529 [I] Saved checkpoint at step 20 -> /workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/smoke_handover_packed_baseline_20l/20 (28954:train_pytorch.py:323)
+ 
+ /workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:4631: UserWarning: No device id is provided via `init_process_group` or `barrier `. Using the current device set by the user.
+ warnings.warn( # warn only once
artifacts/twin_handover_packed_parallelization_20260309/run_logs/smoke_handover_packed_parallel_20a.log ADDED
@@ -0,0 +1,141 @@
+ W0309 00:05:58.586000 31870 torch/distributed/run.py:766]
+ W0309 00:05:58.586000 31870 torch/distributed/run.py:766] *****************************************
+ W0309 00:05:58.586000 31870 torch/distributed/run.py:766] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
+ W0309 00:05:58.586000 31870 torch/distributed/run.py:766] *****************************************
+ /workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:4631: UserWarning: No device id is provided via `init_process_group` or `barrier `. Using the current device set by the user.
+ warnings.warn( # warn only once
+ [rank3]:[W309 00:07:35.438460211 ProcessGroupNCCL.cpp:4718] [PG ID 0 PG GUID 0 Rank 3] using GPU 3 as device used by this process is currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. You can pecify device_id in init_process_group() to force use of a particular device.
+ /workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:4631: UserWarning: No device id is provided via `init_process_group` or `barrier `. Using the current device set by the user.
+ warnings.warn( # warn only once
+ [rank2]:[W309 00:07:38.377129614 ProcessGroupNCCL.cpp:4718] [PG ID 0 PG GUID 0 Rank 2] using GPU 2 as device used by this process is currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. You can pecify device_id in init_process_group() to force use of a particular device.
+ 00:07:39.654 [I] Created experiment checkpoint directory: /workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_parallel_pytorch_2k/smoke_handover_packed_parallel_20a (31952:train_pytorch.py:478)
+ /workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:4631: UserWarning: No device id is provided via `init_process_group` or `barrier `. Using the current device set by the user.
+ warnings.warn( # warn only once
+ [rank0]:[W309 00:07:39.073712842 ProcessGroupNCCL.cpp:4718] [PG ID 0 PG GUID 0 Rank 0] using GPU 0 as device used by this process is currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. You can pecify device_id in init_process_group() to force use of a particular device.
+ /workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:4631: UserWarning: No device id is provided via `init_process_group` or `barrier `. Using the current device set by the user.
+ warnings.warn( # warn only once
+ [rank1]:[W309 00:07:43.016127248 ProcessGroupNCCL.cpp:4718] [PG ID 0 PG GUID 0 Rank 1] using GPU 1 as device used by this process is currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. You can pecify device_id in init_process_group() to force use of a particular device.
+ 00:07:45.272 [I] Using batch size per GPU: 4 (total batch size across 4 GPUs: 16) (31952:train_pytorch.py:497)
+ 00:07:45.376 [I] Loaded norm stats from /workspace/pi05tests-openpi-multiarm/openpi/assets/pi05_twin_handover_256_packed_parallel_pytorch_2k/lsnu/twin_handover_256_train (31952:config.py:234)
+ 00:07:45.378 [I] data_config: DataConfig(repo_id='lsnu/twin_handover_256_train', asset_id='lsnu/twin_handover_256_train', norm_stats={'state': NormStats(mean=array([ 0.40321857, 0.17899239, -0.07588876, -2.06326795, -0.46418607,
+ 1.79356563, 0.70229131, 0.48194093, 0.93952829, 0.86693275,
+ -1.03168762, -1.9056077 , -0.53421056, 1.87584054, 2.36738205,
+ 0.91249251]), std=array([0.73344636, 0.47653052, 0.72710407, 0.42399687, 0.63613892,
+ 0.61144608, 1.11724186, 0.49967375, 0.86981195, 0.75071597,
+ 0.90787333, 0.35008711, 0.51183224, 0.36600712, 0.56947577,
+ 0.28257725]), q01=array([-1.52408956, -1.32446341, -1.91092197, -2.89885788, -1.66315554,
+ 0.59010215, -2.27611645, 0. , -1.77352981, -1.62131719,
+ -1.77092851, -2.19172778, -2.03159353, 0.55409113, 0.79255736,
+ 0. ]), q99=array([ 2.16638614, 1.38857444, 1.93436338, -0.88548369, 1.39976143,
+ 2.99162304, 2.8194857 , 0.9998 , 1.46557211, 1.74660106,
+ 1.58644652, -0.87876934, 2.25910752, 2.54628449, 2.89347284,
+ 0.9998 ])), 'actions': NormStats(mean=array([ 0.05879939, -0.00704042, -0.02719213, -0.07685276, -0.07520971,
+ -0.00498583, 0.03577602, 0.48164892, 0.06564316, 0.06023132,
+ -0.10068271, -0.09547432, -0.0526481 , 0.08205888, 0.13954687,
+ 0.88333535]), std=array([0.18337056, 0.28128958, 0.18525195, 0.29767084, 0.22944973,
+ 0.40312037, 0.3896611 , 0.49966311, 0.21938531, 0.16883859,
+ 0.20206179, 0.14864719, 0.12629333, 0.15546791, 0.23423795,
+ 0.32102022]), q01=array([-0.34140511, -0.71597991, -0.55301429, -0.8233152 , -0.68097536,
+ -0.87723451, -0.86000918, 0. , -0.53261366, -0.49289397,
+ -0.48524564, -0.35752607, -0.42426748, -0.18230745, -0.09212705,
+ 0. ]), q99=array([0.55444025, 0.69361174, 0.44115428, 0.550829 , 0.49707318,
+ 0.68353445, 0.82907713, 0.9998 , 0.42654409, 0.44255511,
+ 0.4114292 , 0.01550327, 0.38038206, 0.71452535, 0.62808441,
+ 0.9998 ]))}, repack_transforms=Group(inputs=[RepackTransform(structure={'images': {'cam_high': 'front_image', 'cam_left_wrist': 'wrist_left_image', 'cam_right_wrist': 'wrist_right_image'}, 'state': 'state', 'actions': 'action', 'prompt': 'task'})], outputs=()), data_transforms=Group(inputs=[AlohaInputs(adapt_to_pi=False)], outputs=[]), model_transforms=Group(inputs=[InjectDefaultPrompt(prompt=None), ResizeImages(height=224, width=224), TokenizePrompt(tokenizer=<openpi.models.tokenizer.PaligemmaTokenizer object at 0x70ac18e479d0>, discrete_state_input=True), PackPerArmBlocks(real_arm_dims=(8, 8), block_dims=(16, 16))], outputs=[UnpackPerArmBlocks(real_arm_dims=(8, 8), block_dims=(16, 16))]), use_quantile_norm=True, action_sequence_keys=('action',), prompt_from_task=False, rlds_data_dir=None, action_space=None, datasets=()) (31952:data_loader.py:283)
+ 00:07:45.381 [I] Using existing local LeRobot dataset mirror for lsnu/twin_handover_256_train: /workspace/lerobot/lsnu/twin_handover_256_train (31952:data_loader.py:149)
+ 00:07:51.404 [I] local_batch_size: 4 (31952:data_loader.py:364)
+ /workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:4631: UserWarning: No device id is provided via `init_process_group` or `barrier `. Using the current device set by the user.
+ warnings.warn( # warn only once
+ 00:09:48.120 [I] Enabled gradient checkpointing for PI0Pytorch model (31952:pi0_pytorch.py:150)
+ 00:09:48.121 [I] Enabled gradient checkpointing for memory optimization (31952:train_pytorch.py:569)
+ 00:09:48.122 [I] Step 0 (after_model_creation): GPU memory - allocated: 7.48GB, reserved: 7.48GB, free: 0.00GB, peak_allocated: 7.48GB, peak_reserved: 7.48GB | DDP: rank=0, world_size=4 (31952:train_pytorch.py:438)
+ 00:10:05.891 [I] Loading weights from: /workspace/checkpoints/pi05_base_parallel_packed_from_single (31952:train_pytorch.py:598)
+ /usr/lib/python3.11/multiprocessing/popen_fork.py:66: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.
+ self.pid = os.fork()
+ /usr/lib/python3.11/multiprocessing/popen_fork.py:66: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.
+ self.pid = os.fork()
+ 00:12:47.760 [I] Weight loading missing key count: 0 (31952:train_pytorch.py:606)
+ 00:12:47.761 [I] Weight loading missing keys: set() (31952:train_pytorch.py:607)
+ 00:12:47.761 [I] Weight loading unexpected key count: 0 (31952:train_pytorch.py:608)
+ 00:12:47.761 [I] Weight loading unexpected keys: [] (31952:train_pytorch.py:609)
+ 00:12:47.762 [I] Loaded PyTorch weights from /workspace/checkpoints/pi05_base_parallel_packed_from_single (31952:train_pytorch.py:610)
+ 00:12:47.766 [I] Running on: 9e9e564d5d6e | world_size=4 (31952:train_pytorch.py:650)
+ 00:12:47.766 [I] Training config: batch_size=16, effective_batch_size=4, num_train_steps=20 (31952:train_pytorch.py:651)
+ 00:12:47.766 [I] Memory optimizations: gradient_checkpointing=True (31952:train_pytorch.py:654)
+ 00:12:47.766 [I] DDP settings: find_unused_parameters=False, gradient_as_bucket_view=True, static_graph=True (31952:train_pytorch.py:655)
+ 00:12:47.767 [I] LR schedule: warmup=200, peak_lr=2.50e-05, decay_steps=2000, end_lr=2.50e-06 (31952:train_pytorch.py:656)
+ 00:12:47.767 [I] Optimizer: AdamW, weight_decay=1e-10, clip_norm=1.0 (31952:train_pytorch.py:659)
+ 00:12:47.767 [I] EMA is not supported for PyTorch training (31952:train_pytorch.py:662)
+ 00:12:47.767 [I] Training precision: bfloat16 (31952:train_pytorch.py:663)
+ 00:12:47.771 [I] Resolved config name: pi05_twin_handover_256_packed_parallel_pytorch_2k (31952:train_pytorch.py:249)
+ 00:12:47.771 [I] Dataset repo_id: lsnu/twin_handover_256_train (31952:train_pytorch.py:250)
+ 00:12:47.771 [I] Norm-stats file path: /workspace/pi05tests-openpi-multiarm/openpi/assets/pi05_twin_handover_256_packed_parallel_pytorch_2k/lsnu/twin_handover_256_train/norm_stats.json (31952:train_pytorch.py:251)
+ 00:12:47.771 [I] Norm-stats summary: {'keys': ['actions', 'state'], 'state_mean_len': 16, 'state_std_len': 16, 'actions_mean_len': 16, 'actions_std_len': 16} (31952:train_pytorch.py:252)
+ 00:12:47.771 [I] Checkpoint source path: /workspace/checkpoints/pi05_base_parallel_packed_from_single (31952:train_pytorch.py:253)
+ 00:12:47.771 [I] Model type: parallel (31952:train_pytorch.py:254)
+ 00:12:47.771 [I] Packed transforms active: True (31952:train_pytorch.py:255)
+ 00:12:47.772 [I] World size: 4 (31952:train_pytorch.py:256)
+ 00:12:47.772 [I] Batch size: local=4, global=16 (31952:train_pytorch.py:257)
+ 00:12:47.772 [I] num_workers: 8 (31952:train_pytorch.py:258)
+ 00:12:47.772 [I] Precision: bfloat16 (31952:train_pytorch.py:259)
+ 00:12:47.772 [I] LR schedule summary: warmup_steps=200, peak_lr=2.50e-05, decay_steps=2000, decay_lr=2.50e-06 (31952:train_pytorch.py:260)
+ 00:12:47.772 [I] Save/log intervals: save_interval=250, log_interval=10 (31952:train_pytorch.py:267)
+ 00:12:47.772 [I] Action-loss mask: (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0) (31952:train_pytorch.py:268)
+ 00:12:47.772 [I] Active mask dims: [0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23] (31952:train_pytorch.py:269)
+ 00:12:47.772 [I] Masked dims: [8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31] (31952:train_pytorch.py:270)
+ 
+ self.pid = os.fork()
+ /usr/lib/python3.11/multiprocessing/popen_fork.py:66: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.
+ self.pid = os.fork()
+ 00:12:51.535 [I] debug_step=1 observation.state shape=(4, 32) dtype=torch.float64 actions shape=(4, 16, 32) dtype=torch.float32 (31952:train_pytorch.py:763)
+ 00:12:51.536 [I] debug_step=1 image_keys=['base_0_rgb', 'left_wrist_0_rgb', 'right_wrist_0_rgb'] image_shapes={'base_0_rgb': (4, 3, 224, 224), 'left_wrist_0_rgb': (4, 3, 224, 224), 'right_wrist_0_rgb': (4, 3, 224, 224)} (31952:train_pytorch.py:767)
+ 00:12:51.536 [I] debug_step=1 prompt_token_lengths=[74, 72, 76, 78] (31952:train_pytorch.py:770)
+ 00:12:51.536 [I] debug_step=1 state_stats min=-1.0000 max=1.0004 mean=0.0715 std=0.4362 (31952:train_pytorch.py:771)
+ 00:12:51.536 [I] debug_step=1 action_stats min=-1.0000 max=1.0947 mean=0.0331 std=0.4134 (31952:train_pytorch.py:774)
+ 00:12:51.537 [I] debug_step=1 state_nonzero_counts_8d_blocks=[32, 0, 32, 0] action_nonzero_counts_8d_blocks=[512, 0, 512, 0] (31952:train_pytorch.py:777)
+ 00:12:51.560 [I] debug_step=1 masked_dims=[8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31] active_dims=[0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23] masked_zero_counts state=64 actions=1024 (31952:train_pytorch.py:781)
+ 00:12:51.560 [I] debug_step=1 lr=1.24e-07 grad_norm=16.1250 data_time=1.1500s step_time=2.5752s gpu_mem_allocated=28.53GB gpu_mem_reserved=35.28GB gpu_mem_max_allocated=35.27GB gpu_mem_max_reserved=35.28GB (31952:train_pytorch.py:786)
+ 
+ 00:12:52.214 [I] debug_step=2 image_keys=['base_0_rgb', 'left_wrist_0_rgb', 'right_wrist_0_rgb'] image_shapes={'base_0_rgb': (4, 3, 224, 224), 'left_wrist_0_rgb': (4, 3, 224, 224), 'right_wrist_0_rgb': (4, 3, 224, 224)} (31952:train_pytorch.py:767)
+ 00:12:52.214 [I] debug_step=2 prompt_token_lengths=[79, 76, 69, 69] (31952:train_pytorch.py:770)
+ 00:12:52.214 [I] debug_step=2 state_stats min=-1.0000 max=1.0004 mean=0.0430 std=0.4223 (31952:train_pytorch.py:771)
+ 00:12:52.215 [I] debug_step=2 action_stats min=-1.0000 max=1.0071 mean=0.0532 std=0.4394 (31952:train_pytorch.py:774)
+ 00:12:52.215 [I] debug_step=2 state_nonzero_counts_8d_blocks=[32, 0, 32, 0] action_nonzero_counts_8d_blocks=[512, 0, 512, 0] (31952:train_pytorch.py:777)
+ 00:12:52.216 [I] debug_step=2 masked_dims=[8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31] active_dims=[0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23] masked_zero_counts state=64 actions=1024 (31952:train_pytorch.py:781)
+ 00:12:52.216 [I] debug_step=2 lr=2.49e-07 grad_norm=7.6422 data_time=0.1756s step_time=0.5095s gpu_mem_allocated=28.53GB gpu_mem_reserved=35.28GB gpu_mem_max_allocated=35.27GB gpu_mem_max_reserved=35.28GB (31952:train_pytorch.py:786)
+ 
+ 00:12:52.866 [I] debug_step=3 image_keys=['base_0_rgb', 'left_wrist_0_rgb', 'right_wrist_0_rgb'] image_shapes={'base_0_rgb': (4, 3, 224, 224), 'left_wrist_0_rgb': (4, 3, 224, 224), 'right_wrist_0_rgb': (4, 3, 224, 224)} (31952:train_pytorch.py:767)
+ 00:12:52.867 [I] debug_step=3 prompt_token_lengths=[74, 68, 72, 73] (31952:train_pytorch.py:770)
+ 00:12:52.868 [I] debug_step=3 state_stats min=-1.1677 max=1.0004 mean=0.0099 std=0.5093 (31952:train_pytorch.py:771)
+ 00:12:52.868 [I] debug_step=3 action_stats min=-1.1487 max=1.1439 mean=0.0173 std=0.4079 (31952:train_pytorch.py:774)
+ 00:12:52.870 [I] debug_step=3 state_nonzero_counts_8d_blocks=[32, 0, 32, 0] action_nonzero_counts_8d_blocks=[512, 0, 512, 0] (31952:train_pytorch.py:777)
+ 00:12:52.871 [I] debug_step=3 masked_dims=[8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31] active_dims=[0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23] masked_zero_counts state=64 actions=1024 (31952:train_pytorch.py:781)
+ 00:12:52.871 [I] debug_step=3 lr=3.73e-07 grad_norm=10.7104 data_time=0.1504s step_time=0.5022s gpu_mem_allocated=28.53GB gpu_mem_reserved=35.28GB gpu_mem_max_allocated=35.27GB gpu_mem_max_reserved=35.28GB (31952:train_pytorch.py:786)
+ 
+ 00:12:53.506 [I] debug_step=4 image_keys=['base_0_rgb', 'left_wrist_0_rgb', 'right_wrist_0_rgb'] image_shapes={'base_0_rgb': (4, 3, 224, 224), 'left_wrist_0_rgb': (4, 3, 224, 224), 'right_wrist_0_rgb': (4, 3, 224, 224)} (31952:train_pytorch.py:767)
+ 00:12:53.507 [I] debug_step=4 prompt_token_lengths=[75, 73, 76, 71] (31952:train_pytorch.py:770)
+ 00:12:53.507 [I] debug_step=4 state_stats min=-1.0000 max=1.0708 mean=0.0711 std=0.4551 (31952:train_pytorch.py:771)
+ 00:12:53.507 [I] debug_step=4 action_stats min=-1.0000 max=1.4460 mean=0.0674 std=0.4311 (31952:train_pytorch.py:774)
+ 00:12:53.508 [I] debug_step=4 state_nonzero_counts_8d_blocks=[32, 0, 32, 0] action_nonzero_counts_8d_blocks=[512, 0, 512, 0] (31952:train_pytorch.py:777)
+ 00:12:53.509 [I] debug_step=4 masked_dims=[8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31] active_dims=[0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23] masked_zero_counts state=64 actions=1024 (31952:train_pytorch.py:781)
+ 00:12:53.509 [I] debug_step=4 lr=4.98e-07 grad_norm=13.2371 data_time=0.1376s step_time=0.5020s gpu_mem_allocated=28.53GB gpu_mem_reserved=35.28GB gpu_mem_max_allocated=35.27GB gpu_mem_max_reserved=35.28GB (31952:train_pytorch.py:786)
+ 
+ 00:12:54.201 [I] debug_step=5 image_keys=['base_0_rgb', 'left_wrist_0_rgb', 'right_wrist_0_rgb'] image_shapes={'base_0_rgb': (4, 3, 224, 224), 'left_wrist_0_rgb': (4, 3, 224, 224), 'right_wrist_0_rgb': (4, 3, 224, 224)} (31952:train_pytorch.py:767)
+ 00:12:54.202 [I] debug_step=5 prompt_token_lengths=[73, 75, 70, 73] (31952:train_pytorch.py:770)
+ 00:12:54.203 [I] debug_step=5 state_stats min=-1.0000 max=1.0004 mean=0.0188 std=0.4734 (31952:train_pytorch.py:771)
+ 00:12:54.203 [I] debug_step=5 action_stats min=-1.0000 max=1.0647 mean=0.0147 std=0.3985 (31952:train_pytorch.py:774)
+ 00:12:54.203 [I] debug_step=5 state_nonzero_counts_8d_blocks=[32, 0, 32, 0] action_nonzero_counts_8d_blocks=[512, 0, 512, 0] (31952:train_pytorch.py:777)
+ 00:12:54.204 [I] debug_step=5 masked_dims=[8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31] active_dims=[0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23] masked_zero_counts state=64 actions=1024 (31952:train_pytorch.py:781)
+ 00:12:54.204 [I] debug_step=5 lr=6.22e-07 grad_norm=21.7693 data_time=0.1479s step_time=0.5475s gpu_mem_allocated=28.53GB gpu_mem_reserved=35.28GB gpu_mem_max_allocated=35.27GB gpu_mem_max_reserved=35.28GB (31952:train_pytorch.py:786)
+ 
+ 
+ /workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:4631: UserWarning: No device id is provided via `init_process_group` or `barrier `. Using the current device set by the user.
+ warnings.warn( # warn only once
+ /workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:4631: UserWarning: No device id is provided via `init_process_group` or `barrier `. Using the current device set by the user.
+ warnings.warn( # warn only once
+ /workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:4631: UserWarning: No device id is provided via `init_process_group` or `barrier `. Using the current device set by the user.
+ warnings.warn( # warn only once
+ 00:14:36.586 [I] Saved checkpoint at step 20 -> /workspace/pi05tests-openpi-multiarm/openpi/checkpoints/pi05_twin_handover_256_packed_parallel_pytorch_2k/smoke_handover_packed_parallel_20a/20 (31952:train_pytorch.py:323)
+ 
+ /workspace/pi05tests-openpi-multiarm/openpi/.venv/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:4631: UserWarning: No device id is provided via `init_process_group` or `barrier `. Using the current device set by the user.
+ warnings.warn( # warn only once
artifacts/twin_handover_packed_parallelization_20260309/run_logs/twin_handover_followup.log ADDED
@@ -0,0 +1,37 @@
+ [2026-03-09 00:31:32 UTC] follow-up runner started
+ [2026-03-09 00:31:32 UTC] waiting for processes matching: scripts/train_pytorch.py pi05_twin_handover_256_packed_baseline_pytorch_2k --exp_name handover_packed_baseline_2k
+ [2026-03-09 00:32:33 UTC] waiting for processes matching: scripts/train_pytorch.py pi05_twin_handover_256_packed_baseline_pytorch_2k --exp_name handover_packed_baseline_2k
+ [2026-03-09 00:33:33 UTC] waiting for processes matching: scripts/train_pytorch.py pi05_twin_handover_256_packed_baseline_pytorch_2k --exp_name handover_packed_baseline_2k
+ [2026-03-09 00:34:33 UTC] waiting for processes matching: scripts/train_pytorch.py pi05_twin_handover_256_packed_baseline_pytorch_2k --exp_name handover_packed_baseline_2k
+ [2026-03-09 00:35:33 UTC] waiting for processes matching: scripts/train_pytorch.py pi05_twin_handover_256_packed_baseline_pytorch_2k --exp_name handover_packed_baseline_2k
+ [2026-03-09 00:36:33 UTC] waiting for processes matching: scripts/train_pytorch.py pi05_twin_handover_256_packed_baseline_pytorch_2k --exp_name handover_packed_baseline_2k
+ [2026-03-09 00:37:33 UTC] waiting for processes matching: scripts/train_pytorch.py pi05_twin_handover_256_packed_baseline_pytorch_2k --exp_name handover_packed_baseline_2k
+ [2026-03-09 00:38:33 UTC] waiting for processes matching: scripts/train_pytorch.py pi05_twin_handover_256_packed_baseline_pytorch_2k --exp_name handover_packed_baseline_2k
+ [2026-03-09 00:39:33 UTC] waiting for processes matching: scripts/train_pytorch.py pi05_twin_handover_256_packed_baseline_pytorch_2k --exp_name handover_packed_baseline_2k
+ [2026-03-09 00:40:33 UTC] waiting for processes matching: scripts/train_pytorch.py pi05_twin_handover_256_packed_baseline_pytorch_2k --exp_name handover_packed_baseline_2k
+ [2026-03-09 00:41:33 UTC] waiting for processes matching: scripts/train_pytorch.py pi05_twin_handover_256_packed_baseline_pytorch_2k --exp_name handover_packed_baseline_2k
+ [2026-03-09 00:42:33 UTC] waiting for processes matching: scripts/train_pytorch.py pi05_twin_handover_256_packed_baseline_pytorch_2k --exp_name handover_packed_baseline_2k
+ [2026-03-09 00:43:33 UTC] waiting for processes matching: scripts/train_pytorch.py pi05_twin_handover_256_packed_baseline_pytorch_2k --exp_name handover_packed_baseline_2k
+ [2026-03-09 00:44:33 UTC] waiting for processes matching: scripts/train_pytorch.py pi05_twin_handover_256_packed_baseline_pytorch_2k --exp_name handover_packed_baseline_2k
+ [2026-03-09 00:45:33 UTC] waiting for processes matching: scripts/train_pytorch.py pi05_twin_handover_256_packed_baseline_pytorch_2k --exp_name handover_packed_baseline_2k
+ [2026-03-09 00:46:33 UTC] waiting for processes matching: scripts/train_pytorch.py pi05_twin_handover_256_packed_baseline_pytorch_2k --exp_name handover_packed_baseline_2k
+ [2026-03-09 00:47:33 UTC] waiting for processes matching: scripts/train_pytorch.py pi05_twin_handover_256_packed_baseline_pytorch_2k --exp_name handover_packed_baseline_2k
+ [2026-03-09 00:48:33 UTC] waiting for processes matching: scripts/train_pytorch.py pi05_twin_handover_256_packed_baseline_pytorch_2k --exp_name handover_packed_baseline_2k
+ [2026-03-09 00:49:33 UTC] waiting for processes matching: scripts/train_pytorch.py pi05_twin_handover_256_packed_baseline_pytorch_2k --exp_name handover_packed_baseline_2k
+ [2026-03-09 00:50:33 UTC] waiting for processes matching: scripts/train_pytorch.py pi05_twin_handover_256_packed_baseline_pytorch_2k --exp_name handover_packed_baseline_2k
+ [2026-03-09 00:51:33 UTC] waiting for processes matching: scripts/train_pytorch.py pi05_twin_handover_256_packed_baseline_pytorch_2k --exp_name handover_packed_baseline_2k
+ [2026-03-09 00:52:33 UTC] waiting for processes matching: scripts/train_pytorch.py pi05_twin_handover_256_packed_baseline_pytorch_2k --exp_name handover_packed_baseline_2k
+ [2026-03-09 00:53:33 UTC] waiting for processes matching: scripts/train_pytorch.py pi05_twin_handover_256_packed_baseline_pytorch_2k --exp_name handover_packed_baseline_2k
+ [2026-03-09 00:54:33 UTC] waiting for processes matching: scripts/train_pytorch.py pi05_twin_handover_256_packed_baseline_pytorch_2k --exp_name handover_packed_baseline_2k
+ [2026-03-09 00:55:33 UTC] waiting for processes matching: scripts/train_pytorch.py pi05_twin_handover_256_packed_baseline_pytorch_2k --exp_name handover_packed_baseline_2k
+ [2026-03-09 00:56:33 UTC] eval start config=pi05_twin_handover_256_packed_baseline_pytorch_2k ckpt=/workspace/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/handover_packed_baseline_2k/1000 batches=50
+ [2026-03-09 01:01:47 UTC] eval done log=/workspace/run_logs/handover_packed_baseline_2k_val_1000.log
+ [2026-03-09 01:01:47 UTC] eval start config=pi05_twin_handover_256_packed_baseline_pytorch_2k ckpt=/workspace/openpi/checkpoints/pi05_twin_handover_256_packed_baseline_pytorch_2k/handover_packed_baseline_2k/2000 batches=100
+ [2026-03-09 01:07:06 UTC] eval done log=/workspace/run_logs/handover_packed_baseline_2k_val_2000.log
+ [2026-03-09 01:07:06 UTC] launching parallel run
+ [2026-03-09 01:42:23 UTC] parallel run finished
+ [2026-03-09 01:42:23 UTC] eval start config=pi05_twin_handover_256_packed_parallel_pytorch_2k ckpt=/workspace/openpi/checkpoints/pi05_twin_handover_256_packed_parallel_pytorch_2k/handover_packed_parallel_2k/1000 batches=50
+ [2026-03-09 01:45:46 UTC] eval done log=/workspace/run_logs/handover_packed_parallel_2k_val_1000.log
+ [2026-03-09 01:45:46 UTC] eval start config=pi05_twin_handover_256_packed_parallel_pytorch_2k ckpt=/workspace/openpi/checkpoints/pi05_twin_handover_256_packed_parallel_pytorch_2k/handover_packed_parallel_2k/2000 batches=100
+ [2026-03-09 01:49:19 UTC] eval done log=/workspace/run_logs/handover_packed_parallel_2k_val_2000.log
+ [2026-03-09 01:49:19 UTC] follow-up runner finished
artifacts/twin_handover_packed_parallelization_20260309/sanity_checks/inspect_twin_packed_batch_handover_train.log ADDED
@@ -0,0 +1,176 @@
+ config_name: pi05_twin_handover_256_packed_baseline_pytorch_2k
+ repo_id: lsnu/twin_handover_256_train
+ sample_index: 0
+ norm_stats_path: /workspace/pi05tests-openpi-multiarm/openpi/assets/pi05_twin_handover_256_packed_baseline_pytorch_2k/lsnu/twin_handover_256_train/norm_stats.json
+ norm_stats_keys: ['actions', 'state']
+ norm_stats_lengths: state_mean=16 state_std=16 action_mean=16 action_std=16
+ block_boundaries: [0:8] [8:16] [16:24] [24:32]
+ raw_state_16d_shape: (16,)
+ raw_state_16d:
+ [ 7.1883e-07 1.7515e-01 -5.6890e-06 -8.7299e-01 -6.3130e-06 1.2216e+00
+ 7.8540e-01 1.0000e+00 1.1957e-06 1.7514e-01 -9.2062e-07 -8.7312e-01
+ 1.6098e-05 1.2216e+00 7.8539e-01 1.0000e+00]
+ raw_actions_16d_shape: (16, 16)
+ raw_actions_16d:
+ [[ 2.3842e-05 -8.2493e-04 -5.7220e-05 3.9577e-04 2.8610e-05 7.8201e-04
+ -1.2398e-04 1.0000e+00 9.5367e-05 4.0293e-03 9.5367e-06 7.2479e-04
+ 1.8120e-04 -1.4305e-05 -2.2411e-04 1.0000e+00]
+ [ 5.0068e-04 -1.5645e-02 2.6083e-03 -5.5575e-02 1.8883e-03 2.5430e-02
+ -1.9326e-02 1.0000e+00 2.7800e-02 2.4877e-02 -2.7924e-02 -2.7843e-02
+ -1.6832e-02 1.0629e-02 3.8543e-02 1.0000e+00]
+ [ 1.7738e-03 -7.6041e-02 8.9645e-03 -1.7257e-01 6.0558e-03 8.7943e-02
+ -6.4831e-02 1.0000e+00 9.2287e-02 5.8761e-02 -9.3136e-02 -7.6413e-02
+ -5.3630e-02 4.2353e-02 1.2606e-01 1.0000e+00]
+ [ 3.2425e-03 -1.3747e-01 1.5845e-02 -3.1527e-01 1.0653e-02 1.6477e-01
+ -1.1840e-01 1.0000e+00 1.7036e-01 1.0629e-01 -1.7153e-01 -1.4015e-01
+ -9.7461e-02 7.8468e-02 2.3009e-01 1.0000e+00]
+ [ 5.5885e-03 -2.1545e-01 2.4767e-02 -4.6663e-01 1.6103e-02 2.4452e-01
+ -1.7446e-01 1.0000e+00 2.5305e-01 1.5107e-01 -2.5392e-01 -2.1260e-01
+ -1.4490e-01 1.1766e-01 3.4122e-01 1.0000e+00]
+ [ 6.1035e-03 -2.8390e-01 3.3288e-02 -6.1909e-01 2.1739e-02 3.2683e-01
+ -2.3199e-01 1.0000e+00 3.3677e-01 1.9970e-01 -3.3804e-01 -2.8173e-01
+ -1.9161e-01 1.5831e-01 4.5282e-01 1.0000e+00]
+ [ 9.3937e-03 -3.1736e-01 3.8815e-02 -7.2264e-01 2.9097e-02 3.8407e-01
+ -2.9788e-01 1.0000e+00 3.9431e-01 2.3764e-01 -3.9650e-01 -3.2045e-01
+ -2.2884e-01 1.8487e-01 5.3961e-01 1.0000e+00]
+ [ 1.1177e-02 -3.3051e-01 4.2367e-02 -7.4072e-01 3.5295e-02 4.0234e-01
+ -3.4810e-01 1.0000e+00 4.1353e-01 2.4687e-01 -4.1600e-01 -3.4033e-01
+ -2.4390e-01 1.9067e-01 5.7513e-01 1.0000e+00]
+ [ 1.2674e-02 -3.1841e-01 4.3559e-02 -7.5366e-01 3.7665e-02 4.1035e-01
+ -3.7488e-01 1.0000e+00 4.2095e-01 2.5672e-01 -4.2238e-01 -3.4335e-01
+ -2.4950e-01 1.9567e-01 5.8634e-01 1.0000e+00]
+ [ 1.5645e-02 -3.0324e-01 4.3592e-02 -7.4167e-01 4.2624e-02 4.1367e-01
+ -4.1199e-01 1.0000e+00 4.2353e-01 2.6254e-01 -4.2444e-01 -3.4899e-01
+ -2.5064e-01 1.9762e-01 5.8977e-01 1.0000e+00]
+ [ 1.6398e-02 -2.9560e-01 4.2553e-02 -7.3503e-01 4.5595e-02 4.1383e-01
+ -4.3354e-01 1.0000e+00 4.2382e-01 2.5776e-01 -4.2612e-01 -3.5491e-01
+ -2.5177e-01 1.9462e-01 5.9134e-01 1.0000e+00]
+ [ 2.0757e-02 -2.9058e-01 4.2739e-02 -7.3133e-01 4.6840e-02 4.1339e-01
+ -4.5310e-01 1.0000e+00 4.2468e-01 2.5057e-01 -4.2498e-01 -3.4835e-01
+ -2.5149e-01 2.0029e-01 5.9138e-01 1.0000e+00]
+ [ 2.3303e-02 -2.7753e-01 4.1437e-02 -7.2254e-01 4.8075e-02 4.1380e-01
+ -4.7155e-01 1.0000e+00 4.2468e-01 2.5254e-01 -4.2522e-01 -3.4195e-01
+ -2.5130e-01 1.9623e-01 5.9127e-01 1.0000e+00]
+ [ 2.7924e-02 -2.5505e-01 4.0684e-02 -7.0069e-01 5.3768e-02 4.1076e-01
+ -5.1048e-01 1.0000e+00 4.2446e-01 2.5574e-01 -4.2656e-01 -3.5101e-01
+ -2.5181e-01 1.9645e-01 5.9101e-01 1.0000e+00]
+ [ 3.2401e-02 -2.4053e-01 4.1451e-02 -6.8364e-01 5.6882e-02 4.1132e-01
+ -5.4158e-01 1.0000e+00 4.2435e-01 2.5109e-01 -4.2632e-01 -3.5082e-01
+ -2.5095e-01 1.9805e-01 5.9107e-01 1.0000e+00]
+ [ 3.4809e-02 -2.2431e-01 4.0565e-02 -6.7288e-01 5.6076e-02 4.0839e-01
+ -5.6400e-01 1.0000e+00 4.2504e-01 2.5486e-01 -4.2588e-01 -3.4874e-01
+ -2.5139e-01 1.9783e-01 5.9183e-01 1.0000e+00]]
+ normalized_state_16d_shape: (16,)
+ normalized_state_16d:
+ [-0.174 0.1055 -0.0061 1.0124 0.086 -0.4741 0.2016 1.0004 0.0951
+ 0.0668 0.0549 1.0086 -0.053 -0.3299 -1.0068 1.0004]
+ normalized_actions_16d_shape: (16, 16)
+ normalized_actions_16d:
+ [[-0.2378 0.0147 0.1124 0.1989 0.1562 0.1251 0.0182 1.0004 0.1108
+ 0.0624 0.0823 0.9208 0.055 -0.5935 -0.7448 1.0004]
+ [-0.2367 -0.0063 0.1178 0.1174 0.1593 0.1567 -0.0046 1.0004 0.1686
+ 0.107 0.02 0.7676 0.0127 -0.5697 -0.6371 1.0004]
+ [-0.2338 -0.092 0.1305 -0.0529 0.1664 0.2368 -0.0585 1.0004 0.303
+ 0.1794 -0.1254 0.5072 -0.0788 -0.499 -0.3941 1.0004]
+ [-0.2306 -0.1792 0.1444 -0.2606 0.1742 0.3352 -0.1219 1.0004 0.4658
+ 0.2811 -0.3003 0.1655 -0.1877 -0.4185 -0.1052 1.0004]
+ [-0.2253 -0.2898 0.1623 -0.4809 0.1834 0.4374 -0.1883 1.0004 0.6382
+ 0.3768 -0.484 -0.223 -0.3056 -0.3311 0.2034 1.0004]
+ [-0.2242 -0.3869 0.1795 -0.7028 0.193 0.5429 -0.2564 1.0004 0.8128
+ 0.4808 -0.6717 -0.5936 -0.4217 -0.2404 0.5133 1.0004]
+ [-0.2168 -0.4344 0.1906 -0.8535 0.2055 0.6163 -0.3344 1.0004 0.9328
+ 0.5619 -0.8021 -0.8012 -0.5143 -0.1812 0.7543 1.0004]
+ [-0.2129 -0.4531 0.1977 -0.8798 0.216 0.6397 -0.3939 1.0004 0.9729
+ 0.5816 -0.8455 -0.9078 -0.5517 -0.1682 0.8529 1.0004]
+ [-0.2095 -0.4359 0.2001 -0.8986 0.2201 0.6499 -0.4256 1.0004 0.9883
+ 0.6027 -0.8598 -0.924 -0.5656 -0.1571 0.8841 1.0004]
+ [-0.2029 -0.4144 0.2002 -0.8812 0.2285 0.6542 -0.4695 1.0004 0.9937
+ 0.6151 -0.8644 -0.9542 -0.5684 -0.1527 0.8936 1.0004]
+ [-0.2012 -0.4035 0.1981 -0.8715 0.2335 0.6544 -0.495 1.0004 0.9943
+ 0.6049 -0.8681 -0.986 -0.5713 -0.1594 0.8979 1.0004]
+ [-0.1915 -0.3964 0.1985 -0.8661 0.2356 0.6538 -0.5182 1.0004 0.9961
+ 0.5895 -0.8656 -0.9508 -0.5705 -0.1468 0.8981 1.0004]
+ [-0.1858 -0.3779 0.1959 -0.8533 0.2377 0.6544 -0.54 1.0004 0.9961
+ 0.5937 -0.8661 -0.9165 -0.5701 -0.1558 0.8978 1.0004]
+ [-0.1755 -0.346 0.1944 -0.8215 0.2474 0.6505 -0.5861 1.0004 0.9956
+ 0.6006 -0.8691 -0.9651 -0.5713 -0.1554 0.897 1.0004]
+ [-0.1655 -0.3254 0.1959 -0.7967 0.2527 0.6512 -0.623 1.0004 0.9954
+ 0.5907 -0.8686 -0.9641 -0.5692 -0.1518 0.8972 1.0004]
+ [-0.1601 -0.3024 0.1941 -0.7811 0.2513 0.6474 -0.6495 1.0004 0.9969
+ 0.5987 -0.8676 -0.9529 -0.5703 -0.1523 0.8993 1.0004]]
101
+ packed_state_32d_shape: (32,)
102
+ packed_state_32d:
103
+ [-0.174 0.1055 -0.0061 1.0124 0.086 -0.4741 0.2016 1.0004 0.
104
+ 0. 0. 0. 0. 0. 0. 0. 0.0951 0.0668
105
+ 0.0549 1.0086 -0.053 -0.3299 -1.0068 1.0004 0. 0. 0.
106
+ 0. 0. 0. 0. 0. ]
107
+ packed_actions_32d_shape: (16, 32)
108
+ packed_actions_32d:
109
+ [[-0.2378 0.0147 0.1124 0.1989 0.1562 0.1251 0.0182 1.0004 0.
110
+ 0. 0. 0. 0. 0. 0. 0. 0.1108 0.0624
111
+ 0.0823 0.9208 0.055 -0.5935 -0.7448 1.0004 0. 0. 0.
112
+ 0. 0. 0. 0. 0. ]
113
+ [-0.2367 -0.0063 0.1178 0.1174 0.1593 0.1567 -0.0046 1.0004 0.
114
+ 0. 0. 0. 0. 0. 0. 0. 0.1686 0.107
115
+ 0.02 0.7676 0.0127 -0.5697 -0.6371 1.0004 0. 0. 0.
116
+ 0. 0. 0. 0. 0. ]
117
+ [-0.2338 -0.092 0.1305 -0.0529 0.1664 0.2368 -0.0585 1.0004 0.
118
+ 0. 0. 0. 0. 0. 0. 0. 0.303 0.1794
119
+ -0.1254 0.5072 -0.0788 -0.499 -0.3941 1.0004 0. 0. 0.
120
+ 0. 0. 0. 0. 0. ]
121
+ [-0.2306 -0.1792 0.1444 -0.2606 0.1742 0.3352 -0.1219 1.0004 0.
122
+ 0. 0. 0. 0. 0. 0. 0. 0.4658 0.2811
123
+ -0.3003 0.1655 -0.1877 -0.4185 -0.1052 1.0004 0. 0. 0.
124
+ 0. 0. 0. 0. 0. ]
125
+ [-0.2253 -0.2898 0.1623 -0.4809 0.1834 0.4374 -0.1883 1.0004 0.
126
+ 0. 0. 0. 0. 0. 0. 0. 0.6382 0.3768
127
+ -0.484 -0.223 -0.3056 -0.3311 0.2034 1.0004 0. 0. 0.
128
+ 0. 0. 0. 0. 0. ]
129
+ [-0.2242 -0.3869 0.1795 -0.7028 0.193 0.5429 -0.2564 1.0004 0.
130
+ 0. 0. 0. 0. 0. 0. 0. 0.8128 0.4808
131
+ -0.6717 -0.5936 -0.4217 -0.2404 0.5133 1.0004 0. 0. 0.
132
+ 0. 0. 0. 0. 0. ]
133
+ [-0.2168 -0.4344 0.1906 -0.8535 0.2055 0.6163 -0.3344 1.0004 0.
134
+ 0. 0. 0. 0. 0. 0. 0. 0.9328 0.5619
135
+ -0.8021 -0.8012 -0.5143 -0.1812 0.7543 1.0004 0. 0. 0.
136
+ 0. 0. 0. 0. 0. ]
137
+ [-0.2129 -0.4531 0.1977 -0.8798 0.216 0.6397 -0.3939 1.0004 0.
138
+ 0. 0. 0. 0. 0. 0. 0. 0.9729 0.5816
139
+ -0.8455 -0.9078 -0.5517 -0.1682 0.8529 1.0004 0. 0. 0.
140
+ 0. 0. 0. 0. 0. ]
141
+ [-0.2095 -0.4359 0.2001 -0.8986 0.2201 0.6499 -0.4256 1.0004 0.
142
+ 0. 0. 0. 0. 0. 0. 0. 0.9883 0.6027
143
+ -0.8598 -0.924 -0.5656 -0.1571 0.8841 1.0004 0. 0. 0.
144
+ 0. 0. 0. 0. 0. ]
145
+ [-0.2029 -0.4144 0.2002 -0.8812 0.2285 0.6542 -0.4695 1.0004 0.
146
+ 0. 0. 0. 0. 0. 0. 0. 0.9937 0.6151
147
+ -0.8644 -0.9542 -0.5684 -0.1527 0.8936 1.0004 0. 0. 0.
148
+ 0. 0. 0. 0. 0. ]
149
+ [-0.2012 -0.4035 0.1981 -0.8715 0.2335 0.6544 -0.495 1.0004 0.
150
+ 0. 0. 0. 0. 0. 0. 0. 0.9943 0.6049
151
+ -0.8681 -0.986 -0.5713 -0.1594 0.8979 1.0004 0. 0. 0.
152
+ 0. 0. 0. 0. 0. ]
153
+ [-0.1915 -0.3964 0.1985 -0.8661 0.2356 0.6538 -0.5182 1.0004 0.
154
+ 0. 0. 0. 0. 0. 0. 0. 0.9961 0.5895
155
+ -0.8656 -0.9508 -0.5705 -0.1468 0.8981 1.0004 0. 0. 0.
156
+ 0. 0. 0. 0. 0. ]
157
+ [-0.1858 -0.3779 0.1959 -0.8533 0.2377 0.6544 -0.54 1.0004 0.
158
+ 0. 0. 0. 0. 0. 0. 0. 0.9961 0.5937
159
+ -0.8661 -0.9165 -0.5701 -0.1558 0.8978 1.0004 0. 0. 0.
160
+ 0. 0. 0. 0. 0. ]
161
+ [-0.1755 -0.346 0.1944 -0.8215 0.2474 0.6505 -0.5861 1.0004 0.
162
+ 0. 0. 0. 0. 0. 0. 0. 0.9956 0.6006
163
+ -0.8691 -0.9651 -0.5713 -0.1554 0.897 1.0004 0. 0. 0.
164
+ 0. 0. 0. 0. 0. ]
165
+ [-0.1655 -0.3254 0.1959 -0.7967 0.2527 0.6512 -0.623 1.0004 0.
166
+ 0. 0. 0. 0. 0. 0. 0. 0.9954 0.5907
167
+ -0.8686 -0.9641 -0.5692 -0.1518 0.8972 1.0004 0. 0. 0.
168
+ 0. 0. 0. 0. 0. ]
169
+ [-0.1601 -0.3024 0.1941 -0.7811 0.2513 0.6474 -0.6495 1.0004 0.
170
+ 0. 0. 0. 0. 0. 0. 0. 0.9969 0.5987
171
+ -0.8676 -0.9529 -0.5703 -0.1523 0.8993 1.0004 0. 0. 0.
172
+ 0. 0. 0. 0. 0. ]]
173
+ state_padded_zero_count: 16 / 16
174
+ actions_padded_zero_count: 256 / 256
175
+ state_padded_exact_zero: True
176
+ actions_padded_exact_zero: True
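The layout verified by the dump above can be reproduced with a minimal numpy sketch. The structure is inferred from the printed arrays, not taken from the repo's code: each arm contributes 8 dims (placed at offsets 0 and 16 of a 32-d vector), and the remaining 16 dims per vector are exact zeros. `pack_to_32d` is a hypothetical helper name.

```python
import numpy as np

def pack_to_32d(vec16: np.ndarray) -> np.ndarray:
    """Pack a 16-d two-arm vector into the 32-d padded layout seen above:
    arm 1 -> dims 0:8, arm 2 -> dims 16:24, dims 8:16 and 24:32 zero."""
    out = np.zeros(vec16.shape[:-1] + (32,), dtype=vec16.dtype)
    out[..., 0:8] = vec16[..., 0:8]     # arm 1 slot
    out[..., 16:24] = vec16[..., 8:16]  # arm 2 slot
    return out

# Boolean mask over the padded dims, mirroring the *_padded_zero_count checks.
pad = np.zeros(32, dtype=bool)
pad[8:16] = pad[24:32] = True

actions16 = np.random.default_rng(0).standard_normal((16, 16)).astype(np.float32)
packed = pack_to_32d(actions16)
print(packed.shape)                                    # (16, 32)
print(int((packed[:, pad] == 0).sum()), "/", 16 * 16)  # 256 / 256
print(bool((packed[:, pad] == 0).all()))               # True
```

A mask-based check like this distinguishes exact zeros from merely small values, which is what the `*_padded_exact_zero` flags in the dump assert.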