| # Track4D-360 camera-control checkpoint backup: 2026-04-23, extended 2026-04-28 |
|
|
| Tagged copies of every benchmarked Track4D-360 camera-control checkpoint. |
| The 1.3B family (4 files, 2.2 to 2.4 GB each) was the original |
| 2026-04-23 backup for the CLIP-F/V benchmark |
| (`doc/track4d-360/2026-04-23-clipfv-4models-288x512-benchmark.md`); the 14B |
| savefix step-2000 file was added 2026-04-28 once the multi-node |
| FSDP-savefix run (`doc/track4d-360/bugs/2026-04-25-multinode-training-desync-fixes.md`) |
| produced its first verified-correct checkpoint. |
|
|
| Files are renamed to carry the variant tag so each checkpoint's origin is |
| legible without walking back through the training dirs. |
|
|
| **Upload destination:** all files in this directory get pushed to |
| [`yslan/track4d_360`](https://huggingface.co/yslan/track4d_360/tree/main) |
| on HuggingFace, preserving the exact filename. Filename = HF file path. |
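
One way to keep "filename = HF file path" during the push is the `huggingface-cli upload` command from `huggingface_hub`. The sketch below only prints one upload command per checkpoint so the list can be reviewed before anything is sent. The helper name `hf_upload_cmds` is hypothetical, and the sketch assumes the CLI is installed and already logged in with write access to the repo.

```bash
# Hypothetical helper: print one upload command per *.safetensors file in a
# directory. Nothing is uploaded until the printed lines are executed.
hf_upload_cmds() {
    local dir="${1:-.}" repo="${2:-yslan/track4d_360}" f
    for f in "$dir"/*.safetensors; do
        [ -e "$f" ] || continue   # glob matched nothing
        # Path in repo = basename, so the HF file path equals the local filename.
        printf 'huggingface-cli upload %s %s %s\n' "$repo" "$f" "$(basename "$f")"
    done
}
```

After inspecting the output, the commands can be run with `hf_upload_cmds /scratch/shared/beegfs/yushi/logs/track4d-360/backup | bash`.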
|
|
| | backup file | size | source | training resolution (notes) | |
| |---|---|---|---| |
| | `warped_step-13000.safetensors` | 2.2 GB | `warped_appearance_concat_proj_mixed_real_synth_144x256x49_1p3b_2gpu/train/Wan2.1-T2V-1.3B_track4d360_warped_appearance_concat_proj_mixed_real_synth/step-13000.safetensors` | 144x256 (Lyra-2 latent-fuse new architecture) | |
| | `static13k_step-13500.safetensors` | 2.4 GB | `hybrid_dense_plucker_mixed_real_synth_concat_project_trainplucker_attnffn_cond_dropout_144x256x49_1p3b_2gpu/train/.../step-13500.safetensors` | 144x256 (old-arch, static-only, no syn4d) | |
| | `dynamic5k_step-5000.safetensors` | 2.4 GB | `hybrid_dense_plucker_mixed_real_synth_syn4d_concat_project_trainplucker_attnffn_cond_dropout_144x256x49_1p3b_2gpu/train/.../step-5000.safetensors` | 144x256 (old-arch, + syn4d dynamic) | |
| | `ismb288_3k_step-3000.safetensors` | 2.4 GB | `ismb_hybrid_dense_plucker_mixed_real_synth_syn4d_recam_syncam_cond_dropout_288x512x49_1p3b_16gpu/train/.../step-3000.safetensors` | **288x512 (native)** (old-arch, + syn4d + RecamMaster + SynCamMaster, Isambard-trained, synced 2026-04-23) | |
| | `14b_savefix_step-2000.safetensors` | **22.9 GB** | `ismb_hybrid_dense_plucker_mixed_real_synth_syn4d_recam_syncam_savefix_cond_dropout_144x256x49_14b_16gpu_fsdp/train/Wan2.1-Fun-14B_track4d360_..._savefix_cond_dropout/step-2000.safetensors` | **144x256 14B** (Wan-Fun 14B FSDP, post-savefix bug fix, 4-node Isambard, copied 2026-04-28). Train launcher: [`bash_scripts/track4d_360/ismb/14b/sbatch/ismb_sbatch_14b_4node_144x256_fsdp_noise_commtuned_savefix_resume8800.sh`](/scratch/shared/beegfs/yushi/Repo/geo4d_360/VideoX-Fun/bash_scripts/track4d_360/ismb/14b/sbatch/ismb_sbatch_14b_4node_144x256_fsdp_noise_commtuned_savefix_resume8800.sh) | |
|
|
| Size delta within the 1.3B family: `warped` is 2.2 GB vs 2.4 GB for the |
| others. This reflects the architectural difference: `warped` has |
| `dense_in_channels=2` (geometry-only dense) vs `5` (RGB+geom) for the |
| old-arch models, and a different `track_injection_mode` with different |
| trainable subgraphs. |
|
|
| The 14B file's 22.9 GB is bf16 weights for the full 14B Wan-Fun DiT plus |
| the trainable Track4D-360 adapter modules (track_adapter, track_block_injector, |
| dense_target_control_encoder, plucker control_adapter). See the |
| `[Eval] DiT checkpoint load` line printed by the eval script for the |
| exact key inventory. |
| |
| Backup method: the 1.3B family was copied with `rsync` on 2026-04-23; the 14B |
| file with a single `cp` on 2026-04-28. Both were verified by byte-size match against sources. |
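
The byte-size check can be repeated at any time with a few lines of shell. This is a minimal sketch; the helper name `same_size` is hypothetical and is not part of the repo. A size match only shows the copy was not truncated, so a checksum (for example `sha256sum`) is the stronger check when one is needed.

```bash
# Hypothetical helper: succeed only if both files exist and have the same
# size in bytes.
same_size() {
    local src="$1" dst="$2"
    [ -f "$src" ] && [ -f "$dst" ] || return 1
    [ "$(wc -c < "$src")" -eq "$(wc -c < "$dst")" ]
}
```

Usage: `same_size <source checkpoint> <backup file> && echo OK`.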
| |
| --- |
| |
| ## Reproducibility bundles (dataset subsets, for sharing) |
| |
| Some benchmarks need a tiny slice of the full dataset roots. Those bundles |
| live alongside the checkpoints so the whole "checkpoints + data + scripts" |
| tree can be tarred together for a collaborator. |
| |
| | bundle dir | size | bench it reproduces | |
| |---|---|---| |
| | `bench_ismb288_multiframe_repro/` | ~52 GB | `benchmark_ismb288_3k_multiframe_vs_zbuffer_288x512.sh` (5 datasets × 5 scenes × 10 trajectories @ 288×512). See [`bench_ismb288_multiframe_repro/README.md`](bench_ismb288_multiframe_repro/README.md) for layout, run instructions, and what's deliberately NOT included (Wan base + VXF source). Generated by [`prepare_repro_data_ismb288_multiframe.sh`](/scratch/shared/beegfs/yushi/Repo/geo4d_360/VideoX-Fun/bash_scripts/track4d_360/plucker/benchmark/prepare_repro_data_ismb288_multiframe.sh). | |
|
|
| To create an archive for upload, bundle data and scripts only: the checkpoint is |
| already on HF as `yslan/track4d_360/ismb288_3k_step-3000.safetensors`, so there is |
| no need to re-bundle it. The bundle is mostly PNG/EXR/safetensors, which is already |
| compressed content, so gzip is slow and gains almost nothing. Recommended: plain `.tar`. |
|
|
| ```bash |
| cd /scratch/shared/beegfs/yushi/logs/track4d-360/backup |
| |
| # Recommended: plain tar, fast (just streams bytes; PNG/EXR don't compress): |
| tar -cf bench_ismb288_multiframe_repro.tar bench_ismb288_multiframe_repro |
| |
| # Alternative if you prefer the .tar.gz format (parallel gzip): |
| # tar -c bench_ismb288_multiframe_repro | pigz -p $(nproc) > bench_ismb288_multiframe_repro.tar.gz |
| |
| # Alternative: tar + zstd (best size/speed for HF if both sides have zstd): |
| # tar --use-compress-program='zstd -T0 -3' -cf bench_ismb288_multiframe_repro.tar.zst bench_ismb288_multiframe_repro |
| |
| # Avoid: tar -czf ... (single-threaded gzip on 52 GB, ~hours, near-zero gain). |
| |
| # After verifying the archive is good (and ideally after uploading to HF), |
| # the unpacked tree is redundant; drop it to reclaim 52 GB: |
| rm -rf bench_ismb288_multiframe_repro |
| |
| # Re-creating later is cheap (rsync -a, ~52 GB read from beegfs sources): |
| bash /scratch/shared/beegfs/yushi/Repo/geo4d_360/VideoX-Fun/bash_scripts/track4d_360/plucker/benchmark/prepare_repro_data_ismb288_multiframe.sh |
| ``` |
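
The commands above leave "verifying the archive is good" open. One cheap check, sketched here, is to compare the number of non-directory entries listed by `tar -tf` with the number of non-directory files in the tree. The helper name `archive_matches_tree` is hypothetical. It assumes a plain `.tar` and that it is run from the directory the archive was created in. It does not compare file contents.

```bash
# Hypothetical helper: succeed only if the archive lists as many
# non-directory entries as the unpacked tree contains.
archive_matches_tree() {
    local archive="$1" tree="$2" in_tar on_disk
    in_tar=$(tar -tf "$archive" | grep -vc '/$')    # directory entries end in "/"
    on_disk=$(find "$tree" ! -type d | wc -l)
    [ "$in_tar" -eq "$on_disk" ]
}
```

Gating the cleanup on it avoids deleting the tree after a truncated write: `archive_matches_tree bench_ismb288_multiframe_repro.tar bench_ismb288_multiframe_repro && rm -rf bench_ismb288_multiframe_repro`.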
|
|
| --- |
|
|
| ## How to eval |
|
|
| All four 1.3B checkpoints evaluate through the same master benchmark script |
| (under the `VideoX-Fun/` repo root). It handles per-variant argument dispatch, |
| so you do not have to pass the flags listed below by hand. |
|
|
| ### Master benchmark (all 4 models, 5 datasets × 5 scenes × 10 trajectories @ 288x512x49) |
|
|
| ```bash |
| cd /scratch/shared/beegfs/yushi/Repo/geo4d_360/VideoX-Fun |
| bash bash_scripts/track4d_360/plucker/benchmark_clipfv_4models_288x512_2gpu.sh |
| ``` |
|
|
| Options: |
|
|
| ```bash |
| # one variant only: |
| MODEL=warped bash bash_scripts/track4d_360/plucker/benchmark_clipfv_4models_288x512_2gpu.sh |
| MODEL=static13k bash ... |
| MODEL=dynamic5k bash ... |
| MODEL=ismb288_3k bash ... |
| |
| # custom GPU pair (defaults GPU0=0 GPU1=1): |
| GPU0=4 GPU1=5 bash ... |
| |
| # point at backup ckpts instead of the live train dirs (example override): |
| CKPT_WARPED=/scratch/shared/beegfs/yushi/logs/track4d-360/backup/warped_step-13000.safetensors \ |
| bash ... |
| ``` |
|
|
| The script writes to |
| `/scratch/shared/beegfs/yushi/logs/track4d-360/benchmark/clipfv_4models_288x512_2gpu/`, |
| with `summary.md` aggregated by |
| `python -m track4d_360.tools.aggregate_clip_benchmark`. |
|
|
| Already-completed trajectories auto-skip on re-run (`pred_rgb.mp4` existence |
| check in each novel-traj script). |
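
To see what a re-run would still have to compute, the same existence check can be applied from the shell. This is only an illustration: the helper name `pending_trajs` is hypothetical, and it assumes each trajectory writes into its own output directory, which the caller passes explicitly.

```bash
# Hypothetical helper: print each given trajectory directory that does not
# yet contain pred_rgb.mp4 (i.e. would not be skipped on re-run).
pending_trajs() {
    local d
    for d in "$@"; do
        [ -f "$d/pred_rgb.mp4" ] || printf '%s\n' "$d"
    done
}
```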
|
|
| ### Single-dataset / ad-hoc invocations |
|
|
| If you just want to run one checkpoint against one dataset, the benchmark script |
| dispatches to these three eval entrypoints (all under `examples/wan2.1_fun/`): |
|
|
| | dataset | script | |
| |---|---| |
| | mvs_synth, dl3dv, re10k | `eval_track4d360_hybrid_dense_static_scene_novel_traj.py` | |
| | kubric | `eval_track4d360_hybrid_dense_kubric_novel_traj.py` | |
| | syn4d | `eval_track4d360_hybrid_dense_syn4d_novel_traj.py` | |
|
|
| All three use `track4d_360.shared_args` as of 2026-04-23, so they accept the |
| full warped + plucker + dense + track flag set. **The exact per-variant flags are |
| listed in the recipe below; do not omit any of them. `build_eval_pipeline` reads |
| `warped_condition_mode` via `getattr(..., "off")`, so an omitted flag on a |
| warped checkpoint silently runs the model in the wrong architecture.** |
|
|
| ### Per-variant eval-flag recipe |
|
|
| Must match training: silent mismatches are the #1 source of wrong results. |
| See `doc/track4d-360/2026-04-23-clipfv-4models-288x512-benchmark.md` §2 bug log. |
|
|
| ``` |
| warped_step-13000.safetensors: |
| --track_injection_mode single |
| --warped_condition_mode latent_fuse |
| --warped_appearance_fusion concat_proj |
| --warped_geom_only_dense |
| --dense_in_channels 2 |
| |
| static13k_step-13500.safetensors |
| dynamic5k_step-5000.safetensors |
| ismb288_3k_step-3000.safetensors: |
| --track_injection_mode per_block |
| --track_injection_block_mode concat_project |
| --warped_condition_mode off |
| --dense_in_channels 5 |
| ``` |
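
For ad-hoc runs that bypass the master script, the recipe above can be kept in one place as a small lookup that fails loudly on an unknown tag instead of silently falling back to defaults. This is a sketch: the function name `variant_flags` is hypothetical, the variant tags are the `MODEL=` values used by the master script, and the flags are copied from the recipe above.

```bash
# Hypothetical helper: print the per-variant eval flags for a model tag.
# Returns non-zero for tags that have no documented recipe.
variant_flags() {
    case "$1" in
        warped)
            echo "--track_injection_mode single --warped_condition_mode latent_fuse --warped_appearance_fusion concat_proj --warped_geom_only_dense --dense_in_channels 2"
            ;;
        static13k|dynamic5k|ismb288_3k)
            echo "--track_injection_mode per_block --track_injection_block_mode concat_project --warped_condition_mode off --dense_in_channels 5"
            ;;
        *)
            echo "unknown variant: $1" >&2
            return 1
            ;;
    esac
}
```

Expand it unquoted, for example `$(variant_flags warped)`, so the shell splits it into separate arguments alongside the shared flags.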
|
|
| Shared across all 4: |
|
|
| ``` |
| --use_plucker_camera_control |
| --enable_v2v_plucker_camera_control |
| --use_query_frame_impulse_condition |
| --use_dense_branch |
| --dense_proj_dim 32 |
| --dense_num_residual_blocks 2 |
| --dense_alpha_track 1.0 |
| --track_config config/track4d_360/default_conv3d_patchify_srcdepth.yaml |
| --num_inference_steps 50 |
| --cfg_scale 1.0 |
| --sigma_shift 5.0 |
| --seed 42 |
| ``` |
|
|
| And the base-DiT init path is always `weights/wan21-1p3b/diffusion_pytorch_model.safetensors` |
| via `--vxf_init_checkpoint` (CLAUDE.md load-order Invariant A/B: VXF init must |
| run BEFORE LoRA wrap and is required on both scratch and resume paths). |
|
|