---
license: other
pipeline_tag: other
tags:
- 3d-tracking
- video-understanding
- 4d-reconstruction
- computer-vision
---

# Track4World: Feedforward World-centric Dense 3D Tracking of All Pixels

Track4World is a feedforward model for efficient, holistic 3D tracking of every pixel of a monocular video in a world-centric coordinate system. Built on a global 3D scene representation, Track4World applies a novel 3D correlation scheme to simultaneously estimate the pixel-wise dense 2D and 3D flow between arbitrary frame pairs.

*   **Paper:** [Track4World: Feedforward World-centric Dense 3D Tracking of All Pixels](https://huggingface.co/papers/2603.02573)
*   **Project Page:** [jiah-cloud.github.io/Track4World](https://jiah-cloud.github.io/Track4World.github.io/)
*   **Repository:** [GitHub Repository](https://github.com/TencentARC/Track4World)

---

## 🖼️ Framework

From a monocular video, Track4World estimates the dense 3D scene flow of every pixel between arbitrary frame pairs in a global feedforward manner. This enables efficient, dense 3D tracking in the world-centric coordinate system.

---

## ⚙️ Setup and Installation

```bash
# Clone the repository with submodules
git clone --recursive https://github.com/TencentARC/Track4World.git
cd Track4World

# Create and activate environment
conda create -n track4world python=3.11
conda activate track4world

# Install PyTorch
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu121

# Install dependencies
pip install -r requirements.txt
```

Please refer to the [official GitHub README](https://github.com/TencentARC/Track4World) for detailed instructions on installing third-party modules and downloading weights.
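
Before moving on to the third-party modules, it can help to confirm that the PyTorch build installed above is visible inside the active environment. The snippet below is a minimal sketch, not part of the repository: `check_torch` is a hypothetical helper name, and it only relies on the standard `torch.__version__` and `torch.cuda.is_available()` calls.

```bash
# Hypothetical helper (not shipped with Track4World): report the installed
# PyTorch version and whether CUDA is usable, or print a fallback message
# when torch cannot be imported in the current environment.
check_torch() {
    python -c "import torch; print('torch', torch.__version__, 'cuda available:', torch.cuda.is_available())" 2>/dev/null \
        || echo "torch is not importable in this environment"
}

check_torch
```

If the fallback message appears, the `track4world` environment is probably not activated, or the PyTorch install step did not complete.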

---

## 🚀 Sample Usage

You can perform tracking and reconstruction on the provided demo video using the following commands:

### First-Frame 3D Tracking (`3d_ff`)

```bash
python demo.py \
    --mp4_path demo_data/cat.mp4 \
    --mode 3d_ff \
    --Ts -1 \
    --save_base_dir results/cat
```

### Dense Tracking: Every Pixel, Every Frame (`3d_efep`)

```bash
python demo.py \
    --mp4_path demo_data/cat.mp4 \
    --coordinate world_depthanythingv3 \
    --mode 3d_efep \
    --Ts -1 \
    --ckpt_init checkpoints/track4world_da3.pth \
    --save_base_dir results/cat
```
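
To process several videos, the first-frame command above can be wrapped in a loop. The following is a rough sketch under stated assumptions: `track_dir` and the `DRY_RUN` variable are hypothetical names introduced here for illustration, the one-output-folder-per-video layout is a choice of this sketch rather than something the repository prescribes, and only the `demo.py` flags shown above are used.

```bash
# Hypothetical helper (not part of the repository): run first-frame tracking
# (3d_ff) on every .mp4 file in a directory, writing each result to its own
# sub-folder. Set DRY_RUN=1 to print the commands instead of running them.
track_dir() {
    local in_dir="$1" out_root="$2"
    local video name
    for video in "$in_dir"/*.mp4; do
        [ -e "$video" ] || continue            # glob matched nothing
        name="$(basename "$video" .mp4)"       # e.g. demo_data/cat.mp4 -> cat
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo python demo.py --mp4_path "$video" --mode 3d_ff --Ts -1 --save_base_dir "$out_root/$name"
        else
            python demo.py --mp4_path "$video" --mode 3d_ff --Ts -1 --save_base_dir "$out_root/$name"
        fi
    done
}

# Preview the commands for every video in demo_data without executing them.
DRY_RUN=1 track_dir demo_data results
```

Dropping `DRY_RUN=1` runs the commands for real. The same pattern should carry over to the `3d_efep` mode by adding the `--coordinate` and `--ckpt_init` flags from the example above.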

---

## Citation

If you find Track4World useful for your research, please cite:

```bibtex
@article{lu2026track4world,
  title   = {Track4World: Feedforward World-Centric Dense 3D Tracking of All Pixels},
  author  = {Jiahao Lu and Jiayi Xu and Wenbo Hu and Ruijie Zhu and Chengfeng Zhao and Sai-Kit Yeung and Ying Shan and Yuan Liu},
  journal = {arXiv preprint arXiv:2603.02573},
  year    = {2026}
}
```