Scal3R: Scalable Test-Time Training for Large-Scale 3D Reconstruction
Paper: arXiv:2604.08542
Scal3R is a framework for large-scale 3D scene reconstruction from long video sequences. It introduces a neural global context representation that compresses and retains long-range scene information, so the model can draw on context from across the whole sequence to improve reconstruction accuracy and consistency.
[Paper] [Project Page] [GitHub]
Use the automated installation script provided in the repository:
bash scripts/install.sh
Download the required checkpoints to data/checkpoints/:
mkdir -p data/checkpoints
hf download xbillowy/Scal3R scal3r.pt --repo-type model --local-dir data/checkpoints
curl -L https://github.com/serizba/salad/releases/download/v1.0.0/dino_salad.ckpt -o data/checkpoints/dino_salad.ckpt
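Before running inference, it can help to confirm that both downloads finished. The following is a minimal sketch, not part of the repository: the helper name `missing_checkpoints` is hypothetical, and it assumes only the two file names and the `data/checkpoints` directory used in the commands above.

```python
from pathlib import Path

# File names taken from the download commands above.
EXPECTED_CHECKPOINTS = ("scal3r.pt", "dino_salad.ckpt")


def missing_checkpoints(checkpoint_dir="data/checkpoints"):
    """Return the expected checkpoint files that are absent or empty.

    Hypothetical helper for illustration; it does not validate file contents.
    """
    root = Path(checkpoint_dir)
    missing = []
    for name in EXPECTED_CHECKPOINTS:
        path = root / name
        # An interrupted download can leave a zero-byte file behind.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing
```

Calling `missing_checkpoints()` from the repository root returns an empty list when both files are in place, and otherwise the names that still need to be downloaded.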
Run inference on a folder of images:
python -m scal3r.run --input_dir /path/to/images
You can also set an explicit tag or output directory:
python -m scal3r.run \
--input_dir /path/to/images \
--tag demo \
--output_dir data/result/custom/demo
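To process several image folders from a script, the same command line can be assembled in Python. This is a sketch under stated assumptions: `build_command` and `run_scene` are hypothetical wrapper names, and only the `--input_dir`, `--tag`, and `--output_dir` flags shown above are used.

```python
import subprocess
import sys


def build_command(input_dir, tag=None, output_dir=None):
    """Assemble the argument list for `python -m scal3r.run`.

    Hypothetical wrapper; it only forwards the flags shown in this README.
    """
    cmd = [sys.executable, "-m", "scal3r.run", "--input_dir", str(input_dir)]
    if tag is not None:
        cmd += ["--tag", str(tag)]
    if output_dir is not None:
        cmd += ["--output_dir", str(output_dir)]
    return cmd


def run_scene(input_dir, tag=None, output_dir=None):
    """Run inference for one image folder.

    Raises subprocess.CalledProcessError if the command exits with an error.
    """
    subprocess.run(build_command(input_dir, tag, output_dir), check=True)
```

For example, `run_scene("/path/to/images", tag="demo", output_dir="data/result/custom/demo")` issues the same call as the multi-line command above. Passing arguments as a list avoids shell quoting problems with paths that contain spaces.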
For details on arguments such as --block_size and --overlap_size, see the GitHub repository.
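The repository is the authority on what these two arguments mean. As a purely illustrative sketch, assuming that `--block_size` is the number of frames per block and `--overlap_size` is the number of frames shared by consecutive blocks, a long sequence could be partitioned as follows. The function `split_into_blocks` is hypothetical and is not taken from the Scal3R code.

```python
def split_into_blocks(num_frames, block_size, overlap_size):
    """Return (start, end) frame index pairs, end exclusive, covering all frames.

    Illustrative only: assumes consecutive blocks share `overlap_size` frames.
    The actual Scal3R chunking logic may differ.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    if not 0 <= overlap_size < block_size:
        raise ValueError("overlap_size must be in [0, block_size)")
    stride = block_size - overlap_size  # frames advanced per block
    blocks = []
    start = 0
    while start < num_frames:
        end = min(start + block_size, num_frames)
        blocks.append((start, end))
        if end == num_frames:
            break  # the last block reached the end of the sequence
        start += stride
    return blocks


print(split_into_blocks(10, 4, 1))  # [(0, 4), (3, 7), (6, 10)]
```

Under this assumption, a larger overlap gives neighboring blocks more shared frames at the cost of more blocks to process for the same sequence.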
@misc{xie2026scal3rscalabletesttimetraining,
title={Scal3R: Scalable Test-Time Training for Large-Scale 3D Reconstruction},
author={Tao Xie and Peishan Yang and Yudong Jin and Yingfeng Cai and Wei Yin and Weiqiang Ren and Qian Zhang and Wei Hua and Sida Peng and Xiaoyang Guo and Xiaowei Zhou},
year={2026},
eprint={2604.08542},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2604.08542},
}