---
pipeline_tag: image-to-3d
license: cc-by-nc-sa-4.0
---
# JOintGS: Joint Optimization of Cameras, Bodies and 3D Gaussians for In-the-Wild Monocular Reconstruction
This repository contains the pre-trained checkpoints for JOintGS, a unified framework that jointly optimizes camera extrinsics, human poses, and 3D Gaussian representations for robust, animatable 3D human avatar reconstruction from monocular video with coarse initialization.
- Paper: JOintGS: Joint Optimization of Cameras, Bodies and 3D Gaussians for In-the-Wild Monocular Reconstruction
- Repository: https://github.com/MiliLab/JOintGS
## Introduction
Reconstructing high-fidelity animatable 3D human avatars from monocular RGB videos remains challenging, particularly in unconstrained in-the-wild scenarios where camera parameters and human poses from off-the-shelf methods (e.g., COLMAP, HMR2.0) are often inaccurate.
JOintGS explicitly disentangles the foreground human from the static background, enabling a synergistic refinement loop in which each component reinforces the others: static background Gaussians anchor camera estimation through multi-view consistency; refined cameras improve human body alignment via accurate temporal correspondence; and optimized human poses in turn enhance scene reconstruction by excluding dynamic content from the static constraints.
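The alternating refinement described above can be illustrated with a deliberately simplified sketch. This is not the JOintGS implementation; it is a hypothetical 1D toy in which two coupled unknowns, a camera offset and a point coordinate, are recovered jointly by alternating gradient steps, each update using the other's current estimate (all variable names and numbers below are illustrative assumptions).

```python
# Toy sketch of alternating joint optimization (NOT the JOintGS code):
# jointly recover a camera offset `c` and a point coordinate `x` from two
# noisy 1D "views", alternating gradient steps on each unknown while the
# other is held fixed -- loosely analogous to alternating camera / body /
# Gaussian updates starting from a coarse initialization.
import numpy as np

rng = np.random.default_rng(0)
x_true, c_true = 2.0, 0.5
obs1 = x_true + rng.normal(0, 0.01)           # view 1: reference camera (offset 0)
obs2 = x_true + c_true + rng.normal(0, 0.01)  # view 2: unknown camera offset

x_hat, c_hat = 0.0, 0.0  # coarse initialization, as in the paper's setting
lr = 0.1
for step in range(500):
    # (1) refine the point with the cameras held fixed
    grad_x = 2 * (x_hat - obs1) + 2 * (x_hat + c_hat - obs2)
    x_hat -= lr * grad_x
    # (2) refine the camera with the point held fixed
    grad_c = 2 * (x_hat + c_hat - obs2)
    c_hat -= lr * grad_c

print(f"x_hat={x_hat:.3f} (true {x_true}), c_hat={c_hat:.3f} (true {c_true})")
```

The toy converges because reducing the reprojection residual for either unknown also tightens the constraint on the other, which is the same intuition behind jointly refining cameras, bodies, and Gaussians rather than fixing any of them to their coarse initial estimates.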
## License
- Source Code: The software in the associated GitHub repository is licensed under the MIT License.
- Model Weights: The pre-trained checkpoints in this repository are released under the CC BY-NC-SA 4.0 License.
## Citation
```bibtex
@article{jointgs2026,
  title={JOintGS: Joint Optimization of Cameras, Bodies and 3D Gaussians for In-the-Wild Monocular Reconstruction},
  author={Jiuhai Chen and others},
  journal={arXiv preprint arXiv:2602.04317},
  year={2026}
}
```