VideoLDCM

VideoLDCM is the sparse-depth completion and refinement model used for ViGeo data refinement. It takes an RGB image sequence and sparse depth maps, then runs MoGe, Poisson completion, and VideoLDCM refinement through the videoldcm.infer interface.

The checkpoint in this repository is videoldcm.pt.

Installation

conda create -n vigeo python=3.10 -y
conda activate vigeo

pip install torch==2.7.1 torchvision==0.22.1 torchaudio==2.7.1 --index-url https://download.pytorch.org/whl/cu126
pip install xformers==0.0.31 --index-url https://download.pytorch.org/whl/cu126

git clone https://github.com/aigc3d/ViGeo.git
cd ViGeo
pip install -r requirements.txt
pip install -r requirements_refine.txt
pip install -e .
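
After installation, it can be useful to confirm that the pinned packages resolved to the versions listed above. The following is a minimal sketch that uses only the standard library. The names PINS, installed_version, and check_pins are illustrative and not part of ViGeo. Local build suffixes such as +cu126 are stripped before comparison.

```python
from importlib import metadata

# Versions pinned in the installation commands above.
PINS = {
    "torch": "2.7.1",
    "torchvision": "0.22.1",
    "torchaudio": "2.7.1",
    "xformers": "0.0.31",
}


def installed_version(name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins=PINS, lookup=installed_version):
    """Map each package to 'ok', 'missing', or a mismatch description.

    check_pins is a hypothetical helper for illustration, not a ViGeo API.
    """
    report = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found is None:
            report[name] = "missing"
        elif found.split("+")[0] == wanted:  # drop local tags like "+cu126"
            report[name] = "ok"
        else:
            report[name] = f"found {found}, expected {wanted}"
    return report


if __name__ == "__main__":
    for package, status in check_pins().items():
        print(f"{package}: {status}")
```

Run inside the activated vigeo environment, the script prints one status line per pinned package.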

Quick Start

import torch

from videoldcm import videoldcm
from utils import load_depth_sequence, load_image_sequence

image_paths = ["path/to/image_000.png", "path/to/image_001.png"]
sparse_depth_paths = ["path/to/sparse_depth_000.npy", "path/to/sparse_depth_001.npy"]
device = torch.device("cuda")

image = load_image_sequence(image_paths).to(device)                # [S, 3, H, W]
sparse_depth = load_depth_sequence(sparse_depth_paths).to(device)  # [S, 1, H, W]

completion_model = videoldcm.from_pretrained("pkqbajng/VideoLDCM").eval().to(device)

with torch.inference_mode():
    output = completion_model.infer(image=image, sparse_depth=sparse_depth)

refined_depth = output["depth_pred"]  # [S, 1, H, W]
points = output["points_pred"]        # [S, H, W, 3]
confidence = output["conf_pred"]      # [S, 1, H, W]
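
The shape comments above imply that both inputs share the sequence length S and the spatial size H x W. A small pre-flight check can surface mismatches before infer is called. This is only a sketch: check_input_shapes is a hypothetical helper, not part of the videoldcm package, and it accepts plain shape tuples so that it also works with a tensor's .shape attribute.

```python
def check_input_shapes(image_shape, depth_shape):
    """Validate [S, 3, H, W] images against [S, 1, H, W] sparse depth.

    Hypothetical helper for illustration; returns (S, H, W) on success.
    """
    image_shape, depth_shape = tuple(image_shape), tuple(depth_shape)
    if len(image_shape) != 4 or image_shape[1] != 3:
        raise ValueError(f"expected image of shape [S, 3, H, W], got {image_shape}")
    if len(depth_shape) != 4 or depth_shape[1] != 1:
        raise ValueError(f"expected sparse depth of shape [S, 1, H, W], got {depth_shape}")
    s, _, h, w = image_shape
    if (depth_shape[0], depth_shape[2], depth_shape[3]) != (s, h, w):
        raise ValueError(
            f"image {image_shape} and sparse depth {depth_shape} disagree on S, H, or W"
        )
    return s, h, w
```

With the Quick Start variables, this would be called as check_input_shapes(image.shape, sparse_depth.shape) before the call to infer.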

The infer method does not run the sparse-depth mismatch filter. For the explicit data refinement pipeline with mismatch filtering and Poisson completion, see the README on the ViGeo main branch.
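
Because infer skips the mismatch filter, a quick consistency check between the refined depth and the input sparse depth can serve as a sanity test. The sketch below is not the ViGeo mismatch filter. It works on NumPy arrays with the shapes noted above (tensors can be converted with .cpu().numpy()). It assumes that a zero marks a missing sparse-depth sample and that larger confidence values are more reliable. Neither convention is stated in this README, and sparse_agreement and conf_threshold are hypothetical names.

```python
import numpy as np


def sparse_agreement(refined_depth, sparse_depth, confidence=None, conf_threshold=0.5):
    """Mean absolute error between refined and sparse depth at valid pixels.

    Assumptions (not confirmed by the README): sparse_depth == 0 means
    "no measurement", and higher confidence is better. All arrays are
    [S, 1, H, W]. Returns NaN if no pixel passes the mask.
    """
    # Pixels that carry a sparse measurement under the zero-is-missing assumption.
    valid = sparse_depth > 0
    if confidence is not None:
        # Optionally keep only pixels the model is confident about.
        valid &= confidence >= conf_threshold
    if not valid.any():
        return float("nan")
    return float(np.abs(refined_depth[valid] - sparse_depth[valid]).mean())
```

A large error at the sparse samples may indicate that the inputs contain mismatched measurements, in which case the explicit pipeline with mismatch filtering is likely the better choice.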

License

Apache License 2.0.
