LAM3C is a self-supervised learning method trained on video-generated point clouds reconstructed from unlabeled indoor walkthrough videos. This repository provides pretrained Point Transformer V3 (PTv3) backbones for feature extraction and downstream 3D scene understanding.

> [!IMPORTANT]
> - LAM3C is not a raw-video model. The released checkpoints take point clouds as input, not videos.
> - The expected per-point input is 9D: XYZ + RGB + normals.
> - The backbone checkpoints are feature extractors. They do not include a task-specific segmentation head unless explicitly stated.

**arXiv:** [3D sans 3D Scans: Scalable Pre-training from Video-Generated Point Clouds (CVPR 2026)](https://arxiv.org/abs/2512.23042)
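As a minimal illustration of the 9-D per-point layout noted above, the sketch below assembles an `(N, 9)` array from XYZ, RGB, and normal channels. This is a hypothetical NumPy example; the array names and random data are not from this repository.

```python
import numpy as np

# Hypothetical data: 4 points, each with XYZ + RGB + normal (not repo code)
rng = np.random.default_rng(0)
num_points = 4

xyz = rng.random((num_points, 3), dtype=np.float32)        # point coordinates
rgb = rng.random((num_points, 3), dtype=np.float32)        # colors in [0, 1]
normals = rng.random((num_points, 3), dtype=np.float32)
normals /= np.linalg.norm(normals, axis=1, keepdims=True)  # unit-length normals

# Concatenate along the feature axis: one 9-D row per point (XYZ + RGB + normal)
points_9d = np.concatenate([xyz, rgb, normals], axis=1)
print(points_9d.shape)  # (4, 9)
```

How RGB is scaled and how normals are estimated may differ per dataset; check the data-preparation scripts before feeding points to a checkpoint.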
## What makes LAM3C different?

Most 3D self-supervised learning methods rely on real 3D scans, which are expensive to collect at scale. LAM3C instead learns from RoomTours, a large collection of point clouds reconstructed from unlabeled room-tour videos gathered from the web.

The method combines:
- **RoomTours**, a scalable video-generated point cloud (VGPC) pre-training dataset