---
license: mit
tags:
- slam
- monocular-slam
- visual-localization
- geometric-foundation-models
- pytorch
thumbnail: ./teaser.png
---

<p align="center">
  <h1 align="center">LeanGate</h1>
  <p align="center">
    Accelerating Transformer-Based Monocular SLAM via Geometric Utility Scoring
  </p>
</p>

<p align="center">
  <a href="https://lean-gate.github.io/">Project Page</a> |
  <a href="https://github.com/XinmiaoShawn/LeanGate-code">Code</a> |
  <a href="#">Paper Coming Soon</a>
</p>

<p align="center">
  <img src="./teaser.png" width="95%">
</p>

## Overview

LeanGate is a lightweight frame-gating model for transformer-based monocular SLAM.
It predicts the geometric utility of an incoming frame before expensive dense reconstruction,
allowing the system to skip redundant frames early and significantly reduce computation.

## Highlights

- Filters out more than 90% of redundant frames before heavy SLAM processing
- Reduces tracking FLOPs by more than 85%
- Achieves around 5x end-to-end throughput speedup
- Maintains competitive tracking and mapping accuracy

## Model

- Checkpoint: `model.pt`
- Framework: PyTorch
- Task: frame utility scoring for monocular SLAM
- Input: current frame and reference/keyframe features or paired image representation
- Output: geometric utility score used for frame gating
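As a rough illustration of how a per-frame utility score can drive gating, the sketch below keeps only frames whose score reaches a threshold. It is not the LeanGate implementation: the function name `gate_frames`, the threshold value, and the convention that a higher score means a more useful frame are all assumptions made for this example.

```python
from typing import List, Sequence


def gate_frames(scores: Sequence[float], threshold: float) -> List[int]:
    """Return the indices of frames whose utility score is >= threshold.

    Assumption: a higher score means the frame adds more geometric
    information. The actual LeanGate decision rule may differ.
    """
    return [i for i, score in enumerate(scores) if score >= threshold]


# Hypothetical scores for six incoming frames.
scores = [0.91, 0.12, 0.08, 0.40, 0.77, 0.05]
kept = gate_frames(scores, threshold=0.5)
print(kept)  # [0, 4]
```

Only the kept indices would be forwarded to the expensive SLAM backend; the remaining frames are skipped before dense reconstruction.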

## Method

<p align="center">
  <img src="./system.png" width="95%">
</p>

## Quick Start

This release lets you download the public LeanGate checkpoint, run LeanGate on prepared `TUM`, `7SCENES`, or `EUROC` scenes, export sparse RGB manifests, and optionally launch MASt3R-SLAM on the filtered sequence.

### 1. Install
Use `python3` and install a PyTorch version matching your CUDA runtime first.

```bash
pip install -e .
pip install -e third_party/MASt3R-SLAM/thirdparty/mast3r
pip install -e third_party/MASt3R-SLAM/thirdparty/in3d
pip install --no-build-isolation -e third_party/MASt3R-SLAM
```

### 2. Download the released checkpoint
The public LeanGate checkpoint is hosted at:

- Repo: `ShawnX98/LeanGate`
- URL: `https://huggingface.co/ShawnX98/LeanGate`
- File: `leangate.pt`

Download it with:

```bash
python3 scripts/download_checkpoints.py --output-root checkpoints --repo-id ShawnX98/LeanGate
```

This will place the checkpoint at:

```text
checkpoints/leangate.pt
```

### 3. Run LeanGate on a prepared benchmark dataset
Example for `TUM`:

```bash
python3 scripts/generate_rgb_lists.py \
  --dataset-type TUM \
  --dataset-root /data/tum \
  --output-root outputs/predictions \
  --device cuda:0
```

Supported benchmark inputs:

- `TUM`
- `7SCENES`
- `EUROC`

Expected dataset layouts are documented in [`docs/dataset_layouts.md`](docs/dataset_layouts.md).

### 4. Run the plain RGB folder demo
For a simple folder of RGB frames:

```bash
./demo.sh \
  --folder /data/my_rgb_frames \
  --output-root outputs/demo \
  --device cuda:0
```

This processes frames in sorted filename order and writes the filtered manifest to:

```text
outputs/demo/leangate/<folder_name>.txt
```
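Because frames are processed in sorted filename order, it can help to check how a folder will be ordered before running the demo. The sketch below assumes a plain lexicographic sort on file names and an illustrative set of image extensions; neither detail is confirmed by the demo script, and `list_frames` is a hypothetical helper.

```python
import tempfile
from pathlib import Path
from typing import List

# Assumed set of image extensions; the demo may accept a different set.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def list_frames(folder: str) -> List[Path]:
    """List image files in `folder`, sorted lexicographically by file name."""
    files = (
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    return sorted(files, key=lambda p: p.name)


# Demonstrate the ordering on a throwaway folder with empty files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["frame_2.png", "frame_10.png", "frame_1.png", "notes.txt"]:
        (Path(tmp) / name).touch()
    names = [p.name for p in list_frames(tmp)]
    print(names)  # ['frame_1.png', 'frame_10.png', 'frame_2.png']
```

Under a lexicographic sort, `frame_10.png` comes before `frame_2.png`, so zero-padded names such as `frame_0002.png` are the safer choice for keeping frames in temporal order.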

### 5. Optional: launch MASt3R-SLAM on the sparse sequence
Single scene:

```bash
python3 scripts/run_slam_scene.py \
  --dataset-type TUM \
  --dataset-root /data/tum \
  --scene-id rgbd_dataset_freiburg1_desk \
  --predictions-root outputs/predictions \
  --output-root outputs/slam
```

Full dataset:

```bash
python3 scripts/run_slam_dataset.py \
  --dataset-type TUM \
  --dataset-root /data/tum \
  --predictions-root outputs/predictions \
  --output-root outputs/slam
```
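When only a subset of scenes is needed, the single-scene command above can be assembled in a loop. The sketch below only builds and prints the argument lists, using the flags shown in this section; the helper name `slam_scene_command` and the second scene name are illustrative.

```python
import shlex
from typing import List


def slam_scene_command(
    dataset_type: str,
    dataset_root: str,
    scene_id: str,
    predictions_root: str,
    output_root: str,
) -> List[str]:
    """Build the argv list for scripts/run_slam_scene.py for one scene."""
    return [
        "python3", "scripts/run_slam_scene.py",
        "--dataset-type", dataset_type,
        "--dataset-root", dataset_root,
        "--scene-id", scene_id,
        "--predictions-root", predictions_root,
        "--output-root", output_root,
    ]


# Example scene subset; replace with the scenes present in your dataset root.
scenes = ["rgbd_dataset_freiburg1_desk", "rgbd_dataset_freiburg1_room"]
commands = [
    slam_scene_command("TUM", "/data/tum", scene, "outputs/predictions", "outputs/slam")
    for scene in scenes
]
for cmd in commands:
    print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` from the repository root to launch the run.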

### Outputs
LeanGate inference produces:

- `outputs/predictions/<dataset_slug>/leangate/<scene>.txt`
- `outputs/predictions/<dataset_slug>/leangate/scores/<scene>_scores.csv`
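For scripting against these files, the two path templates above can be expanded with `pathlib`. This is a small sketch: the helper name `leangate_outputs` is hypothetical, and the `dataset_slug` value `"tum"` is a guess used for illustration, since the exact slug format is not specified here.

```python
from pathlib import Path
from typing import Dict


def leangate_outputs(output_root: str, dataset_slug: str, scene: str) -> Dict[str, Path]:
    """Expand the manifest and score-file path templates for one scene."""
    base = Path(output_root) / dataset_slug / "leangate"
    return {
        "manifest": base / f"{scene}.txt",
        "scores": base / "scores" / f"{scene}_scores.csv",
    }


# "tum" is an assumed slug; check the folder actually written under outputs/predictions.
paths = leangate_outputs("outputs/predictions", "tum", "rgbd_dataset_freiburg1_desk")
for kind, path in paths.items():
    print(kind, path.as_posix())
```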

Optional MASt3R-SLAM outputs include:

- `outputs/slam/<dataset_slug>/leangate/<scene>/trajectory_keyframes.tum`
- `outputs/slam/<dataset_slug>/leangate/<scene>/reconstruction.ply`
- `outputs/slam/<dataset_slug>/leangate/<scene>/run_metadata.json`
- `outputs/slam/<dataset_slug>/leangate/summary.csv`
- `outputs/slam/<dataset_slug>/leangate/summary.json`
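The `.tum` extension suggests that `trajectory_keyframes.tum` uses the standard TUM trajectory format, where each line holds `timestamp tx ty tz qx qy qz qw`. Assuming that layout, a minimal parser could look like the sketch below; `Pose` and `parse_tum_trajectory` are names introduced for this example, and the sample values are made up.

```python
from typing import List, NamedTuple


class Pose(NamedTuple):
    timestamp: float
    tx: float
    ty: float
    tz: float
    qx: float
    qy: float
    qz: float
    qw: float


def parse_tum_trajectory(text: str) -> List[Pose]:
    """Parse TUM-format trajectory text, skipping blank and '#' comment lines."""
    poses = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        values = [float(v) for v in line.split()]
        if len(values) != 8:
            raise ValueError(f"expected 8 values per line, got {len(values)}: {line!r}")
        poses.append(Pose(*values))
    return poses


# Made-up sample content in the assumed format.
sample = """# timestamp tx ty tz qx qy qz qw
0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0
0.5 0.1 0.0 0.02 0.0 0.0 0.0 1.0
"""
poses = parse_tum_trajectory(sample)
print(len(poses), poses[1].tx)  # 2 0.1
```

To read a real run, pass the file contents, for example `parse_tum_trajectory(Path(path).read_text())` with `from pathlib import Path`.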