GS3LAM — Gaussian Semantic Splatting SLAM

Part of the ANIMA Intelligence Compiler Suite by Robot Flow Labs.

Wave 7 | Domain: SLAM | Module: slam-gs3lam

Paper

GS3LAM: Gaussian Semantic Splatting SLAM
Linfei Li, Lin Zhang, Zhong Wang, Ying Shen
ACM MM 2024 | arXiv 2603.27781 | DOI

Architecture

Dense semantic RGB-D SLAM using a Semantic Gaussian Field (SG-Field). Each Gaussian stores position, covariance, opacity, RGB color, and a 16-dimensional semantic feature vector. A lightweight 1x1 Conv2d decoder maps semantic features to per-pixel class logits.
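A 1x1 convolution is a per-pixel linear map over channels, so the decoder step can be illustrated without any dependencies. The sketch below is not the module's actual decoder; it only mirrors what a `torch.nn.Conv2d(16, num_classes, kernel_size=1)` layer computes. The names `decode_semantics`, `argmax_labels`, and `NUM_CLASSES = 4` are hypothetical and chosen for illustration.

```python
import random

FEATURE_DIM = 16  # semantic feature size per Gaussian, as stated above
NUM_CLASSES = 4   # hypothetical; the real class count depends on the dataset


def decode_semantics(features, weight, bias):
    """Apply a 1x1 convolution: features[C][H][W] -> logits[K][H][W]."""
    channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    logits = []
    for k in range(len(weight)):
        # Each output pixel is a weighted sum over the input channels
        # at the same pixel location, plus a per-class bias.
        plane = [
            [
                bias[k] + sum(weight[k][c] * features[c][y][x] for c in range(channels))
                for x in range(width)
            ]
            for y in range(height)
        ]
        logits.append(plane)
    return logits


def argmax_labels(logits):
    """Pick the highest-scoring class index at every pixel."""
    height, width = len(logits[0]), len(logits[0][0])
    return [
        [max(range(len(logits)), key=lambda k: logits[k][y][x]) for x in range(width)]
        for y in range(height)
    ]


# Toy input standing in for a rendered semantic feature map (random values).
rng = random.Random(0)
height, width = 2, 3
features = [
    [[rng.uniform(-1, 1) for _ in range(width)] for _ in range(height)]
    for _ in range(FEATURE_DIM)
]
weight = [[rng.uniform(-1, 1) for _ in range(FEATURE_DIM)] for _ in range(NUM_CLASSES)]
bias = [0.0] * NUM_CLASSES

logits = decode_semantics(features, weight, bias)
labels = argmax_labels(logits)
print(len(logits), len(logits[0]), len(logits[0][0]))  # classes, height, width
```

Because the kernel is 1x1, no spatial context is mixed in: each pixel's class logits depend only on the 16 rendered feature values at that pixel.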

Core components:

  • SG-Field: 3D Gaussian representation with semantic features
  • Differentiable Splatting: CUDA-accelerated rendering via gaussian-semantic-rasterization
  • Tracking: Frame-to-model pose optimization (rotation + translation)
  • Mapping: Joint optimization of Gaussians + semantic decoder
  • DSR: Depth-adaptive Scale Regularization
  • RSKM: Random Sampling-based Keyframe Mapping

Exported Formats

| Format | File | Use Case |
| --- | --- | --- |
| SafeTensors (field) | pytorch/gs3lam_office2_v1_field.safetensors | Gaussian field parameters |
| SafeTensors (decoder) | pytorch/gs3lam_office2_v1_decoder.safetensors | Semantic decoder weights |
| ONNX (decoder) | onnx/gs3lam_office2_v1_decoder.onnx | Cross-platform decoder inference |
| Poses (npy) | pytorch/gs3lam_office2_v1_poses.npy | Estimated camera trajectory [N,4,4] |
| Checkpoint | checkpoints/best.pt | Full checkpoint for resuming |

Note: TensorRT engines must be generated on target hardware due to architecture-specific compilation.

Training Details

  • Scene: Replica office2 (2000 frames, 1200x680 rendered at 600x340)
  • Hardware: NVIDIA L4 23GB
  • Config: 20 tracking iterations, 15 mapping iterations (L4-tuned)
  • Gaussians: 21K (capped at 80K with periodic pruning)
  • Duration: ~5.5 hours
  • CUDA Extension: gaussian-semantic-rasterization (sm_89)

Current Metrics (L4 baseline, reduced resolution)

| Metric | Value | Paper Target |
| --- | --- | --- |
| PSNR | 3.39 dB | >= 35.0 dB |
| ATE | 178 cm | <= 0.50 cm |

Note: These metrics reflect an L4-constrained run at half resolution with aggressive Gaussian capping (21K vs paper's ~775K). Paper-quality reproduction requires full resolution on A100/H100 hardware.
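For reference, the two metrics can be computed as in the sketch below. It uses the standard definitions (PSNR from mean squared error, ATE as the RMSE of translation error) and is not the evaluation code used for the numbers in this section. The function names are hypothetical, and the ATE helper assumes the trajectories are already expressed in the same frame, so no trajectory alignment is performed.

```python
import math


def psnr(rendered, reference, max_value=1.0):
    """PSNR in dB between two equally sized flat lists of pixel intensities."""
    if len(rendered) != len(reference) or not rendered:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(rendered, reference)) / len(rendered)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def ate_rmse(estimated, ground_truth):
    """RMSE of translation error between two lists of 4x4 poses.

    Poses are nested lists in row-major order; the translation is the
    last column of the first three rows. Assumes both trajectories share
    one coordinate frame (no alignment step is applied here).
    """
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and of equal length")
    squared_errors = [
        sum((est[i][3] - gt[i][3]) ** 2 for i in range(3))
        for est, gt in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def translation_pose(x, y, z):
    """Build a 4x4 pose with identity rotation (helper for the example)."""
    return [[1.0, 0.0, 0.0, x], [0.0, 1.0, 0.0, y], [0.0, 0.0, 1.0, z], [0.0, 0.0, 0.0, 1.0]]


gt = [translation_pose(0.0, 0.0, 0.0), translation_pose(1.0, 0.0, 0.0)]
est = [translation_pose(0.0, 0.0, 0.0), translation_pose(4.0, 4.0, 0.0)]
print(ate_rmse(est, gt))
print(psnr([0.0, 0.0], [0.1, 0.1]))
```

The ATE result is in the units of the pose translations; if the poses are stored in metres (an assumption, not confirmed here), multiply by 100 to compare against the centimetre values in the table.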

Usage

import torch
from anima_slam_gs3lam.export import load_checkpoint, reconstruct_field, reconstruct_decoder

ckpt = load_checkpoint("checkpoints/best.pt")
field = reconstruct_field(ckpt, device="cuda")
decoder = reconstruct_decoder(ckpt, device="cuda")

API

# Start service
python -m anima_slam_gs3lam

# Health, readiness, and info endpoints
curl http://localhost:8080/health
curl http://localhost:8080/ready
curl http://localhost:8080/info
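The same checks can be scripted from Python using only the standard library. This is a minimal sketch: the endpoint paths and port come from the commands above, but treating HTTP 200 as "up" is an assumption, since the response format is not documented here. The helper names `http_status`, `probe_service`, and `is_up` are hypothetical.

```python
import urllib.error
import urllib.request

BASE_URL = "http://localhost:8080"
ENDPOINTS = ("/health", "/ready", "/info")


def http_status(url, timeout=2.0):
    """Return the HTTP status code for a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timeout, DNS failure, ...


def probe_service(base_url=BASE_URL, endpoints=ENDPOINTS, fetch=http_status):
    """Query each endpoint and return a {path: status} mapping."""
    return {path: fetch(base_url.rstrip("/") + path) for path in endpoints}


def is_up(statuses):
    """Assumption: the service is considered up when every endpoint returns 200."""
    return all(code == 200 for code in statuses.values())
```

Calling `is_up(probe_service())` after starting the service gives a single boolean suitable for a startup script; the `fetch` parameter exists so the probe can be exercised without a running server.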

Docker

docker compose -f docker-compose.serve.yml --profile api up -d

CUDA Extension

The gaussian-semantic-rasterization CUDA kernel is shared at: /mnt/forge-data/shared_infra/cuda_extensions/gaussian_semantic_rasterization/

License

Apache 2.0 — Robot Flow Labs / AIFLOW LABS LIMITED
