GS3LAM — Gaussian Semantic Splatting SLAM
Part of the ANIMA Intelligence Compiler Suite by Robot Flow Labs.
Wave 7 | Domain: SLAM | Module: slam-gs3lam
Paper
GS3LAM: Gaussian Semantic Splatting SLAM. Linfei Li, Lin Zhang, Zhong Wang, Ying Shen. ACM MM 2024 | arXiv 2603.27781 | DOI
Architecture
Dense semantic RGB-D SLAM using a Semantic Gaussian Field (SG-Field). Each Gaussian stores position, covariance, opacity, RGB color, and a 16-dimensional semantic feature vector. A lightweight 1x1 Conv2d decoder maps semantic features to per-pixel class logits.
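As a rough illustration of the decoder described above, a 1x1 convolution is equivalent to applying the same linear map to the 16 semantic channels of every pixel. The sketch below uses NumPy as a stand-in for the Conv2d layer; `NUM_CLASSES`, `decode_semantics`, and the random weights are assumptions for illustration, not values or names from the exported decoder.

```python
import numpy as np

SEMANTIC_DIM = 16  # per-Gaussian semantic feature size stated above
NUM_CLASSES = 8    # hypothetical; the real class count depends on the dataset

def decode_semantics(features, weight, bias):
    """1x1 convolution: features [16, H, W] -> logits [num_classes, H, W]."""
    # A 1x1 kernel mixes channels independently at each pixel,
    # so it reduces to a matrix product over the channel axis.
    return np.einsum("kc,chw->khw", weight, features) + bias[:, None, None]

rng = np.random.default_rng(0)
features = rng.standard_normal((SEMANTIC_DIM, 4, 6))        # rendered feature map
weight = rng.standard_normal((NUM_CLASSES, SEMANTIC_DIM))   # stand-in decoder weights
bias = rng.standard_normal(NUM_CLASSES)

logits = decode_semantics(features, weight, bias)
labels = logits.argmax(axis=0)  # per-pixel class map, shape [H, W]
```

Because the decoder has no spatial extent, its cost scales only with the pixel count and the two channel sizes, which is what keeps it lightweight.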
Core components:
- SG-Field: 3D Gaussian representation with semantic features
- Differentiable Splatting: CUDA-accelerated rendering via gaussian-semantic-rasterization
- Tracking: Frame-to-model pose optimization (rotation + translation)
- Mapping: Joint optimization of Gaussians + semantic decoder
- DSR: Depth-adaptive Scale Regularization
- RSKM: Random Sampling-based Keyframe Mapping
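The RSKM idea can be sketched as follows, assuming that each mapping iteration optimizes against one frame drawn uniformly at random from all stored keyframes plus the current frame. The function name, the uniform distribution, and the keyframe IDs are illustrative assumptions; the actual sampling scheme in the paper and in this module may differ.

```python
import random

def sample_mapping_frames(keyframe_ids, current_id, num_iters, seed=None):
    """Pick one frame per mapping iteration, uniformly over keyframes + current frame."""
    rng = random.Random(seed)
    candidates = list(keyframe_ids) + [current_id]
    # Sampling from the whole history (not only nearby frames) keeps
    # older parts of the map in the optimization.
    return [rng.choice(candidates) for _ in range(num_iters)]

# Hypothetical keyframe IDs; 15 iterations matches the mapping setting of this run.
schedule = sample_mapping_frames([0, 5, 10, 15], current_id=20, num_iters=15, seed=0)
```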
Exported Formats
| Format | File | Use Case |
|---|---|---|
| SafeTensors (field) | pytorch/gs3lam_office2_v1_field.safetensors | Gaussian field parameters |
| SafeTensors (decoder) | pytorch/gs3lam_office2_v1_decoder.safetensors | Semantic decoder weights |
| ONNX (decoder) | onnx/gs3lam_office2_v1_decoder.onnx | Cross-platform decoder inference |
| Poses (npy) | pytorch/gs3lam_office2_v1_poses.npy | Estimated camera trajectory [N,4,4] |
| Checkpoint | checkpoints/best.pt | Full checkpoint for resuming |
Note: TensorRT engines must be generated on target hardware due to architecture-specific compilation.
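A quick sanity check on the exported trajectory can catch a corrupted or mis-shaped pose file early. The sketch below assumes the [N,4,4] array holds homogeneous rigid transforms (orthonormal rotation block, bottom row 0 0 0 1); the README does not state the pose convention, and `validate_poses` is a hypothetical helper, not part of the package.

```python
import numpy as np

def validate_poses(poses, atol=1e-5):
    """Return True if poses looks like an [N, 4, 4] stack of rigid transforms."""
    poses = np.asarray(poses)
    if poses.ndim != 3 or poses.shape[1:] != (4, 4):
        raise ValueError(f"expected shape [N, 4, 4], got {poses.shape}")
    # Bottom row of a homogeneous transform should be [0, 0, 0, 1].
    bottom_ok = np.allclose(poses[:, 3, :], [0.0, 0.0, 0.0, 1.0], atol=atol)
    # Rotation blocks should satisfy R @ R^T = I.
    rot = poses[:, :3, :3]
    ortho_ok = np.allclose(rot @ np.swapaxes(rot, 1, 2), np.eye(3), atol=atol)
    return bool(bottom_ok and ortho_ok)

# With the exported file: poses = np.load("pytorch/gs3lam_office2_v1_poses.npy")
poses = np.tile(np.eye(4), (3, 1, 1))  # stand-in trajectory of identity poses
is_valid = validate_poses(poses)
```

Optimized poses may drift slightly from exact orthonormality, so `atol` may need loosening for real data.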
Training Details
- Scene: Replica office2 (2000 frames, native 1200x680, rendered at 600x340)
- Hardware: NVIDIA L4 23GB
- Config: 20 tracking iterations, 15 mapping iterations (L4-tuned)
- Gaussians: 21K (capped at 80K with periodic pruning)
- Duration: ~5.5 hours
- CUDA Extension: gaussian-semantic-rasterization (sm_89)
Current Metrics (L4 baseline, reduced resolution)
| Metric | Value | Paper Target |
|---|---|---|
| PSNR | 3.39 dB | >= 35.0 dB |
| ATE | 178 cm | <= 0.50 cm |
Note: These metrics come from an L4-constrained run at half resolution (600x340) with aggressive Gaussian capping (21K vs. the paper's ~775K), and they fall far short of the paper targets. Paper-quality reproduction requires full resolution on A100/H100 hardware.
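For reference, the two metrics in the table above are conventionally computed as shown in this minimal sketch. It assumes images scaled to [0, 1], poses in metres, and trajectories already expressed in the same frame; the module's own evaluation code may differ, for example by aligning trajectories before computing ATE. The function names are illustrative.

```python
import numpy as np

def psnr(rendered, reference, max_val=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_val^2 / MSE)."""
    diff = np.asarray(rendered, dtype=np.float64) - np.asarray(reference, dtype=np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(max_val ** 2 / mse))

def ate_rmse_cm(est_poses, gt_poses):
    """RMSE of per-frame translation error, in cm (no alignment step)."""
    est_t = np.asarray(est_poses)[:, :3, 3]
    gt_t = np.asarray(gt_poses)[:, :3, 3]
    err_sq = np.sum((est_t - gt_t) ** 2, axis=1)  # squared distance per frame
    return float(np.sqrt(np.mean(err_sq)) * 100.0)

# Synthetic check: a uniform 0.1 intensity error and a constant 5 cm offset.
reference = np.zeros((2, 2, 3))
rendered = reference + 0.1
gt = np.tile(np.eye(4), (4, 1, 1))
est = gt.copy()
est[:, :3, 3] += np.array([0.03, 0.04, 0.0])

psnr_db = psnr(rendered, reference)
ate_cm = ate_rmse_cm(est, gt)
```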
Usage
import torch
from anima_slam_gs3lam.export import load_checkpoint, reconstruct_field, reconstruct_decoder
ckpt = load_checkpoint("checkpoints/best.pt")
field = reconstruct_field(ckpt, device="cuda")
decoder = reconstruct_decoder(ckpt, device="cuda")
API
# Start service
python -m anima_slam_gs3lam
# Health, readiness, and service info checks
curl http://localhost:8080/health
curl http://localhost:8080/ready
curl http://localhost:8080/info
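The same three endpoints can be probed from Python, for instance in a deployment script. This sketch uses only the standard library and records the HTTP status code of each endpoint; the response bodies are not documented here, so it does not parse them. `probe` and its `opener` parameter are hypothetical helpers, not part of the service.

```python
import urllib.error
import urllib.request

ENDPOINTS = ("/health", "/ready", "/info")

def probe(base_url, opener=urllib.request.urlopen, timeout=2.0):
    """Return {endpoint: HTTP status code, or None if the service is unreachable}."""
    results = {}
    for path in ENDPOINTS:
        try:
            with opener(base_url + path, timeout=timeout) as resp:
                results[path] = resp.status
        except urllib.error.HTTPError as err:
            results[path] = err.code   # server answered with an error status
        except (urllib.error.URLError, OSError):
            results[path] = None       # connection refused, timeout, DNS failure
    return results

# Usage, with the service running locally:
# print(probe("http://localhost:8080"))
```

The `opener` argument exists only so the function can be exercised without a running service.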
Docker
docker compose -f docker-compose.serve.yml --profile api up -d
CUDA Extension
The gaussian-semantic-rasterization CUDA kernel is shared at:
/mnt/forge-data/shared_infra/cuda_extensions/gaussian_semantic_rasterization/
License
Apache 2.0 — Robot Flow Labs / AIFLOW LABS LIMITED