MUNINN (SD-GS) — ANIMA Wave-6 Module

Part of the ANIMA Perception Suite by Robot Flow Labs.

Paper

SD-GS: Structured Deformable 3D Gaussians for Efficient Dynamic Scene Reconstruction
Wei Yao, Shuzhao Xie, Letian Li, Weixiang Zhang, Zhixin Lai, Shiqi Dai, Ke Zhang, Zhi Wang
arXiv:2507.07465 (Jul 2025)

Architecture

MUNINN implements a hierarchical deformable anchor grid for 4D Gaussian Splatting:

  • Deformable Anchor Grid: 48x48x48 grid (110,592 anchors, 442,368 Gaussians) covering the scene bounding box
  • Spatiotemporal Deformation Field: MLP predicting per-anchor position/scale/rotation/opacity offsets per frame
  • Anchor-to-Gaussian Derivation: Each anchor generates 4 local Gaussians via learned offsets
  • CUDA Rasterization: diff-gaussian-rasterization kernel for real-time rendering
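The anchor and Gaussian counts above follow from the grid dimensions. The sketch below reproduces that arithmetic and illustrates how each anchor could emit its local Gaussians. It is a minimal illustration, not the module's implementation: cell-centered anchor placement, the function names `anchor_centers` and `derive_gaussians`, the unit bounding box, and the fixed example offsets are all assumptions (in the model the offsets are learned).

```python
import itertools

GRID_SIZE = 48            # anchors per axis (from the text)
GAUSSIANS_PER_ANCHOR = 4  # local Gaussians per anchor (from the text)


def anchor_centers(bbox_min, bbox_max, grid_size=GRID_SIZE):
    """Yield anchor positions at the cell centers of a regular grid over the box.

    Cell-centered placement is an assumption made for this sketch.
    """
    steps = [(hi - lo) / grid_size for lo, hi in zip(bbox_min, bbox_max)]
    for idx in itertools.product(range(grid_size), repeat=3):
        yield tuple(lo + (i + 0.5) * s for lo, i, s in zip(bbox_min, idx, steps))


def derive_gaussians(anchor, offsets):
    """Return one Gaussian center per offset, each displaced from the anchor."""
    return [tuple(a + o for a, o in zip(anchor, off)) for off in offsets]


num_anchors = GRID_SIZE ** 3
num_gaussians = num_anchors * GAUSSIANS_PER_ANCHOR

# Hypothetical fixed offsets; the real model would learn these per anchor.
example_offsets = [
    (0.01, 0.0, 0.0),
    (-0.01, 0.0, 0.0),
    (0.0, 0.01, 0.0),
    (0.0, -0.01, 0.0),
]
first_anchor = next(anchor_centers((-1.0, -1.0, -1.0), (1.0, 1.0, 1.0)))
first_gaussians = derive_gaussians(first_anchor, example_offsets)

print(num_anchors, num_gaussians)
```

Storing attributes per anchor rather than per Gaussian is what keeps the parameter count tied to the grid size; the deformation field then only needs to move anchors over time.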

Key results: 60% reduction in model size and 100% improvement in FPS (about 2x) over dense 4DGS, with visual quality maintained.

D-NeRF Benchmark Results

Scene          val_loss  .pth   ONNX    TRT
bouncingballs  0.449     40MB   7.7MB   7.7MB
trex           0.459     40MB   7.7MB   7.7MB
hook           0.469     40MB   7.7MB   7.7MB
mutant         0.470     40MB   7.7MB   7.7MB
lego           0.471     40MB   7.7MB   7.7MB
jumpingjacks   0.475     40MB   7.7MB   7.7MB
standup        0.475     40MB   7.7MB   7.7MB
hellwarrior    0.487     40MB   7.7MB   7.7MB

Exported Formats (per scene)

Format           File Pattern                            Use Case
PyTorch (.pth)   pytorch/muninn_{scene}_v1.pth           Training, fine-tuning
SafeTensors      pytorch/muninn_{scene}_v1.safetensors   Fast loading, safe
ONNX             onnx/muninn_{scene}_v1.onnx             Cross-platform inference
TensorRT FP16    tensorrt/muninn_{scene}_v1_fp16.trt     Edge deployment (Jetson/L4)
TensorRT FP32    tensorrt/muninn_{scene}_v1_fp32.trt     Full precision inference
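The file patterns above can be expanded per scene with a small helper. This is a sketch only: the `artifact_path` function and the short format keys (`pth`, `trt_fp16`, and so on) are hypothetical names introduced here, while the patterns and scene names are taken from this card.

```python
from pathlib import PurePosixPath

# File patterns as listed in the export table.
FILE_PATTERNS = {
    "pth": "pytorch/muninn_{scene}_v1.pth",
    "safetensors": "pytorch/muninn_{scene}_v1.safetensors",
    "onnx": "onnx/muninn_{scene}_v1.onnx",
    "trt_fp16": "tensorrt/muninn_{scene}_v1_fp16.trt",
    "trt_fp32": "tensorrt/muninn_{scene}_v1_fp32.trt",
}

# D-NeRF scenes listed in the benchmark results.
SCENES = (
    "bouncingballs", "trex", "hook", "mutant",
    "lego", "jumpingjacks", "standup", "hellwarrior",
)


def artifact_path(scene, fmt):
    """Return the repository-relative path of one exported artifact."""
    if scene not in SCENES:
        raise ValueError(f"unknown scene: {scene!r}")
    if fmt not in FILE_PATTERNS:
        raise ValueError(f"unknown format: {fmt!r}")
    return PurePosixPath(FILE_PATTERNS[fmt].format(scene=scene))


print(artifact_path("hellwarrior", "safetensors"))
```

Validating the scene and format up front gives a clear error instead of a missing-file failure later in a download or loading step.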

Usage

import torch
from safetensors.torch import load_file

from anima_muninn.core.model import MuninModel

# Load the weights from SafeTensors
state = load_file("pytorch/muninn_hellwarrior_v1.safetensors")
model = MuninModel(bbox=(...), grid_size=48, ...)  # remaining constructor arguments elided
model.load_state_dict(state)
model.eval()

# Render a frame (no gradients are needed for inference)
with torch.no_grad():
    output = model(poses, intrinsics, times)
rendered_image = output["rendered"]  # (B, 3, 800, 800)

Training

  • Hardware: 5x NVIDIA L4 (23GB each)
  • Framework: PyTorch 2.5.1 + CUDA 12.1
  • Grid: 48x48x48 anchors, 4 Gaussians per anchor
  • Batch size: 36 (71% VRAM)
  • Optimizer: Adam (lr=0.0016), learning-rate decay gamma=0.95
  • Early stopping: patience=200 epochs
  • Config: See configs/dnerf.toml
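The learning-rate and early-stopping settings above can be sketched in plain Python. This assumes that `gamma` is a per-epoch exponential decay factor (the behavior of PyTorch's `torch.optim.lr_scheduler.ExponentialLR`) and that patience counts consecutive epochs without a new best validation loss; the card does not confirm either, and `lr_at_epoch` and `EarlyStopper` are hypothetical names. The authoritative settings are in configs/dnerf.toml.

```python
BASE_LR = 0.0016  # from the training settings
GAMMA = 0.95      # assumed to be a per-epoch exponential decay factor
PATIENCE = 200    # epochs without improvement before stopping


def lr_at_epoch(epoch, base_lr=BASE_LR, gamma=GAMMA):
    """Learning rate after `epoch` decay steps under exponential decay."""
    return base_lr * gamma ** epoch


class EarlyStopper:
    """Signal a stop once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=PATIENCE):
        self.patience = patience
        self.best = float("inf")
        self.stale_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience


# Example with a short patience so the behavior is easy to follow.
stopper = EarlyStopper(patience=2)
decisions = [stopper.step(loss) for loss in (0.50, 0.49, 0.49, 0.50)]
print(lr_at_epoch(10), decisions)
```

Under this assumption the learning rate falls below 1% of its initial value after roughly 90 epochs, so a patience of 200 mostly guards against long plateaus late in training.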

ANIMA Stack Position

  • Tier: 2 (Perception)
  • Wave: 6
  • Upstream: THOR (poses), FREYA (point clouds)
  • Downstream: SURT (occupancy), MAGNI (temporal coherence)
  • Sibling: HUGINN (complementary 4DGS compression)

License

Apache 2.0 -- Robot Flow Labs / AIFLOW LABS LIMITED
