# Design Doc: Stream3R → Clean GLB Export with Voxel Reduction + Open3D Outlier Removal

**Author:** Brian Clark  
**Last Updated:** 2025-11-03  
**Target Component:** `predictions_to_glb()` (Stream3R repo)  
**Goal:** Integrate a high-fidelity voxel reduction and denoising stage for cleaner, lighter `.glb` outputs suitable for downstream r3f visualization.

---

## 1. Overview

The existing `predictions_to_glb()` function builds a large unfiltered point cloud directly from Stream3R predictions.  
This design adds two optional cleanup stages that:
- **(1) Voxel-reduce** redundant points into weighted centroids.
- **(2) Filter residual noise** with Open3D’s statistical & radius outlier removal.

This yields cleaner geometry, smaller GLB files, and faster loading, while preserving color and geometry fidelity.

---

## 2. Key Additions

### 2.1 New helpers

#### `voxel_reduce(points_f32, colors_u8, conf_f32, voxel_size)`
- Merges points falling within the same voxel grid cell.
- Averages position and color (in *linear* space) per voxel, weighted by confidence when it is provided.
- Returns `(points, colors, support)`.

#### `o3d_outlier_filter(points_f32, colors_u8, voxel_size, radius_mult, nb_points, nb_neighbors, std_ratio)`
- Converts NumPy arrays → Open3D `PointCloud`.
- Applies:
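  1. **Radius Outlier Removal:** enforces local density by dropping points with fewer than `nb_points` neighbors within `radius_mult × voxel_size`.
  2. **Statistical Outlier Removal:** drops sparse noise, i.e. points whose mean distance to their `nb_neighbors` nearest neighbors is more than `std_ratio` standard deviations above the cloud-wide average.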
- Converts filtered result back to NumPy + `uint8` sRGB.
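
The two filters listed above can be illustrated in plain NumPy. This is only a brute-force sketch of the criteria for small clouds, not Open3D's implementation; the helper names (`pairwise_distances`, `radius_keep_mask`, `statistical_keep_mask`) are hypothetical, and Open3D's exact boundary conventions may differ.

```python
import numpy as np


def pairwise_distances(points):
    """Dense N x N Euclidean distance matrix (only practical for small N)."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def radius_keep_mask(points, radius, nb_points):
    """Keep points that have at least nb_points other points within radius."""
    d = pairwise_distances(points)
    neighbors = (d <= radius).sum(axis=1) - 1  # subtract the point itself
    return neighbors >= nb_points


def statistical_keep_mask(points, nb_neighbors, std_ratio):
    """Keep points whose mean k-NN distance is not unusually large."""
    n = len(points)
    if n < 2:
        return np.ones(n, dtype=bool)
    d = pairwise_distances(points)
    k = min(nb_neighbors, n - 1)
    # After sorting each row, column 0 is the self-distance (0); skip it.
    knn = np.sort(d, axis=1)[:, 1:k + 1]
    mean_d = knn.mean(axis=1)
    threshold = mean_d.mean() + std_ratio * mean_d.std()
    return mean_d <= threshold
```

A point far from the rest of the cloud fails both tests: it has no neighbors inside the radius, and its mean neighbor distance is far above the threshold.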

### 2.2 Optional parameters added to `predictions_to_glb()`
| Parameter | Type | Default | Description |
|------------|------|----------|--------------|
| `voxel_size` | float \| None | None | Voxel edge length in meters; enables voxel reduction when > 0 |
| `voxel_after_conf` | bool | True | Reduce after confidence & mask filtering |
| `o3d_denoise` | bool | True | Enables Open3D outlier filtering |
| `o3d_params` | dict \| None | None | Overrides for the Open3D defaults (e.g. `radius_mult`) |

### 2.3 Processing order

predictions → confidence/bg masking → [optional] voxel_reduce() → [optional] o3d_outlier_filter() → trimesh.PointCloud → GLB


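Both stages are optional. Voxel reduction is off by default (`voxel_size=None`), while Open3D denoising is on by default; pass `o3d_denoise=False` to reproduce the original, unfiltered output.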

---

## 3. Implementation Plan

### 3.1 Import dependencies
```python
import open3d as o3d     # optional heavy dependency
import numpy as np
import trimesh
```

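### 3.2 Helper: `voxel_reduce()`

```python
def voxel_reduce(points_f32, colors_u8, conf_f32=None, voxel_size=0.02, origin=None):
    """Merge points per voxel and return (points, colors, support).

    - sRGB → linear, confidence-weighted average, linear → sRGB.
    - Key each voxel by its integer grid index. Hashing the index with large
      primes makes collisions rare but does not rule them out.
    """
```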

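A minimal sketch of how `voxel_reduce()` could be written with NumPy alone. It is not the Stream3R implementation: it swaps the prime-based hash for `np.unique` over integer grid indices (which cannot collide), defaults `origin` to the cloud minimum, and clamps weights to a small positive value; all three choices are assumptions. The helpers `srgb_to_linear` and `linear_to_srgb` are hypothetical names for the standard sRGB transfer functions.

```python
import numpy as np


def srgb_to_linear(c_u8):
    """uint8 sRGB -> float64 linear RGB in [0, 1]."""
    c = np.asarray(c_u8, dtype=np.float64) / 255.0
    return np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)


def linear_to_srgb(c_lin):
    """float linear RGB in [0, 1] -> uint8 sRGB."""
    c = np.clip(c_lin, 0.0, 1.0)
    s = np.where(c <= 0.0031308, c * 12.92, 1.055 * c ** (1.0 / 2.4) - 0.055)
    return np.round(s * 255.0).astype(np.uint8)


def voxel_reduce(points_f32, colors_u8, conf_f32=None, voxel_size=0.02, origin=None):
    points = np.asarray(points_f32, dtype=np.float64).reshape(-1, 3)
    colors = np.asarray(colors_u8).reshape(-1, 3)
    if points.shape[0] == 0:
        return points.astype(np.float32), colors.astype(np.uint8), np.zeros(0, dtype=np.int64)
    if origin is None:
        origin = points.min(axis=0)  # assumption: anchor the grid at the cloud minimum
    if conf_f32 is None:
        w = np.ones(points.shape[0])
    else:
        # assumption: clamp so that zero or negative confidences cannot break the average
        w = np.maximum(np.asarray(conf_f32, dtype=np.float64).reshape(-1), 1e-8)

    # Integer voxel index per point; np.unique groups identical indices exactly.
    ijk = np.floor((points - origin) / voxel_size).astype(np.int64)
    _, inverse, support = np.unique(ijk, axis=0, return_inverse=True, return_counts=True)
    inverse = inverse.reshape(-1)  # some NumPy versions return a 2-D inverse here
    n_vox = support.shape[0]
    w_sum = np.bincount(inverse, weights=w, minlength=n_vox)

    def weighted_mean(values):
        cols = [np.bincount(inverse, weights=w * values[:, k], minlength=n_vox)
                for k in range(3)]
        return np.stack(cols, axis=1) / w_sum[:, None]

    out_points = weighted_mean(points).astype(np.float32)
    out_colors = linear_to_srgb(weighted_mean(srgb_to_linear(colors)))
    return out_points, out_colors, support
```

Averaging in linear space matters for mixed voxels: an equal-weight blend of black and white lands near sRGB 188 instead of the 128 that naive `uint8` averaging would give.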
### 3.3 Helper: `o3d_outlier_filter()`

```python
def o3d_outlier_filter(points_f32, colors_u8,
                       voxel_size=0.02,
                       radius_mult=3.0,
                       nb_points=16,
                       nb_neighbors=48,
                       std_ratio=1.5):
    """Build an Open3D cloud, remove outliers, and return the filtered (points, colors)."""
```

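One possible shape for `o3d_outlier_filter()`, sketched under a few assumptions: `open3d` is imported lazily and the inputs are returned unchanged when it is unavailable (catching `OSError` as well as `ImportError` is a guess at how a broken install fails), and colors are recovered by tracking the surviving indices instead of round-tripping them through Open3D's float colors.

```python
import numpy as np


def o3d_outlier_filter(points_f32, colors_u8, voxel_size=0.02, radius_mult=3.0,
                       nb_points=16, nb_neighbors=48, std_ratio=1.5):
    points = np.asarray(points_f32, dtype=np.float32).reshape(-1, 3)
    colors = np.asarray(colors_u8, dtype=np.uint8).reshape(-1, 3)
    try:
        import open3d as o3d  # lazy import: only needed when denoising runs
    except (ImportError, OSError):
        return points, colors  # safe fallback: skip denoising
    if points.shape[0] == 0:
        return points, colors

    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(points.astype(np.float64))

    # Stage 1: radius filter. Returns the filtered cloud and the kept indices.
    pcd, kept_radius = pcd.remove_radius_outlier(
        nb_points=int(nb_points), radius=float(radius_mult * voxel_size)
    )
    kept = np.asarray(kept_radius, dtype=np.int64)
    if kept.size == 0:
        return points[:0], colors[:0]

    # Stage 2: statistical filter. Its indices refer to the stage-1 cloud,
    # so map them back to positions in the original arrays.
    _, kept_stat = pcd.remove_statistical_outlier(
        nb_neighbors=int(nb_neighbors), std_ratio=float(std_ratio)
    )
    kept = kept[np.asarray(kept_stat, dtype=np.int64)]
    return points[kept], colors[kept]
```

Because only indices are carried through, the returned colors are bit-identical to the input colors of the surviving points.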
### 3.4 Patch in `predictions_to_glb()`

Insert after confidence masking:

```python
# Optional voxel reduction (skipped when voxel_size is None or <= 0)
if voxel_size is not None and voxel_size > 0:
    vertices_3d, colors_rgb, _support = voxel_reduce(
        vertices_3d, colors_rgb, conf_f32=conf_used, voxel_size=float(voxel_size)
    )

# Optional Open3D denoising (skipped for empty clouds)
if o3d_denoise and vertices_3d.size:
    params = dict(
        voxel_size=float(voxel_size or 0.02),  # 2 cm fallback when voxel reduction is off
        radius_mult=3.0,
        nb_points=16,
        nb_neighbors=48,
        std_ratio=1.5,
    )
    if o3d_params:
        params.update(o3d_params)  # caller overrides win over the defaults
    vertices_3d, colors_rgb = o3d_outlier_filter(vertices_3d, colors_rgb, **params)
```

The rest of the GLB creation (scene scale, camera meshes, alignment) remains unchanged.


---

## 4. Default Parameters & Behavior

| Context | Setting | Recommended |
|---------|---------|-------------|
| Indoor scenes | `voxel_size=0.02` | 2 cm grid |
| Fast preview | `voxel_size=0.06` | Coarse 6 cm grid |
| Radius filter | radius = 3 × `voxel_size` | 0.06 m for 2 cm grid |
| Statistical filter | `nb_neighbors=48`, `std_ratio=1.5` | Safe defaults |
| Weighting | Confidence scores (`world_points_conf`) | Use for averages |

---

## 5. Expected Outcomes

| Metric | Before | After |
|--------|--------|-------|
| GLB file size | | ↓ 3–8× |
| Visual duplicates | High | Minimal |
| Noise/speckle | Frequent | Strongly reduced |
| Load time (r3f viewer) | Long | Near-instant |
| Fidelity | Unchanged | Preserved |

---

## 6. Validation Steps

1. Run baseline: `predictions_to_glb(preds, voxel_size=None, o3d_denoise=False)` and record export size and load time.
2. Run optimized: `predictions_to_glb(preds, voxel_size=0.02, o3d_denoise=True)` and compare GLB size, visual quality, and FPS in the viewer.
3. Stress-test:
   - High-confidence scenes with many frames.
   - Scenes with thin structures (shelves, walls).
   - Ensure no noticeable geometric bias or color shift.
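
The geometric-bias check in step 3 can be made quantitative. A sketch, assuming a brute-force nearest-neighbor search is acceptable on a subsample: every cleaned point is a centroid of input points from a single voxel, so it should lie within one voxel diagonal (`sqrt(3) × voxel_size`) of some input point, and outlier removal only deletes points, so the bound still holds afterwards. The helper names are hypothetical.

```python
import numpy as np


def max_nearest_distance(query, reference, chunk=2048):
    """Largest distance from any query point to its nearest reference point."""
    query = np.asarray(query, dtype=np.float64).reshape(-1, 3)
    reference = np.asarray(reference, dtype=np.float64).reshape(-1, 3)
    if len(query) and not len(reference):
        raise ValueError("reference cloud is empty")
    worst = 0.0
    # Process the query in chunks to bound the size of the distance matrix.
    for start in range(0, len(query), chunk):
        block = query[start:start + chunk]
        d2 = ((block[:, None, :] - reference[None, :, :]) ** 2).sum(axis=-1)
        worst = max(worst, float(np.sqrt(d2.min(axis=1)).max()))
    return worst


def within_voxel_bound(cleaned_points, original_points, voxel_size):
    """True if no cleaned point strays more than one voxel diagonal from the input."""
    bound = np.sqrt(3.0) * voxel_size
    return max_nearest_distance(cleaned_points, original_points) <= bound * (1.0 + 1e-6)
```

The memory cost grows with `chunk × len(reference)`, so for full-size clouds this is best run on a random subsample of both arrays.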

---

## 7. Future Extensions

| Feature | Description |
|---------|-------------|
| Normals averaging | Extend `voxel_reduce()` to merge normals & store as GLB attributes |
| Support weighting | Save per-voxel support count → possible LOD weighting |
| Covariance export | Optionally compute per-voxel covariance for Gaussian splats |
| Tile-based batch | Enable out-of-core fusion for huge rooms |
| Dual GLB export | Auto-save coarse (preview) + fine (full-res) versions |

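---

## 8. Example Usage

```python
scene = predictions_to_glb(
    preds,
    conf_thres=50.0,
    mask_white_bg=True,
    voxel_size=0.02,
    o3d_denoise=True,
    o3d_params={"nb_neighbors": 64, "std_ratio": 1.3},
)
# Scene.export() writes the file and infers the GLB format from the extension;
# trimesh.exchange.gltf.export_glb() returns bytes and takes no output path.
scene.export("room_clean.glb")
```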

---

## 9. Deliverables for Codex Implementation

1. New helper functions:
   - `voxel_reduce()`
   - `o3d_outlier_filter()`
2. Modified signature of `predictions_to_glb()` to include the new optional arguments.
3. Integration of both steps before `trimesh.PointCloud`.
4. Minimal dependency injection (`open3d` imported lazily; safe fail if missing).
5. Unit test / validation script:
   - Compare point counts and file sizes pre-/post-cleanup.
   - Assert geometry type remains `PointCloud`.

**Outcome:** A drop-in replacement for `predictions_to_glb()` that produces sparser, cleaner, and much smaller `.glb` point clouds with preserved visual fidelity for Stream3R → r3f workflows.