Unified foundation model for promptable segmentation
Complex text label detection using SAM3 with VLM-FO1
Generate a depth map from any photo
Pixel-perfect monocular depth estimation