Unified foundation model for promptable segmentation
Complex text label detection using SAM3 with VLM-FO1
Generate a depth map from any photo
Pixel-perfect monocular depth estimation
Detect and mark landmarks on anime faces
Identify and highlight objects in anime images