google/tipsv2-l14-dpt

Gabriele committed 6 days ago

Commit 64789f3 · 1 parent: 1241b68

Update README

Files changed (1)

README.md +2 -1

@@ -12,7 +12,7 @@ pipeline_tag: depth-estimation
 # TIPSv2 — L/14 DPT Heads
-DPT (Dense Prediction Transformer) heads for depth estimation, surface normal prediction, and semantic segmentation (ADE20K, 150 classes) on top of the [TIPSv2 L/14](https://huggingface.co/google/tipsv2-l14) backbone. The backbone is loaded automatically.
+DPT (Dense Prediction Transformer) heads for depth estimation, surface normal prediction, and semantic segmentation on top of the frozen [TIPSv2 L/14](https://huggingface.co/google/tipsv2-l14) backbone. The backbone is loaded automatically. The depth and normals heads are trained on the NYU Depth V2 dataset and segmentation is trained on the ADE20K dataset (150 classes).
 ## Usage
@@ -47,6 +47,7 @@ seg = model.predict_segmentation(pixel_values)
 - **Backbone**: [TIPSv2 L/14](google/tipsv2-l14) (loaded automatically)
 - **Heads**: ~102M total params (depth + normals + segmentation)
+- **Depth & normals**: NYU Depth V2
 - **Segmentation**: ADE20K, 150 classes
 - **Input**: images in `[0, 1]` range, any resolution (multiples of 14 recommended)
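
The **Input** line of the README asks for images in `[0, 1]` with side lengths that are preferably multiples of 14. The following is a minimal sketch of that preparation step in plain Python. The helper names `snap_to_patch_multiple` and `to_unit_range` are hypothetical and not part of the model's API, and rounding to the nearest multiple is an assumption; the README only says multiples of 14 are recommended.

```python
import math

# Patch size taken from the "L/14" in the backbone name (assumption: the
# recommended multiple of 14 corresponds to the patch size).
PATCH = 14


def snap_to_patch_multiple(height, width, patch=PATCH):
    """Round each side to the nearest multiple of `patch` (at least one patch).

    Hypothetical helper: it only computes a target size for a resize step.
    """
    def snap(n):
        # floor(x + 0.5) rounds halves up, avoiding Python's banker's rounding.
        return max(patch, math.floor(n / patch + 0.5) * patch)

    return snap(height), snap(width)


def to_unit_range(pixels_u8):
    """Map 8-bit pixel values (0..255) to floats in the [0, 1] range."""
    return [v / 255.0 for v in pixels_u8]


if __name__ == "__main__":
    # Target size for a 480x640 image, then scaling of a few sample values.
    print(snap_to_patch_multiple(480, 640))
    print(to_unit_range([0, 128, 255]))
```

The resulting size would be passed to whichever image resize routine is already in use before the tensor reaches a call such as `model.predict_segmentation(pixel_values)`.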