Add base_model: facebook/EUPE-ViT-B and link backbone in prose
README.md
CHANGED
@@ -1,5 +1,6 @@
 ---
 license: apache-2.0
+base_model: facebook/EUPE-ViT-B
 tags:
 - depth-estimation
 - frozen-backbone
@@ -16,4 +17,4 @@ A systematic study of monocular depth head architectures operating on frozen vis
 
 Standard practice treats the backbone and depth decoder as a joint system. Recent universal encoders produce spatial features of sufficient quality that the backbone can remain frozen while a lightweight head is trained on depth data. Under this regime, the head is the only variable.
 
-This repository contains an arena framework for rapid comparison of depth head candidates and a collection of architectures spanning conventional decoders through novel minimal-parameter designs. All heads consume the same spatial feature tensor and produce per-pixel depth maps. The backbone is
+This repository contains an arena framework for rapid comparison of depth head candidates and a collection of architectures spanning conventional decoders through novel minimal-parameter designs. All heads consume the same spatial feature tensor and produce per-pixel depth maps. The reference backbone is [EUPE-ViT-B](https://huggingface.co/facebook/EUPE-ViT-B) (86M parameters, frozen), but the framework is backbone-agnostic — the same heads can be evaluated against any frozen ViT that produces a stride-16 spatial feature grid.
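
The new README text says any frozen ViT with a stride-16 spatial feature grid can feed the same heads. As a rough illustration of what that constraint implies for tensor shapes, here is a minimal sketch. The helper names `feature_grid_shape` and `is_compatible` are hypothetical and not part of this repository, and the requirement that image sides be exact multiples of the stride is an assumption made for simplicity.

```python
from typing import Tuple


def feature_grid_shape(height: int, width: int, stride: int = 16) -> Tuple[int, int]:
    """Return (rows, cols) of the feature grid for an image of the given size.

    Hypothetical helper. Assumes the image sides are exact multiples of the
    stride; a real backbone might pad or crop instead of rejecting the input.
    """
    if height % stride or width % stride:
        raise ValueError(
            f"image size {height}x{width} is not a multiple of stride {stride}"
        )
    return height // stride, width // stride


def is_compatible(image_hw: Tuple[int, int], feature_hw: Tuple[int, int],
                  stride: int = 16) -> bool:
    """Check that a backbone's feature grid matches the stride the heads expect."""
    return feature_grid_shape(image_hw[0], image_hw[1], stride) == tuple(feature_hw)


# A 224x224 input yields a 14x14 grid, i.e. 196 spatial tokens per image.
rows, cols = feature_grid_shape(224, 224)
print(rows, cols, rows * cols)
```

Under these assumptions, a head producing per-pixel depth has to recover a factor of 16 in each spatial dimension between the feature grid and the output map.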