---
library_name: peft
tags:
- depth-estimation
- lora
- peft
- metric3d
- diode
- indoor
license: apache-2.0
---
# Metric3D ViT-giant2 — LoRA adapter (DIODE Indoors)
**Base model:** `metric3d_vit_giant2`, loaded via `torch.hub.load("yvanyin/metric3d", "metric3d_vit_giant2")`

LoRA fine-tuning of Metric3D v2 (`metric3d_vit_giant2`) on the DIODE indoor split (metric depth in metres).
## Training details
| Setting | Value |
|---|---|
| Base model | `metric3d_vit_giant2` (`torch.hub`, `yvanyin/metric3d`) |
| Dataset | DIODE indoor split (metric depth in metres) |
| Loss | direct metric-depth L1 + gradient loss |
| LoRA rank / alpha | 16 / 32 |
| LoRA targets | `qkv`, `proj` |
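The "L1 + gradient loss" row can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the training code: the relative weighting (`grad_weight`) and any validity masking of invalid depth pixels used in training are not specified in this card.

```python
import torch

def depth_loss(pred, target, grad_weight=0.5):
    """Direct metric-depth L1 plus a first-order gradient-matching term.

    Sketch only: `grad_weight` and the absence of a validity mask are
    assumptions. pred/target: (B, 1, H, W) depth maps in metres.
    """
    l1 = (pred - target).abs().mean()
    # Gradient loss: match horizontal and vertical depth differences
    dx = (pred[..., :, 1:] - pred[..., :, :-1]) - (target[..., :, 1:] - target[..., :, :-1])
    dy = (pred[..., 1:, :] - pred[..., :-1, :]) - (target[..., 1:, :] - target[..., :-1, :])
    grad = dx.abs().mean() + dy.abs().mean()
    return l1 + grad_weight * grad
```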
## Usage
```python
import torch
import torch.nn as nn
import torch.utils.checkpoint as torch_checkpoint
from peft import PeftModel

# Load the base model
model = torch.hub.load("yvanyin/metric3d", "metric3d_vit_giant2",
                       pretrain=True, trust_repo=True)

# Apply the same gradient-checkpointing wrapper used during training
# (needed so the PEFT key names match the saved adapter).
def enable_gradient_checkpointing(model):
    try:
        encoder = model.depth_model.encoder
    except AttributeError:
        encoder = model.base_model.model.depth_model.encoder

    class _CheckpointedBlock(nn.Module):
        def __init__(self, block):
            super().__init__()
            self.block = block

        def forward(self, x):
            return torch_checkpoint.checkpoint(self.block, x, use_reentrant=False)

    for blk_group in encoder.blocks:
        for key in list(blk_group._modules.keys()):
            blk_group._modules[key] = _CheckpointedBlock(blk_group._modules[key])

enable_gradient_checkpointing(model)
model = PeftModel.from_pretrained(model, "igzi/depth-lora-checkpoints_diode-diode_indoors")
model.eval()

# Inference: pixel_values has shape (B, 3, 616, 1064) and is normalised with
# mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375].
pred_canonical, _, _ = model({"input": pixel_values})

# De-canonicalise: pred_metric = pred_canonical * (fx_scaled / 1000)
```
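The normalisation constants and the de-canonicalisation formula from the comments above can be wrapped into small helpers. This is a hedged sketch: the aspect-preserving resize-and-pad scheme mirrors Metric3D's reference preprocessing but is an assumption here, and `preprocess` / `decanonicalise` are hypothetical helper names, not part of the model's API.

```python
import torch
import torch.nn.functional as F

MEAN = torch.tensor([123.675, 116.28, 103.53]).view(3, 1, 1)
STD = torch.tensor([58.395, 57.12, 57.375]).view(3, 1, 1)

def preprocess(rgb_u8, fx, input_size=(616, 1064)):
    """Resize (aspect-preserving) + pad an (H, W, 3) uint8 image to the
    model's canonical input, normalise it, and scale fx by the same factor."""
    h, w = rgb_u8.shape[:2]
    scale = min(input_size[0] / h, input_size[1] / w)
    nh, nw = int(round(h * scale)), int(round(w * scale))
    img = rgb_u8.permute(2, 0, 1).float().unsqueeze(0)        # (1, 3, H, W)
    img = F.interpolate(img, size=(nh, nw), mode="bilinear", align_corners=False)
    canvas = MEAN.expand(3, *input_size).clone().unsqueeze(0)  # mean-padded canvas
    top, left = (input_size[0] - nh) // 2, (input_size[1] - nw) // 2
    canvas[:, :, top:top + nh, left:left + nw] = img
    pixel_values = (canvas - MEAN) / STD
    return pixel_values, fx * scale

def decanonicalise(pred_canonical, fx_scaled, canonical_fx=1000.0):
    # Metric3D predicts depth under a canonical camera (fx = 1000 px);
    # rescale by the resized focal length to recover metres.
    return pred_canonical * (fx_scaled / canonical_fx)
```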