---
language: en
license: apache-2.0
tags:
  - depth-estimation
  - computer-vision
  - pytorch
  - absolute depth
pipeline_tag: depth-estimation
library_name: transformers
---

# Depth-CHM Model

A fine-tuned Depth Anything V2 model for depth estimation, trained on forest canopy height data.

## Model Description

This model is based on [Depth-Anything-V2-Metric-Indoor-Base](https://huggingface.co/depth-anything/Depth-Anything-V2-Metric-Indoor-Base-hf) and fine-tuned for estimating depth/canopy height from aerial imagery.

### Training Details

- **Base Model**: depth-anything/Depth-Anything-V2-Metric-Indoor-Base-hf
- **Max Depth**: 40.0 meters
- **Loss Function**: SILog (scale-invariant log) + 0.1 * L1
- **Hyperparameter Tuning**: Optuna (50 trials)
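
The combined loss above can be sketched as follows. This is an illustrative implementation, not the exact training code: the variance-focus weight `lam` in the standard SILog formulation is not documented for this model, so `0.5` (a common default) is an assumption.

```python
import torch

def silog_l1_loss(pred, target, lam=0.5, eps=1e-6):
    """SILog (scale-invariant log) loss plus 0.1 * L1, as used in training.

    lam is the variance-focus weight of SILog; 0.5 is a common default
    and an assumption here -- the value used in training is not documented.
    """
    d = torch.log(pred + eps) - torch.log(target + eps)
    silog = torch.sqrt((d ** 2).mean() - lam * d.mean() ** 2 + eps)
    l1 = (pred - target).abs().mean()
    return silog + 0.1 * l1
```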

## Installation

```bash
pip install transformers torch pillow numpy
```

## Usage

### Method 1: Using Pipeline (Recommended)

The simplest way to use the model:

```python
from transformers import pipeline
from PIL import Image
import numpy as np

# Load pipeline
pipe = pipeline(task="depth-estimation", model="Boxiang/depth_chm")

# Load image
image = Image.open("your_image.png").convert("RGB")

# Run inference
result = pipe(image)
depth_image = result["depth"]  # PIL Image (normalized 0-255)

# Convert to numpy array and scale to actual depth (0-40m)
max_depth = 40.0
depth = np.array(depth_image).astype(np.float32) / 255.0 * max_depth

print(f"Depth shape: {depth.shape}")
print(f"Depth range: [{depth.min():.2f}, {depth.max():.2f}] meters")
```
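
If you need to persist the result, it can help to keep the exact float values alongside an 8-bit preview. A minimal sketch (the helper name and file paths are placeholders, not part of this repository):

```python
import numpy as np
from PIL import Image

def save_depth(depth, npy_path, png_path, max_depth=40.0):
    """Save metric depth losslessly (.npy) plus an 8-bit preview PNG."""
    np.save(npy_path, depth.astype(np.float32))  # exact values in meters
    preview = np.clip(depth / max_depth, 0.0, 1.0) * 255.0
    Image.fromarray(preview.astype(np.uint8)).save(png_path)
```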

### Method 2: Using AutoImageProcessor + Model

For more control over the inference process:

```python
import torch
import torch.nn.functional as F
from transformers import AutoImageProcessor, DepthAnythingForDepthEstimation
from PIL import Image
import numpy as np

# Configuration
model_id = "Boxiang/depth_chm"
max_depth = 40.0

# Load model and processor
processor = AutoImageProcessor.from_pretrained(model_id)
model = DepthAnythingForDepthEstimation.from_pretrained(model_id)

# Use GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
model.eval()

# Load and process image
image = Image.open("your_image.png").convert("RGB")
original_size = image.size  # (width, height)

# Prepare input
inputs = processor(images=image, return_tensors="pt")
pixel_values = inputs["pixel_values"].to(device)

# Run inference
with torch.no_grad():
    outputs = model(pixel_values)
    predicted_depth = outputs.predicted_depth

    # Scale by max_depth
    pred_scaled = predicted_depth * max_depth

    # Resize to original image size (interpolate expects a 4D NCHW tensor)
    depth = F.interpolate(
        pred_scaled.unsqueeze(1),  # (batch, 1, H, W)
        size=(original_size[1], original_size[0]),  # (height, width)
        mode="bilinear",
        align_corners=True
    ).squeeze().cpu().numpy()

print(f"Depth shape: {depth.shape}")
print(f"Depth range: [{depth.min():.2f}, {depth.max():.2f}] meters")
```
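
The image processor accepts a list of images, so several tiles can be run in one forward pass. A sketch under the assumption that all tiles share the same size (the helper name is illustrative, not part of this repository):

```python
import torch

def predict_batch(images, processor, model, device, max_depth=40.0):
    """Run depth estimation on a list of PIL images in one forward pass."""
    inputs = processor(images=images, return_tensors="pt")
    with torch.no_grad():
        outputs = model(inputs["pixel_values"].to(device))
    # (batch, H, W) at the model's working resolution, scaled to meters
    return outputs.predicted_depth * max_depth
```

Each slice of the returned tensor can then be resized back to its tile's original size as shown above.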

### Method 3: Local Model Path

If you have the model saved locally:

```python
from transformers import AutoImageProcessor, DepthAnythingForDepthEstimation

# Load from local path
model_path = "./depth_chm_trained"
processor = AutoImageProcessor.from_pretrained(model_path, local_files_only=True)
model = DepthAnythingForDepthEstimation.from_pretrained(model_path, local_files_only=True)
```

## Output Format

- **Pipeline output**: Returns a PIL Image with normalized depth values (0-255). Multiply by `max_depth / 255.0` to get actual depth in meters.
- **Model output**: Returns `predicted_depth` tensor with values in range [0, 1]. Multiply by `max_depth` (40.0) to get actual depth in meters.

## Depth vs Height Conversion

The model outputs **depth** (distance from the sensor). To convert to **height** above ground, as in a Canopy Height Model (CHM):

```python
height = max_depth - depth
```
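
Combining this with the pipeline scaling from Method 1, a full conversion from the normalized pipeline output to canopy height might look like the following sketch (the function name is illustrative):

```python
import numpy as np

def depth_to_canopy_height(depth_image, max_depth=40.0):
    """Convert the pipeline's 0-255 depth preview to canopy height in meters."""
    depth = np.asarray(depth_image, dtype=np.float32) / 255.0 * max_depth
    # Taller canopy is closer to the sensor, i.e. smaller depth
    return max_depth - depth
```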

## Model Files

- `model.safetensors` - Model weights
- `config.json` - Model configuration
- `preprocessor_config.json` - Image processor configuration
- `training_info.json` - Training hyperparameters

## Citation

If you use this model, please cite:

```bibtex
@misc{depth_chm_2024,
  title={Depth-CHM: Fine-tuned Depth Anything V2 for Canopy Height Estimation},
  author={Boxiang},
  year={2024},
  url={https://huggingface.co/Boxiang/depth_chm}
}
```

## License

This model inherits the license from the base Depth Anything V2 model.