---
license: cc-by-4.0
tags:
- pytorch
- computer-vision
- remote-sensing
- mars
- dem-prediction
- u-net
- multi-task-learning
datasets:
- ESA-Datalabs/MCTED
---
# MarsDEMNet
MarsDEMNet is a comparative study of single-image Digital Elevation Model (DEM) prediction from Mars CTX satellite imagery. Four approaches are evaluated: a classical Random Forest baseline, a single-output U-Net, a multi-output U-Net trained with multi-task learning, and an encoder-depth ablation. All are trained on the MCTED dataset of 80,898 paired CTX orthoimage and DEM patches.
## Model Details
### Model Description
MarsDEMNet addresses a fundamental coverage asymmetry on Mars: while the CTX instrument has photographed ~99.5% of the Martian surface at 5–6 m/pixel, high-resolution stereo DEMs exist for only ~0.5–1% of that coverage. Models trained on MCTED learn to predict dense relative-elevation maps from single optical images, which could extend approximate DEM coverage to nearly the entire planet.
- **Model type:** Convolutional encoder-decoder (U-Net)
- **License:** CC-BY 4.0
- **Finetuned from:** Trained from scratch — no pretrained weights
### Model Sources
- **Repository:** https://github.com/harshithkethavath/MarsDEMNet
- **Dataset:** https://huggingface.co/datasets/ESA-Datalabs/MCTED
## Checkpoints
Four model checkpoints are provided:
| File | Architecture | Val RMSE | Val MAE | Delta-1 |
|---|---|---|---|---|
| `marsdemnet-unet-elevation-4block.pt` | Single-output U-Net, 4-block encoder, 7.8M params | 74.38m | 52.86m | 0.418 |
| `marsdemnet-unet-multitask-4block.pt` | Multi-output U-Net, 4-block encoder, 7.8M params | 74.29m | 52.68m | 0.422 |
| `marsdemnet-unet-multitask-3block.pt` | Multi-output U-Net, 3-block encoder, 1.9M params | 82.80m | 58.29m | 0.440 |
| `marsdemnet-unet-multitask-5block.pt` | Multi-output U-Net, 5-block encoder, 31.4M params | 59.88m | 42.67m | 0.409 |
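The 5-block multi-output model has the lowest error: its RMSE of 59.88 m is about 19% below that of the 4-block models (roughly 74.3 m), with no overfitting observed. Its Delta-1 (0.409) is the lowest of the four checkpoints, however, so it does not lead on every metric.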
## How to Get Started
```python
import torch
# Assumes the MarsDEMNet repository is on the import path
from scripts.deeplearning.unet import UNet

# Single-output (elevation only), 4-block
model = UNet(in_channels=1, out_channels=1, num_blocks=4, base_ch=32)
ckpt = torch.load("marsdemnet-unet-elevation-4block.pt", map_location="cpu")
model.load_state_dict(ckpt["model_state"])
model.eval()

# Multi-output (elevation + slope + roughness), 5-block (best RMSE)
model = UNet(in_channels=1, out_channels=3, num_blocks=5, base_ch=32)
ckpt = torch.load("marsdemnet-unet-multitask-5block.pt", map_location="cpu")
model.load_state_dict(ckpt["model_state"])
model.eval()

# Placeholder input; replace with a normalized CTX patch of shape (1, 1, 518, 518)
optical = torch.randn(1, 1, 518, 518)

# Inference
with torch.no_grad():
    pred = model(optical)
# Single-output: pred shape (1, 1, 518, 518), elevation
# Multi-output: pred shape (1, 3, 518, 518), [elevation, slope, roughness]
```
Input normalization: clip to 2nd–98th percentile, then z-score per patch. DEM targets are mean-subtracted per patch (relative elevation in meters).
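The input normalization described above can be sketched in NumPy as follows. This is a minimal illustration, not the repository's code: the function name `normalize_ctx`, the `eps` guard for flat patches, and the random stand-in patch are all assumptions.

```python
import numpy as np

def normalize_ctx(patch, lo=2.0, hi=98.0, eps=1e-6):
    """Clip a CTX patch to its [lo, hi] percentiles, then z-score it per patch.

    Hypothetical helper: `eps` only avoids division by zero on flat patches.
    """
    patch = np.asarray(patch, dtype=np.float64)
    p_lo, p_hi = np.percentile(patch, [lo, hi])
    clipped = np.clip(patch, p_lo, p_hi)
    normalized = (clipped - clipped.mean()) / (clipped.std() + eps)
    return normalized.astype(np.float32)

# Stand-in for a real 518x518 CTX patch (random values, for illustration only)
raw = np.random.default_rng(0).integers(0, 256, size=(518, 518))

# Add batch and channel axes to get the (1, 1, 518, 518) layout the model expects
optical = normalize_ctx(raw)[None, None]
```

The resulting array can be handed to PyTorch with `torch.from_numpy(optical)` before calling the model.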
## Training Details
### Training Data
MCTED (Mars CTX Terrain-Elevation Dataset) contains 80,898 paired CTX orthoimage and DEM patches derived from 1,122 quality-filtered stereo scenes. The train/val split is geography-aware and made at the scene level to prevent spatial leakage, giving 65,090 training patches and 15,808 validation patches.
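A scene-level split keeps every patch from a given stereo scene on the same side of the train/val boundary. The sketch below shows one simple way to do that. The `scene_id` field, the `val_fraction` default, and the random assignment of scenes are assumptions for illustration, and the actual geography-aware procedure may use additional spatial criteria.

```python
import random
from collections import defaultdict

def split_by_scene(patches, val_fraction=0.2, seed=0):
    """Assign whole scenes to train or val so no scene contributes to both.

    `patches` is a list of dicts with a hypothetical "scene_id" key.
    """
    by_scene = defaultdict(list)
    for patch in patches:
        by_scene[patch["scene_id"]].append(patch)

    # Shuffle scene ids deterministically, then carve off the validation scenes
    scenes = sorted(by_scene)
    random.Random(seed).shuffle(scenes)
    n_val = max(1, round(len(scenes) * val_fraction))

    val = [p for s in scenes[:n_val] for p in by_scene[s]]
    train = [p for s in scenes[n_val:] for p in by_scene[s]]
    return train, val
```

Splitting at the patch level instead would let neighbouring patches from one scene land in both sets, which tends to inflate validation scores.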
### Training Procedure
- **Optimizer:** AdamW, lr=1e-4, weight_decay=1e-4
- **Schedule:** Cosine annealing to 1e-6 over 50 epochs
- **Early stopping:** Patience 10 on val RMSE
- **Batch size:** 16
- **Augmentation:** Random horizontal/vertical flips and 90° rotations applied jointly to image and labels
- **Loss:** Masked MAE (single-output); weighted sum of masked MAE losses (multi-output, uniform 1:1:1 weights)
- **Training regime:** fp32
- **Hardware:** NVIDIA H100 GPU
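
The loss and optimizer settings listed above could be wired up roughly as follows. This is a sketch under assumptions, not the training script: the function names, the use of one shared validity mask for all three outputs, the stand-in model, and stepping the scheduler once per epoch are not stated in this card.

```python
import torch
import torch.nn as nn

def masked_mae(pred, target, valid):
    """Mean absolute error over pixels where `valid` (bool tensor) is True."""
    valid = valid.expand_as(pred)
    if not valid.any():
        # No valid pixels: return a zero that still participates in autograd
        return pred.sum() * 0.0
    # Boolean indexing drops invalid pixels, so NaN targets never enter the loss
    return (pred - target)[valid].abs().mean()

def multitask_loss(pred, target, valid, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of per-channel masked MAE (elevation, slope, roughness)."""
    total = pred.sum() * 0.0
    for c, w in enumerate(weights):
        total = total + w * masked_mae(pred[:, c:c + 1], target[:, c:c + 1], valid)
    return total

# Stand-in model for illustration; the real model is the repository's UNet
model = nn.Conv2d(1, 3, kernel_size=3, padding=1)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-4)
# Assumed usage: call scheduler.step() once per epoch, for 50 epochs
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=50, eta_min=1e-6
)
```

Indexing with the mask, rather than multiplying by it, keeps NaN values in the DEM from propagating into the loss or its gradients.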
### Preprocessing
- CTX patches: percentile clip (2nd–98th) + per-patch z-score normalization
- DEM patches: per-patch mean subtraction (relative elevation)
- Validity masking: logical AND of NaN mask and deviation mask; invalid pixels excluded from loss and metrics
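
The DEM target preparation and validity masking can be sketched like this. The helper name `prepare_dem` is hypothetical, the deviation mask is taken as a given boolean array (its definition is not stated here), and computing the mean over valid pixels only is an assumption.

```python
import numpy as np

def prepare_dem(dem, deviation_mask):
    """Return (relative_elevation, valid_mask) for one DEM patch.

    valid = finite DEM value AND deviation mask is True. Invalid pixels are
    set to 0.0 in the output and should be ignored via the mask.
    """
    dem = np.asarray(dem, dtype=np.float64)
    valid = np.isfinite(dem) & np.asarray(deviation_mask, dtype=bool)
    if not valid.any():
        return np.zeros_like(dem), valid
    # Assumption: the per-patch mean is taken over valid pixels only
    relative = dem - dem[valid].mean()
    return np.where(valid, relative, 0.0), valid
```

Returning the mask alongside the target lets the same array drive both the loss and the evaluation metrics.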
## Evaluation
### Metrics
- **MAE** — mean absolute elevation error in meters
- **RMSE** — primary ranking metric; penalizes large errors
- **Delta-1** — fraction of valid pixels where max(pred/gt, gt/pred) < 1.25
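
The three metrics can be computed over valid pixels as in the sketch below. The function name is hypothetical. Because targets are relative elevations that can be zero or negative, the ratio in Delta-1 is not always well defined. This sketch counts pixels with a zero value or opposite signs as misses, which is an assumption and may differ from the evaluation code.

```python
import numpy as np

def dem_metrics(pred, gt, valid):
    """MAE, RMSE and Delta-1 over pixels where `valid` is True."""
    p = np.asarray(pred, dtype=np.float64)[valid]
    g = np.asarray(gt, dtype=np.float64)[valid]
    if p.size == 0:
        return {"mae": np.nan, "rmse": np.nan, "delta1": np.nan}

    err = p - g
    mae = np.abs(err).mean()
    rmse = np.sqrt((err ** 2).mean())

    # Ratio test only where both values are non-zero and share a sign;
    # everything else keeps ratio = inf and therefore counts as a miss.
    same_sign = p * g > 0
    ratio = np.full(p.shape, np.inf)
    ratio[same_sign] = np.maximum(p[same_sign] / g[same_sign],
                                  g[same_sign] / p[same_sign])
    delta1 = (ratio < 1.25).mean()

    return {"mae": mae, "rmse": rmse, "delta1": delta1}
```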
### Results
| Model | Params | Val RMSE | Val MAE | Delta-1 |
|---|---|---|---|---|
| Random Forest (classical baseline) | — | 58.39m (elev std) | 41.29m | — |
| Single-output U-Net (4-block) | 7.8M | 74.38m | 52.86m | 0.418 |
| Multi-output U-Net uniform (4-block) | 7.8M | 74.29m | 52.68m | 0.422 |
| Multi-output U-Net (3-block ablation) | 1.9M | 82.80m | 58.29m | 0.440 |
| Multi-output U-Net (5-block ablation) | 31.4M | **59.88m** | **42.67m** | 0.409 |
## Bias, Risks, and Limitations
- Models are trained on regions of Mars where stereo DEMs exist, which are geographically biased toward scientifically interesting terrain. Performance on flat, featureless plains may be lower.
- Textureless terrain with no illumination gradient provides no depth cue, a known failure mode.
- Predictions are relative elevation (mean-subtracted per patch), not absolute MOLA-referenced altitude.
- Not suitable for safety-critical mission planning without further validation.
## Technical Specifications
### Model Architecture
U-Net encoder-decoder with configurable depth. Each encoder block applies (Conv2d 3×3 → BatchNorm → ReLU) twice, followed by MaxPool. The decoder uses bilinear upsampling with lateral skip connections. The multi-output variant has three separate 1×1 conv heads for elevation, slope, and roughness.
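The building blocks described above might look like the following in PyTorch. This is an illustrative sketch rather than the repository's `UNet` class: the class names, `padding=1`, the 2×2 pooling window, and resizing the upsampled tensor to the skip tensor's size are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DoubleConv(nn.Module):
    """(Conv2d 3x3 -> BatchNorm -> ReLU) applied twice."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)

class EncoderBlock(nn.Module):
    """Double conv followed by max pooling; also returns the skip tensor."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = DoubleConv(in_ch, out_ch)
        self.pool = nn.MaxPool2d(2)

    def forward(self, x):
        skip = self.conv(x)
        return self.pool(skip), skip

class DecoderBlock(nn.Module):
    """Bilinear upsampling to the skip's size, concatenation, double conv."""
    def __init__(self, in_ch, skip_ch, out_ch):
        super().__init__()
        self.conv = DoubleConv(in_ch + skip_ch, out_ch)

    def forward(self, x, skip):
        # Matching the skip's spatial size handles odd dimensions after pooling
        x = F.interpolate(x, size=skip.shape[-2:], mode="bilinear",
                          align_corners=False)
        return self.conv(torch.cat([x, skip], dim=1))

class MultiTaskHeads(nn.Module):
    """Three separate 1x1 conv heads: elevation, slope, roughness."""
    def __init__(self, in_ch):
        super().__init__()
        self.heads = nn.ModuleList(nn.Conv2d(in_ch, 1, kernel_size=1)
                                   for _ in range(3))

    def forward(self, x):
        return torch.cat([head(x) for head in self.heads], dim=1)
```

Upsampling to the skip tensor's exact size is one way to cope with inputs such as 518×518, whose side length stops being even after a single pooling step.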
## Citation
If you use MarsDEMNet, please cite:
```bibtex
@misc{marsdemnet2026,
title = {MarsDEMNet: Classical and Deep Learning Approaches for Single-Image Digital Elevation Model Prediction from Mars CTX Imagery},
author = {Harshith Kethavath},
year = {2026},
publisher = {GitHub},
url = {https://github.com/harshithkethavath/MarsDEMNet}
}
```