---
license: cc-by-nc-sa-4.0
language:
- en
tags:
- medical-imaging
- cephalometric
- landmark-detection
- orthodontics
- heatmap-regression
- spatial-priors
- onnx
library_name: onnxruntime
pipeline_tag: image-segmentation
datasets:
- custom
metrics:
- mre
- sdr
model-index:
- name: CephTrace v4
  results:
  - task:
      type: landmark-detection
      name: Cephalometric Landmark Detection
    dataset:
      type: custom
      name: Aggregated (ISBI 2015 + Aariz/CEPHA29 + DentalCepha)
      config: 25-landmark
      split: test
    metrics:
    - type: mean-radial-error
      value: 1.050
      name: MRE (mm)
    - type: sdr-2mm
      value: 87.8
      name: SDR@2mm (%)
---
# CephTrace v4: Anatomy-Guided Cephalometric Landmark Detection
**1.050 mm MRE across 25 landmarks** on a 151-image held-out test set, using image-adaptive spatial priors generated by anatomical analysis of each radiograph.
## Model Description
CephTrace v4 is a two-stage pipeline for automatic cephalometric landmark detection from lateral skull radiographs:
- **Stage 0 (Anatomical Initialization):** A multi-phase module that detects the soft-tissue profile, partitions the image into anatomical zones, extracts bony contours, derives anchor landmarks via geometric rules, and generates 25 per-landmark Gaussian attention maps, all adapted to each patient's individual anatomy.
- **Stage 1 (Heatmap Regression):** An HRNet-W32 backbone (32M params) that accepts the 28-channel input (3 RGB + 25 attention maps) and outputs 25 landmark heatmaps at 256×256 resolution.
The key innovation is that the attention priors are **image-adaptive**: each patient receives maps centered at *their* estimated anatomy, not fixed population-average positions. Controlled experiments show this reduces MRE by 30.9% compared to the same architecture without priors.
## ONNX Models
All models are exported as ONNX (opset 14) for cross-platform inference.
| File | Stage | Purpose | Size | Input | Output |
|------|-------|---------|------|-------|--------|
| `v4_stage0_profile.onnx` | 0A | Soft-tissue profile segmentation | 26.8 MB | `(1,1,512,512)` float32 | `(1,1,512,512)` sigmoid mask |
| `z1_cranial_base_contours.onnx` | 0C | Cranial base contour segmentation | 26.8 MB | `(1,1,256,256)` float32 | `(1,1,256,256)` logits |
| `z2_midface_contours.onnx` | 0C | Midface contour segmentation (palatal + upper incisor) | 26.8 MB | `(1,1,256,256)` float32 | `(1,2,256,256)` logits |
| `z3_mandible_contours.onnx` | 0C | Mandible contour segmentation (border + symphysis + lower incisor) | 26.8 MB | `(1,1,256,256)` float32 | `(1,3,256,256)` logits |
| `z4_posterior_contours.onnx` | 0C | Posterior contour segmentation (mandible + cranial base) | 26.8 MB | `(1,1,256,256)` float32 | `(1,2,256,256)` logits |
| `phase0e_model.onnx` | 0E | Anchor → derived landmark MLP | 455 KB | `(1,14)` float32 | `(1,36)` float32 |
| `v4_stage1.onnx` | 1 | HRNet-W32 heatmap regression | 130 MB | `(1,28,512,512)` float32 | `(1,25,256,256)` float32 |
**Total: 264 MB**
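The Phase 0E shapes in the table are consistent with 7 anchor landmarks flattened to 14 values and 18 derived landmarks returned as 36 values. The sketch below shows one way that packing could look. The interleaved `(x0, y0, x1, y1, ...)` order and the helper names `pack_anchors` and `unpack_derived` are assumptions for illustration; this card does not specify the layout or any coordinate normalization.

```python
import numpy as np

def pack_anchors(anchors_xy):
    """Flatten 7 (x, y) anchors into a (1, 14) float32 row.

    Assumed layout: x0, y0, x1, y1, ... (not confirmed by the model card).
    """
    arr = np.asarray(anchors_xy, dtype=np.float32)
    if arr.shape != (7, 2):
        raise ValueError(f"expected 7 (x, y) anchors, got shape {arr.shape}")
    return arr.reshape(1, 14)

def unpack_derived(mlp_output):
    """Reshape a (1, 36) MLP output into 18 (x, y) rows (same assumed layout)."""
    out = np.asarray(mlp_output, dtype=np.float32)
    if out.shape != (1, 36):
        raise ValueError(f"expected shape (1, 36), got {out.shape}")
    return out.reshape(18, 2)
```

The packed row has the shape and dtype listed for the `phase0e_model.onnx` input, so it could be passed to an ONNX Runtime session once the real layout is confirmed against the Stage 0 code.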
## Pipeline Flow
```
Lateral Cephalogram (any resolution)
    │
    ▼  resize to 512×512
Phase 0A ──► Soft-tissue profile mask (Dice 0.80)
    │
    ▼
Phase 0B ──► 5 anatomical zones + 6 soft-tissue landmarks (geometric rules)
    │
    ▼  per-zone CLAHE enhancement
Phase 0C ──► Bony contour masks (4 zone-specific U-Nets)
    │
    ▼  Douglas-Peucker simplification
Phase 0D ──► 7 anchor landmarks (0.11 mm MRE, topological rules)
    │
    ▼
Phase 0E ──► 18 derived landmarks (MLP, 114K params)
             + 25 Gaussian attention maps (256×256, 3-tier σ)
    │
    ▼  bilinear upsample to 512, concat with RGB → 28 channels
Stage 1  ──► 25 heatmaps (256×256) → peak decode → 25 landmarks
```
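The flow above names Douglas-Peucker simplification between Phases 0C and 0D. The following is a generic, pure-Python sketch of that algorithm, not the CephTrace implementation. The function names and the `epsilon` tolerance are illustrative assumptions, and this variant measures distance to the chord segment rather than to the infinite line.

```python
import math

def _point_segment_distance(p, a, b):
    """Euclidean distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto a-b, clamped to stay on the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def douglas_peucker(points, epsilon):
    """Simplify an open polyline, keeping points farther than epsilon from the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    max_dist, max_idx = 0.0, 0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], first, last)
        if d > max_dist:
            max_dist, max_idx = d, i
    if max_dist <= epsilon:
        return [first, last]
    # Recurse on both halves; drop the duplicated split point when joining.
    left = douglas_peucker(points[: max_idx + 1], epsilon)
    right = douglas_peucker(points[max_idx:], epsilon)
    return left[:-1] + right
```

A larger `epsilon` removes more points, which trades contour fidelity for a simpler polyline on which to apply geometric rules.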
**Inference time:** ~410 ms total (Stage 0: ~40 ms, Stage 1: ~350 ms) on A100 GPU.
## Landmark Set (25 landmarks, CANONICAL_25 order)
```
0: S (Sella) 1: N (Nasion) 2: Or (Orbitale)
3: Po (Porion) 4: ANS 5: PNS
6: A (Subspinale) 7: B (Supramentale) 8: Pog (Pogonion)
9: Gn (Gnathion) 10: Me (Menton) 11: Go (Gonion)
12: Ar (Articulare) 13: Co (Condylion) 14: U1_tip
15: U1_root 16: L1_tip 17: L1_root
18: UL (Upper Lip) 19: LL (Lower Lip) 20: Pm (Pterygomaxillare)
21: Ba (Basion) 22: Pog_soft 23: Sn (Subnasale)
24: Prn (Pronasale)
```
## Performance
### Controlled Ablation (151-image held-out test set)
| Configuration | Input | MRE (mm) | SDR@2mm |
|---|---|---|---|
| HRNet backbone (no priors) | 3-ch | 1.520 | 86.6% |
| **HRNet + Phase 0E priors** | **28-ch** | **1.050** | **87.8%** |
| **Improvement** | | **−0.470 (−30.9%)** | **+1.2 pp** |
Same 1,201 training images, architecture, and recipe. Only variable: prior channels.
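For readers unfamiliar with the two metrics, MRE is the mean Euclidean distance between predicted and reference landmarks in millimetres, and SDR@2mm is the percentage of landmarks whose error is at most 2 mm. A minimal NumPy sketch of those standard definitions follows. The function names and the `pixel_spacing_mm` parameter are hypothetical, and the sketch assumes a single isotropic pixel spacing, which may not match the evaluation code used for this card.

```python
import numpy as np

def radial_errors_mm(pred_px, true_px, pixel_spacing_mm):
    """Per-landmark Euclidean error in mm for (..., 2) arrays of pixel coordinates."""
    pred = np.asarray(pred_px, dtype=np.float64)
    true = np.asarray(true_px, dtype=np.float64)
    return np.linalg.norm(pred - true, axis=-1) * pixel_spacing_mm

def mre_and_sdr(pred_px, true_px, pixel_spacing_mm, threshold_mm=2.0):
    """Return (MRE in mm, SDR in percent at the given threshold)."""
    errors = radial_errors_mm(pred_px, true_px, pixel_spacing_mm)
    mre = float(errors.mean())
    sdr = float((errors <= threshold_mm).mean() * 100.0)
    return mre, sdr
```

With these definitions, the relative MRE change in the table is (1.520 − 1.050) / 1.520 ≈ 30.9%.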
### Prior Ablation
| Configuration | MRE (mm) | vs. No Priors |
|---|---|---|
| Random priors (shuffled channels) | 2.240 | +15.6% worse |
| No priors (baseline) | 1.938 | n/a |
| Fixed textbook priors | 1.869 | −3.6% (marginal) |
| **Image-adaptive priors (Phase 0E)** | **1.043** | **−46.2%** |
### Attention Map Confidence Tiers
| Tier | σ (at 256×256) | Landmarks | Mean Improvement |
|---|---|---|---|
| High | 5–7 | S, N, Me, ANS, Prn, Sn | −0.74 mm |
| Medium | 8–13 | Go, Gn, Pog, Or, UL, LL, Pog', A | −0.44 mm |
| Low | 18–22 | Po, Co, B, PNS, U1r, L1r, Ba, Pm | −0.17 mm |
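As a rough illustration of how a per-landmark Gaussian attention map with a tier-dependent σ could be rendered on the 256×256 grid, a sketch follows. The unnormalized Gaussian with peak value 1 and the function name `gaussian_attention_map` are assumptions; the card does not state the exact formula used by Phase 0E.

```python
import numpy as np

def gaussian_attention_map(center_xy, sigma, size=256):
    """Render a (size, size) float32 Gaussian with peak 1.0 at center_xy = (x, y)."""
    cx, cy = center_xy
    ys, xs = np.mgrid[0:size, 0:size]  # row (y) and column (x) index grids
    sq_dist = (xs - cx) ** 2 + (ys - cy) ** 2
    return np.exp(-sq_dist / (2.0 * sigma ** 2)).astype(np.float32)

# Same estimated position, two confidence tiers (sigma values taken from the table ranges).
high_conf = gaussian_attention_map((128, 100), sigma=6)
low_conf = gaussian_attention_map((128, 100), sigma=22)
```

A smaller σ concentrates the prior tightly around the estimate, while a larger σ leaves the heatmap network more freedom when the estimate is less reliable.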
### Clinical Reliability
- Vertical skeletal classification (FMA): Cohen's κ = 0.78 (substantial agreement)
- 20/25 landmarks improve with priors; 1 degrades (Basion, lowest confidence tier)
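Cohen's κ compares the observed agreement between two sets of labels with the agreement expected by chance. The pure-Python sketch below implements the standard formula. The function name and the example class labels are made up for illustration and are not the data behind the value reported above.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: product of marginal frequencies, summed over classes.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical class
    return (observed - expected) / (1.0 - expected)

# Hypothetical FMA classes from reference landmarks vs. predicted landmarks.
reference = ["low", "normal", "high", "normal", "low", "high"]
predicted = ["low", "normal", "high", "high", "low", "high"]
kappa = cohens_kappa(reference, predicted)
```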
## Usage
```python
import cv2
import numpy as np
import onnxruntime as ort

# Load the Stage 1 model
sess = ort.InferenceSession("v4_stage1.onnx")

# Prepare input (28 channels: 3 RGB + 25 attention maps from Stage 0)
image = cv2.imread("cephalogram.jpg")  # OpenCV loads images in BGR order
if image is None:
    raise FileNotFoundError("Could not read cephalogram.jpg")
image_512 = cv2.resize(image, (512, 512))
# Convert to RGB to match the documented input (a no-op for grayscale radiographs)
rgb = cv2.cvtColor(image_512, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0  # (512, 512, 3)
rgb = np.transpose(rgb, (2, 0, 1))  # (3, 512, 512)

# attention_maps: float32 array of shape (25, 512, 512) from the Stage 0 pipeline.
# It must be defined before this point (see the Stage 0 inference code for generating it).
input_28ch = np.concatenate([rgb, attention_maps], axis=0)  # (28, 512, 512)
input_tensor = input_28ch[np.newaxis].astype(np.float32)  # (1, 28, 512, 512)

# Run inference
input_name = sess.get_inputs()[0].name
heatmaps = sess.run(None, {input_name: input_tensor})[0]  # (1, 25, 256, 256)

# Decode landmarks from heatmap peaks
landmarks = []
for i in range(25):
    hm = heatmaps[0, i]
    y, x = np.unravel_index(np.argmax(hm), hm.shape)
    # Scale from heatmap (256) to resized-image (512) coordinates
    landmarks.append((int(x) * 2, int(y) * 2))
```
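The usage snippet above returns landmarks in the 512×512 resized frame. To report them on the original radiograph, they need to be scaled back. The sketch below assumes the image was resized directly to 512×512 without padding or cropping, as in that snippet; `to_original_coords` is a hypothetical helper, not part of the released code.

```python
def to_original_coords(landmarks_512, original_width, original_height, model_size=512):
    """Map (x, y) points from the model_size x model_size frame to the original image."""
    scale_x = original_width / model_size
    scale_y = original_height / model_size
    return [(x * scale_x, y * scale_y) for x, y in landmarks_512]

# Example with an arbitrary original resolution of 1935 x 2400 pixels.
points = to_original_coords([(256, 128), (0, 512)], 1935, 2400)
```

Because the resize is not aspect-preserving, the x and y scale factors generally differ, so any distance in millimetres should be computed after this mapping using the original pixel spacing.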
## Training Data
Aggregated from three public sources (1,502 total images):
| Source | Images | Landmarks | Scanner(s) |
|---|---|---|---|
| [ISBI 2015](https://www-o.ntust.edu.tw/~cweiwang/ISBI2015/challenge1/) | 400 | 19 | Soredex CRANEX |
| [Aariz/CEPHA29](https://doi.org/10.1038/s41597-025-05542-3) | 1,000 | 29 | 7+ device types |
| DentalCepha | 102 | 19 | Mixed |
Split: 1,201 train / 150 validation / 151 test (stratified by source, seed=42).
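A source-stratified split of this kind can be sketched as follows. This is an illustration only: the 80/10/10 fractions, the rounding, the use of Python's `random` module, and the name `stratified_split` are all assumptions, so it will not reproduce the exact 1,201 / 150 / 151 partition.

```python
import random
from collections import defaultdict

def stratified_split(items, fractions=(0.8, 0.1, 0.1), seed=42):
    """Split (image_id, source) pairs into train/val/test lists, per source.

    fractions gives the train and val shares; the remainder goes to test.
    """
    by_source = defaultdict(list)
    for image_id, source in items:
        by_source[source].append(image_id)
    rng = random.Random(seed)
    train, val, test = [], [], []
    for source in sorted(by_source):  # fixed order keeps the split reproducible
        ids = sorted(by_source[source])
        rng.shuffle(ids)
        n_train = round(len(ids) * fractions[0])
        n_val = round(len(ids) * fractions[1])
        train += ids[:n_train]
        val += ids[n_train:n_train + n_val]
        test += ids[n_train + n_val:]
    return train, val, test

# Placeholder IDs with the per-source image counts from the table above.
counts = {"ISBI2015": 400, "Aariz": 1000, "DentalCepha": 102}
items = [(f"{src}_{i:04d}", src) for src, n in counts.items() for i in range(n)]
train_ids, val_ids, test_ids = stratified_split(items)
```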
## Citation
```bibtex
@article{mohapatra2025cephtrace,
title={CephTrace: Anatomy-Guided Spatial Attention Priors for
Sub-Millimeter Cephalometric Landmark Detection},
author={Mohapatra, Sidhartha and Mohanty, Pallavi},
journal={arXiv preprint arXiv:2605.03358},
year={2025},
url={https://arxiv.org/abs/2605.03358}
}
```
## Links
| Resource | URL |
|---|---|
| **Paper** | [arXiv:2605.03358](https://arxiv.org/abs/2605.03358) |
| **Code** | [github.com/sidwiz/cephtrace-research](https://github.com/sidwiz/cephtrace-research) |
| **Data & Weights** | [Zenodo DOI 10.5281/zenodo.20032162](https://doi.org/10.5281/zenodo.20032162) |
| **Website** | [cephtrace.com](https://cephtrace.com) |
## License
This work is licensed under [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/). Commercial use requires a separate license; contact research@cephtrace.com.
Three U.S. provisional patent applications are pending (#64/037,246; #64/037,252; #64/039,042).
## Limitations
- Trained on 2D lateral cephalograms only; not validated on 3D CBCT or PA cephalograms.
- Phase 0A requires visible soft-tissue profile; severely overexposed or cropped images may degrade.
- Basion (Ba) accuracy degrades slightly with priors due to low Phase 0E confidence (σ=22).
- Cross-source generalization without priors is poor (22–37 mm MRE in LOSO experiments); Phase 0's anatomical analysis provides scanner-invariant features.