---
license: cc-by-nc-sa-4.0
language:
- en
tags:
- medical-imaging
- cephalometric
- landmark-detection
- orthodontics
- heatmap-regression
- spatial-priors
- onnx
library_name: onnxruntime
pipeline_tag: image-segmentation
datasets:
- custom
metrics:
- mre
- sdr
model-index:
- name: CephTrace v4
  results:
  - task:
      type: landmark-detection
      name: Cephalometric Landmark Detection
    dataset:
      type: custom
      name: Aggregated (ISBI 2015 + Aariz/CEPHA29 + DentalCepha)
      config: 25-landmark
      split: test
    metrics:
    - type: mean-radial-error
      value: 1.050
      name: MRE (mm)
    - type: sdr-2mm
      value: 87.8
      name: SDR@2mm (%)
---

# CephTrace v4: Anatomy-Guided Cephalometric Landmark Detection

**1.050 mm MRE across 25 landmarks** on a 151-image held-out test set, using image-adaptive spatial priors generated by anatomical analysis of each radiograph.

## Model Description

CephTrace v4 is a two-stage pipeline for automatic cephalometric landmark detection from lateral skull radiographs:

- **Stage 0 (Anatomical Initialization):** A multi-phase module that detects the soft-tissue profile, partitions the image into anatomical zones, extracts bony contours, derives anchor landmarks via geometric rules, and generates 25 per-landmark Gaussian attention maps, all adapted to each patient's individual anatomy.
- **Stage 1 (Heatmap Regression):** An HRNet-W32 backbone (32M params) that accepts a 28-channel input (3 RGB + 25 attention maps) and outputs 25 landmark heatmaps at 256×256 resolution.

The key innovation is that the attention priors are **image-adaptive**: each patient receives maps centered at *their* estimated anatomy, not fixed population-average positions. Controlled experiments show this reduces MRE by 30.9% compared to the same architecture without priors.
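
As an illustration of what a set of image-adaptive priors can look like, the sketch below renders one Gaussian map per landmark, centered at a position estimated for that image. The function name `gaussian_attention_maps`, the peak normalization to 1, and the example centers and σ values are assumptions for illustration; this is not the released Phase 0E code.

```python
import numpy as np


def gaussian_attention_maps(centers_xy, sigmas, size=256):
    """Render one 2D Gaussian per landmark on a size x size grid.

    centers_xy: (K, 2) estimated (x, y) positions in map pixels.
    sigmas:     (K,) standard deviation per landmark, in map pixels.
    Returns a float32 array of shape (K, size, size) with peak value 1.
    """
    centers_xy = np.asarray(centers_xy, dtype=np.float32)
    sigmas = np.asarray(sigmas, dtype=np.float32)
    ys, xs = np.mgrid[0:size, 0:size].astype(np.float32)
    # Broadcast the pixel grid against each landmark center: (K, size, size)
    dx = xs[None] - centers_xy[:, 0, None, None]
    dy = ys[None] - centers_xy[:, 1, None, None]
    d2 = dx ** 2 + dy ** 2
    return np.exp(-d2 / (2.0 * sigmas[:, None, None] ** 2)).astype(np.float32)


# Hypothetical example: one tight prior and one broad prior
maps = gaussian_attention_maps([(120, 80), (60, 200)], sigmas=[6, 22])
print(maps.shape)  # (2, 256, 256)
```

A smaller σ concentrates the prior around the estimate, while a larger σ leaves the network more freedom to move away from it.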

## ONNX Models

All models are exported as ONNX (opset 14) for cross-platform inference.

| File | Stage | Purpose | Size | Input | Output |
|------|-------|---------|------|-------|--------|
| `v4_stage0_profile.onnx` | 0A | Soft-tissue profile segmentation | 26.8 MB | `(1,1,512,512)` float32 | `(1,1,512,512)` sigmoid mask |
| `z1_cranial_base_contours.onnx` | 0C | Cranial base contour segmentation | 26.8 MB | `(1,1,256,256)` float32 | `(1,1,256,256)` logits |
| `z2_midface_contours.onnx` | 0C | Midface contour segmentation (palatal + upper incisor) | 26.8 MB | `(1,1,256,256)` float32 | `(1,2,256,256)` logits |
| `z3_mandible_contours.onnx` | 0C | Mandible contour segmentation (border + symphysis + lower incisor) | 26.8 MB | `(1,1,256,256)` float32 | `(1,3,256,256)` logits |
| `z4_posterior_contours.onnx` | 0C | Posterior contour segmentation (mandible + cranial base) | 26.8 MB | `(1,1,256,256)` float32 | `(1,2,256,256)` logits |
| `phase0e_model.onnx` | 0E | Anchor → derived landmark MLP | 455 KB | `(1,14)` float32 | `(1,36)` float32 |
| `v4_stage1.onnx` | 1 | HRNet-W32 heatmap regression | 130 MB | `(1,28,512,512)` float32 | `(1,25,256,256)` float32 |

**Total: 264 MB**

## Pipeline Flow

```
Lateral Cephalogram (any resolution)
    │
    ▼ resize to 512×512
Phase 0A ──► Soft-tissue profile mask (Dice 0.80)
    │
    ▼
Phase 0B ──► 5 anatomical zones + 6 soft-tissue landmarks (geometric rules)
    │
    ▼ per-zone CLAHE enhancement
Phase 0C ──► Bony contour masks (4 zone-specific U-Nets)
    │
    ▼ Douglas-Peucker simplification
Phase 0D ──► 7 anchor landmarks (0.11 mm MRE, topological rules)
    │
    ▼
Phase 0E ──► 18 derived landmarks (MLP, 114K params)
             + 25 Gaussian attention maps (256×256, 3-tier σ)
    │
    ▼ bilinear upsample to 512, concat with RGB → 28 channels
Stage 1  ──► 25 heatmaps (256×256) → peak decode → 25 landmarks
```

**Inference time:** ~410 ms total (Stage 0: ~40 ms, Stage 1: ~350 ms) on A100 GPU.
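
The last hand-off in the flow above (upsample the 25 attention maps to 512, then concatenate with RGB) can be sketched in NumPy as follows. The helper names `upsample_bilinear` and `build_stage1_input` are hypothetical, and the half-pixel sampling convention is an assumption; the released pipeline may use a different resize routine.

```python
import numpy as np


def upsample_bilinear(maps, out_size):
    """Bilinearly resize (K, H, W) maps to (K, out_size, out_size).

    Assumes half-pixel-centered sampling with edge clamping.
    """
    _, h, w = maps.shape
    # Source coordinates of each output pixel center
    ys = np.clip((np.arange(out_size) + 0.5) * h / out_size - 0.5, 0, h - 1)
    xs = np.clip((np.arange(out_size) + 0.5) * w / out_size - 0.5, 0, w - 1)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[None, :, None]
    wx = (xs - x0)[None, None, :]
    # Interpolate along x on the two neighboring rows, then along y
    top = maps[:, y0][:, :, x0] * (1 - wx) + maps[:, y0][:, :, x1] * wx
    bot = maps[:, y1][:, :, x0] * (1 - wx) + maps[:, y1][:, :, x1] * wx
    return (top * (1 - wy) + bot * wy).astype(np.float32)


def build_stage1_input(rgb_512, attention_256):
    """Stack (3, 512, 512) RGB with 25 upsampled maps into (1, 28, 512, 512)."""
    attention_512 = upsample_bilinear(attention_256, 512)
    stacked = np.concatenate([rgb_512, attention_512], axis=0)
    return stacked[np.newaxis].astype(np.float32)


# Placeholder arrays stand in for a real image and real Stage 0 output
rgb = np.zeros((3, 512, 512), dtype=np.float32)
attention = np.ones((25, 256, 256), dtype=np.float32)
x = build_stage1_input(rgb, attention)
print(x.shape)  # (1, 28, 512, 512)
```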

## Landmark Set (25 landmarks, CANONICAL_25 order)

```
 0: S (Sella)           1: N (Nasion)          2: Or (Orbitale)
 3: Po (Porion)         4: ANS                 5: PNS
 6: A (Subspinale)      7: B (Supramentale)    8: Pog (Pogonion)
 9: Gn (Gnathion)      10: Me (Menton)        11: Go (Gonion)
12: Ar (Articulare)    13: Co (Condylion)     14: U1_tip
15: U1_root            16: L1_tip             17: L1_root
18: UL (Upper Lip)     19: LL (Lower Lip)     20: Pm (Pterygomaxillare)
21: Ba (Basion)        22: Pog_soft           23: Sn (Subnasale)
24: Prn (Pronasale)
```

## Performance

### Controlled Ablation (151-image held-out test set)

| Configuration | Input | MRE (mm) | SDR@2mm |
|---|---|---|---|
| HRNet backbone (no priors) | 3-ch | 1.520 | 86.6% |
| **HRNet + Phase 0E priors** | **28-ch** | **1.050** | **87.8%** |
| **Improvement** | | **0.470 (30.9%)** | **+1.2 pp** |

Same 1,201 training images, architecture, and recipe. Only variable: prior channels.
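
For readers unfamiliar with the two metrics, here is a minimal sketch of how MRE and SDR are commonly computed from predicted and reference landmark coordinates. The function name `mre_and_sdr`, the `<=` threshold convention, and the example coordinates and pixel spacing are assumptions, not values from the CephTrace evaluation code.

```python
import math


def mre_and_sdr(pred, truth, mm_per_pixel, threshold_mm=2.0):
    """Return (MRE in mm, SDR in %) for paired lists of (x, y) landmarks."""
    errors_mm = [
        math.hypot(px - tx, py - ty) * mm_per_pixel
        for (px, py), (tx, ty) in zip(pred, truth)
    ]
    # MRE: mean Euclidean distance; SDR: share of errors within the threshold
    mre = sum(errors_mm) / len(errors_mm)
    sdr = 100.0 * sum(e <= threshold_mm for e in errors_mm) / len(errors_mm)
    return mre, sdr


# Hypothetical values: 4 landmarks at 0.1 mm per pixel
pred = [(100, 100), (203, 150), (300, 340), (50, 80)]
truth = [(100, 100), (200, 146), (300, 300), (50, 90)]
mre, sdr = mre_and_sdr(pred, truth, mm_per_pixel=0.1)
print(f"MRE = {mre:.3f} mm, SDR@2mm = {sdr:.1f}%")
```

The radial errors here are 0, 0.5, 4.0, and 1.0 mm, so three of the four landmarks fall within 2 mm.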

### Prior Ablation

| Configuration | MRE (mm) | vs. No Priors |
|---|---|---|
| Random priors (shuffled channels) | 2.240 | +15.6% worse |
| No priors (baseline) | 1.938 | n/a |
| Fixed textbook priors | 1.869 | −3.6% (marginal) |
| **Image-adaptive priors (Phase 0E)** | **1.043** | **−46.2%** |
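
The percentages in the last column are relative changes in MRE against the no-priors row. The short check below reproduces them from the table values; the helper name `relative_change_pct` is hypothetical.

```python
def relative_change_pct(mre, baseline):
    """Percent change of `mre` relative to `baseline` (negative means lower error)."""
    return 100.0 * (mre - baseline) / baseline


baseline = 1.938  # "No priors (baseline)" row
for name, mre in [("random", 2.240), ("fixed", 1.869), ("adaptive", 1.043)]:
    print(f"{name}: {relative_change_pct(mre, baseline):+.1f}%")
# random: +15.6%
# fixed: -3.6%
# adaptive: -46.2%
```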

### Attention Map Confidence Tiers

| Tier | σ (at 256×256) | Landmarks | Mean Improvement |
|---|---|---|---|
| High | 5–7 | S, N, Me, ANS, Prn, Sn | −0.74 mm |
| Medium | 8–13 | Go, Gn, Pog, Or, UL, LL, Pog', A | −0.44 mm |
| Low | 18–22 | Po, Co, B, PNS, U1r, L1r, Ba, Pm | −0.17 mm |

### Clinical Reliability

- Vertical skeletal classification (FMA): Cohen's κ = 0.78 (substantial agreement)
- 20/25 landmarks improve with priors; 1 degrades (Basion, lowest confidence tier)
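
Cohen's κ compares the observed agreement between two sets of class labels with the agreement expected by chance. The sketch below shows the standard formula; the class names and label lists are hypothetical and are not the study data.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: product of marginal counts, summed over classes
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Hypothetical FMA classes from reference vs. predicted landmarks
reference = ["low", "normal", "normal", "high", "normal", "high", "low", "normal"]
predicted = ["low", "normal", "normal", "high", "high", "high", "low", "normal"]
print(round(cohens_kappa(reference, predicted), 3))
```

Note that the formula is undefined when chance agreement equals 1, for example when both raters assign a single class to every case.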

## Usage

```python
import onnxruntime as ort
import numpy as np
import cv2

# Load Stage 1 model
sess = ort.InferenceSession("v4_stage1.onnx")

# Prepare input (28 channels: 3 RGB + 25 attention maps from Stage 0)
image = cv2.imread("cephalogram.jpg")
if image is None:
    raise FileNotFoundError("cephalogram.jpg could not be read")
image_512 = cv2.resize(image, (512, 512))
# cv2.imread returns BGR; convert to match the documented RGB channel order
image_512 = cv2.cvtColor(image_512, cv2.COLOR_BGR2RGB)
rgb = image_512.astype(np.float32) / 255.0  # (512, 512, 3)
rgb = np.transpose(rgb, (2, 0, 1))  # (3, 512, 512)

# attention_maps: (25, 512, 512) array produced by the Stage 0 pipeline
# (see the Stage 0 inference code for generating these)
input_28ch = np.concatenate([rgb, attention_maps], axis=0)  # (28, 512, 512)
input_tensor = input_28ch[np.newaxis].astype(np.float32)  # (1, 28, 512, 512)

# Run inference
input_name = sess.get_inputs()[0].name
heatmaps = sess.run(None, {input_name: input_tensor})[0]  # (1, 25, 256, 256)

# Decode landmarks from heatmap peaks
landmarks = []
for i in range(25):
    hm = heatmaps[0, i]
    y, x = np.unravel_index(np.argmax(hm), hm.shape)
    # Scale from heatmap (256) to image (512) coordinates
    landmarks.append((x * 2, y * 2))
```
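
The decoded coordinates above live in the 512×512 model frame. Because the resize to 512×512 does not preserve aspect ratio, mapping them back to the source image needs separate x and y scale factors. A minimal sketch, with the hypothetical helper name `to_original_coords` and an example image size chosen only for illustration:

```python
def to_original_coords(landmarks_512, original_width, original_height):
    """Rescale (x, y) points from the 512x512 model frame to source-image pixels."""
    sx = original_width / 512.0
    sy = original_height / 512.0
    return [(x * sx, y * sy) for x, y in landmarks_512]


# Example: a 1935x2400 radiograph (width x height), values chosen for illustration
points = to_original_coords([(256, 256), (0, 512)], 1935, 2400)
print(points)  # [(967.5, 1200.0), (0.0, 2400.0)]
```

Multiplying the result by the image's pixel spacing (mm per pixel), when known, gives positions in millimeters.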

## Training Data

Aggregated from three public sources (1,502 total images):

| Source | Images | Landmarks | Scanner(s) |
|---|---|---|---|
| [ISBI 2015](https://www-o.ntust.edu.tw/~cweiwang/ISBI2015/challenge1/) | 400 | 19 | Soredex CRANEX |
| [Aariz/CEPHA29](https://doi.org/10.1038/s41597-025-05542-3) | 1,000 | 29 | 7+ device types |
| DentalCepha | 102 | 19 | Mixed |

Split: 1,201 train / 150 validation / 151 test (stratified by source, seed=42).
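
One simple scheme that reproduces these counts is an 80/10/10 split applied per source, flooring the train and validation sizes and giving the remainder to test. The sketch below illustrates that scheme with placeholder IDs; the function name `stratified_split`, the rounding rule, and the use of Python's `random` module are assumptions, so the actual split procedure and image assignments may differ.

```python
import random


def stratified_split(ids_by_source, seed=42):
    """Shuffle each source, then split 80/10/10 (floor; remainder goes to test)."""
    rng = random.Random(seed)
    split = {"train": [], "val": [], "test": []}
    for source in sorted(ids_by_source):
        ids = list(ids_by_source[source])
        rng.shuffle(ids)
        n_train = len(ids) * 8 // 10
        n_val = len(ids) // 10
        split["train"] += ids[:n_train]
        split["val"] += ids[n_train:n_train + n_val]
        split["test"] += ids[n_train + n_val:]
    return split


# Placeholder IDs sized like the three sources in the table above
sources = {
    "isbi2015": [f"isbi_{i}" for i in range(400)],
    "aariz": [f"aariz_{i}" for i in range(1000)],
    "dentalcepha": [f"dc_{i}" for i in range(102)],
}
split = stratified_split(sources)
print({name: len(ids) for name, ids in split.items()})
# {'train': 1201, 'val': 150, 'test': 151}
```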

## Citation

```bibtex
@article{mohapatra2025cephtrace,
  title={CephTrace: Anatomy-Guided Spatial Attention Priors for
         Sub-Millimeter Cephalometric Landmark Detection},
  author={Mohapatra, Sidhartha and Mohanty, Pallavi},
  journal={arXiv preprint arXiv:2605.03358},
  year={2025},
  url={https://arxiv.org/abs/2605.03358}
}
```

## Links

| Resource | URL |
|---|---|
| **Paper** | [arXiv:2605.03358](https://arxiv.org/abs/2605.03358) |
| **Code** | [github.com/sidwiz/cephtrace-research](https://github.com/sidwiz/cephtrace-research) |
| **Data & Weights** | [Zenodo DOI 10.5281/zenodo.20032162](https://doi.org/10.5281/zenodo.20032162) |
| **Website** | [cephtrace.com](https://cephtrace.com) |

## License

This work is licensed under [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/). Commercial use requires a separate license; contact research@cephtrace.com.

Three U.S. provisional patent applications are pending (#64/037,246; #64/037,252; #64/039,042).

## Limitations

- Trained on 2D lateral cephalograms only; not validated on 3D CBCT or PA cephalograms.
- Phase 0A requires a visible soft-tissue profile; accuracy may degrade on severely overexposed or cropped images.
- Basion (Ba) accuracy degrades slightly with priors due to low Phase 0E confidence (σ=22).
- Cross-source generalization without priors is poor (22–37 mm MRE in LOSO experiments); Phase 0's anatomical analysis provides scanner-invariant features.