	---
	license: cc-by-nc-sa-4.0
	language:
	- en
	tags:
	- medical-imaging
	- cephalometric
	- landmark-detection
	- orthodontics
	- heatmap-regression
	- spatial-priors
	- onnx
	library_name: onnxruntime
	pipeline_tag: image-segmentation
	datasets:
	- custom
	metrics:
	- mre
	- sdr
	model-index:
	- name: CephTrace v4
	  results:
	  - task:
	      type: landmark-detection
	      name: Cephalometric Landmark Detection
	    dataset:
	      type: custom
	      name: Aggregated (ISBI 2015 + Aariz/CEPHA29 + DentalCepha)
	      config: 25-landmark
	      split: test
	    metrics:
	    - type: mean-radial-error
	      value: 1.050
	      name: MRE (mm)
	    - type: sdr-2mm
	      value: 87.8
	      name: SDR@2mm (%)
	---

	# CephTrace v4 — Anatomy-Guided Cephalometric Landmark Detection

	1.050 mm MRE across 25 landmarks on a 151-image held-out test set, using image-adaptive spatial priors generated by anatomical analysis of each radiograph.

	## Model Description

	CephTrace v4 is a two-stage pipeline for automatic cephalometric landmark detection from lateral skull radiographs:

	- Stage 0 (Anatomical Initialization): A multi-phase module that detects the soft-tissue profile, partitions the image into anatomical zones, extracts bony contours, derives anchor landmarks via geometric rules, and generates 25 per-landmark Gaussian attention maps — all adapted to each patient's individual anatomy.
	- Stage 1 (Heatmap Regression): An HRNet-W32 backbone (32M params) that accepts the 28-channel input (3 RGB + 25 attention maps) and outputs 25 landmark heatmaps at 256×256 resolution.

	The key innovation is that the attention priors are image-adaptive: each patient receives maps centered at their estimated anatomy, not fixed population-average positions. Controlled experiments show this reduces MRE by 30.9% compared to the same architecture without priors.
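To make the idea of an image-adaptive prior concrete, the following is a minimal NumPy sketch of a Gaussian attention map centered on an estimated landmark position. The function name `gaussian_prior`, the unnormalized peak value of 1.0, and the example coordinates are illustrative assumptions, not the released Stage 0 code.

```python
import numpy as np

def gaussian_prior(center_xy, sigma, size=256):
    """Return a (size, size) float32 map with an isotropic Gaussian at center_xy."""
    cx, cy = center_xy
    ys, xs = np.mgrid[0:size, 0:size]            # row and column index grids
    d2 = (xs - cx) ** 2 + (ys - cy) ** 2          # squared distance to the center
    return np.exp(-d2 / (2.0 * sigma ** 2)).astype(np.float32)

# Hypothetical per-patient estimates (x, y) in 256x256 coordinates, with a
# narrow sigma for a high-confidence landmark and a wide one for a low-confidence one.
estimated = {"S": ((90, 80), 6.0), "Ba": ((70, 140), 22.0)}
priors = np.stack([gaussian_prior(c, s) for c, s in estimated.values()])
print(priors.shape)  # (2, 256, 256)
```

In the full pipeline there would be 25 such maps, one per landmark, each centered where Stage 0 estimates that landmark to be for the current radiograph.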

	## ONNX Models

	All models are exported as ONNX (opset 14) for cross-platform inference.

	| File | Stage | Purpose | Size | Input | Output |
	|------|-------|---------|------|-------|--------|
	| `v4_stage0_profile.onnx` | 0A | Soft-tissue profile segmentation | 26.8 MB | `(1,1,512,512)` float32 | `(1,1,512,512)` sigmoid mask |
	| `z1_cranial_base_contours.onnx` | 0C | Cranial base contour segmentation | 26.8 MB | `(1,1,256,256)` float32 | `(1,1,256,256)` logits |
	| `z2_midface_contours.onnx` | 0C | Midface contour segmentation (palatal + upper incisor) | 26.8 MB | `(1,1,256,256)` float32 | `(1,2,256,256)` logits |
	| `z3_mandible_contours.onnx` | 0C | Mandible contour segmentation (border + symphysis + lower incisor) | 26.8 MB | `(1,1,256,256)` float32 | `(1,3,256,256)` logits |
	| `z4_posterior_contours.onnx` | 0C | Posterior contour segmentation (mandible + cranial base) | 26.8 MB | `(1,1,256,256)` float32 | `(1,2,256,256)` logits |
	| `phase0e_model.onnx` | 0E | Anchor → derived landmark MLP | 455 KB | `(1,14)` float32 | `(1,36)` float32 |
	| `v4_stage1.onnx` | 1 | HRNet-W32 heatmap regression | 130 MB | `(1,28,512,512)` float32 | `(1,25,256,256)` float32 |

	Total: 264 MB
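The Phase 0C contour models are listed as emitting raw logits rather than probabilities, so a post-processing step is needed before the masks can be used. The sketch below assumes a per-channel sigmoid with a 0.5 threshold; that choice, and the helper name `logits_to_masks`, are assumptions for illustration (the released pipeline may use a different activation or threshold).

```python
import numpy as np

def logits_to_masks(logits, threshold=0.5):
    """Convert (1, C, H, W) logits into boolean masks of shape (C, H, W)."""
    probs = 1.0 / (1.0 + np.exp(-logits.astype(np.float64)))  # element-wise sigmoid
    return probs[0] > threshold

# Dummy logits shaped like the z3_mandible_contours.onnx output (1, 3, 256, 256).
dummy = np.full((1, 3, 256, 256), -4.0, dtype=np.float32)
dummy[0, 1, 100:110, 50:60] = 4.0   # a 10x10 "contour" patch in channel 1
masks = logits_to_masks(dummy)
print(masks.shape, int(masks[1].sum()))  # (3, 256, 256) 100
```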

	## Pipeline Flow

	```
	Lateral Cephalogram (any resolution)
	│
	▼ resize to 512×512
	Phase 0A ──► Soft-tissue profile mask (Dice 0.80)
	│
	▼
	Phase 0B ──► 5 anatomical zones + 6 soft-tissue landmarks (geometric rules)
	│
	▼ per-zone CLAHE enhancement
	Phase 0C ──► Bony contour masks (4 zone-specific U-Nets)
	│
	▼ Douglas-Peucker simplification
	Phase 0D ──► 7 anchor landmarks (0.11 mm MRE, topological rules)
	│
	▼
	Phase 0E ──► 18 derived landmarks (MLP, 114K params)
	+ 25 Gaussian attention maps (256×256, 3-tier σ)
	│
	▼ bilinear upsample to 512, concat with RGB → 28 channels
	Stage 1 ──► 25 heatmaps (256×256) → peak decode → 25 landmarks
	```
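The flow above names Douglas-Peucker simplification as the step between the contour masks and the anchor landmarks. As a rough illustration of that algorithm only, here is a small recursive NumPy version operating on an ordered polyline. The function name, the `epsilon` value, and the toy contour are assumptions; how the released code extracts ordered contour points from the masks is not shown here.

```python
import numpy as np

def douglas_peucker(points, epsilon):
    """Simplify an ordered (N, 2) polyline, dropping points within epsilon of the chord."""
    points = np.asarray(points, dtype=np.float64)
    if len(points) < 3:
        return points
    start, end = points[0], points[-1]
    chord = end - start
    norm = np.hypot(chord[0], chord[1])
    rel = points[1:-1] - start
    if norm == 0.0:
        # Degenerate chord (closed curve): fall back to distance from the start point.
        dists = np.hypot(rel[:, 0], rel[:, 1])
    else:
        # Perpendicular distance of each interior point to the start-end line.
        dists = np.abs(chord[0] * rel[:, 1] - chord[1] * rel[:, 0]) / norm
    idx = int(np.argmax(dists)) + 1
    if dists[idx - 1] <= epsilon:
        return np.vstack([start, end])
    left = douglas_peucker(points[: idx + 1], epsilon)
    right = douglas_peucker(points[idx:], epsilon)
    return np.vstack([left[:-1], right])   # avoid duplicating the split point

contour = [(0, 0), (1, 0.05), (2, 0), (3, 3), (4, 0)]
simplified = douglas_peucker(contour, epsilon=0.5)
print(simplified.tolist())
```

The recursion is fine for short contours; for very long ones an iterative, stack-based variant would avoid Python's recursion limit.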

	Inference time: ~410 ms total (Stage 0: ~40 ms, Stage 1: ~350 ms) on A100 GPU.

	## Landmark Set (25 landmarks, CANONICAL_25 order)

	```
	 0: S (Sella)          1: N (Nasion)         2: Or (Orbitale)
	 3: Po (Porion)        4: ANS                5: PNS
	 6: A (Subspinale)     7: B (Supramentale)   8: Pog (Pogonion)
	 9: Gn (Gnathion)     10: Me (Menton)       11: Go (Gonion)
	12: Ar (Articulare)   13: Co (Condylion)    14: U1_tip
	15: U1_root           16: L1_tip            17: L1_root
	18: UL (Upper Lip)    19: LL (Lower Lip)    20: Pm (Pterygomaxillare)
	21: Ba (Basion)       22: Pog_soft          23: Sn (Subnasale)
	24: Prn (Pronasale)
	```
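Since the model's 25 output channels follow this order, it can be convenient to keep the order in code and look landmarks up by name. The list below transcribes the table above; the variable and helper names are illustrative and may not match the identifiers used in the project repository.

```python
# Channel order of the 25 output heatmaps, transcribed from the list above.
CANONICAL_25 = [
    "S", "N", "Or", "Po", "ANS", "PNS", "A", "B", "Pog", "Gn",
    "Me", "Go", "Ar", "Co", "U1_tip", "U1_root", "L1_tip", "L1_root",
    "UL", "LL", "Pm", "Ba", "Pog_soft", "Sn", "Prn",
]
LANDMARK_INDEX = {name: i for i, name in enumerate(CANONICAL_25)}

def name_landmarks(points):
    """Pair a list of 25 (x, y) tuples with their landmark names."""
    if len(points) != len(CANONICAL_25):
        raise ValueError(f"expected {len(CANONICAL_25)} points, got {len(points)}")
    return dict(zip(CANONICAL_25, points))

print(LANDMARK_INDEX["Me"], LANDMARK_INDEX["Ba"])  # 10 21
```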

	## Performance

	### Controlled Ablation (151-image held-out test set)

	| Configuration | Input | MRE (mm) | SDR@2mm |
	|---|---|---|---|
	| HRNet backbone (no priors) | 3-ch | 1.520 | 86.6% |
	| HRNet + Phase 0E priors | 28-ch | 1.050 | 87.8% |
	| Improvement | | 0.470 (30.9%) | +1.2 pp |

	Both configurations use the same 1,201 training images, architecture, and training recipe; the only variable is the presence of the prior channels.
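For readers unfamiliar with the two metrics, the sketch below shows one common way to compute mean radial error (MRE) and the successful detection rate (SDR) from predicted and ground-truth coordinates. The function name, the array layout, and the use of `<=` at the threshold are assumptions; the project's evaluation script may differ in these details.

```python
import numpy as np

def mre_and_sdr(pred_px, gt_px, mm_per_px, threshold_mm=2.0):
    """pred_px, gt_px: (N, L, 2) pixel coordinates; mm_per_px: scalar or (N,) spacing."""
    pred_px = np.asarray(pred_px, dtype=np.float64)
    gt_px = np.asarray(gt_px, dtype=np.float64)
    spacing = np.asarray(mm_per_px, dtype=np.float64).reshape(-1, 1)
    # Euclidean distance per landmark, converted from pixels to millimetres.
    radial_mm = np.linalg.norm(pred_px - gt_px, axis=-1) * spacing   # (N, L)
    mre = float(radial_mm.mean())
    sdr = float((radial_mm <= threshold_mm).mean() * 100.0)
    return mre, sdr

# Toy example: one image, two landmarks, 0.1 mm per pixel.
gt = [[[0, 0], [0, 0]]]
pred = [[[3, 4], [0, 30]]]          # errors of 5 px and 30 px
mre, sdr = mre_and_sdr(pred, gt, mm_per_px=0.1)
print(round(mre, 3), sdr)
```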

	### Prior Ablation

	| Configuration | MRE (mm) | vs. No Priors |
	|---|---|---|
	| Random priors (shuffled channels) | 2.240 | +15.6% worse |
	| No priors (baseline) | 1.938 | n/a |
	| Fixed textbook priors | 1.869 | −3.6% (marginal) |
	| Image-adaptive priors (Phase 0E) | 1.043 | −46.2% |
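The percentages in the last column are relative changes against the no-priors baseline of this table. The short check below recomputes them from the MRE values; the dictionary keys are shortened labels chosen for this example.

```python
baseline = 1.938  # "No priors (baseline)" MRE in mm, from the table above

configs = {
    "Random priors": 2.240,
    "Fixed textbook priors": 1.869,
    "Image-adaptive priors": 1.043,
}

def relative_change(mre, reference):
    """Signed percentage change of mre relative to reference (negative = better)."""
    return (mre - reference) / reference * 100.0

changes = {name: round(relative_change(v, baseline), 1) for name, v in configs.items()}
for name, pct in changes.items():
    print(f"{name}: {pct:+.1f}%")
```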

	### Attention Map Confidence Tiers

	| Tier | σ (at 256×256) | Landmarks | Mean MRE Change |
	|---|---|---|---|
	| High | 5–7 | S, N, Me, ANS, Prn, Sn | −0.74 mm |
	| Medium | 8–13 | Go, Gn, Pog, Or, UL, LL, Pog_soft, A | −0.44 mm |
	| Low | 18–22 | Po, Co, B, PNS, U1_root, L1_root, Ba, Pm | −0.17 mm |

	### Clinical Reliability

	- Vertical skeletal classification (FMA): Cohen's κ = 0.78 (substantial agreement)
	- 20/25 landmarks improve with priors; 1 degrades (Basion, lowest confidence tier)
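Cohen's κ measures agreement between two sets of categorical labels after correcting for chance agreement. The sketch below implements the standard unweighted formula in plain Python; the class labels and the two label sequences are invented for illustration and are not the study's data.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa between two equal-length label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement from the marginal class frequencies of each rater.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical class
    return (observed - expected) / (1.0 - expected)

# Hypothetical vertical-pattern classes from ground-truth vs. predicted landmarks.
reference = ["low", "normal", "normal", "high", "normal", "low"]
predicted = ["low", "normal", "high", "high", "normal", "normal"]
kappa = cohens_kappa(reference, predicted)
print(round(kappa, 3))
```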

	## Usage

	```python
	import onnxruntime as ort
	import numpy as np
	import cv2

	# Load the Stage 1 model
	sess = ort.InferenceSession("v4_stage1.onnx")

	# Prepare the RGB part of the input. cv2.imread returns BGR (or None on failure),
	# so convert to RGB; for grayscale radiographs the three channels are identical.
	image = cv2.imread("cephalogram.jpg")
	if image is None:
	    raise FileNotFoundError("Could not read cephalogram.jpg")
	image_512 = cv2.resize(image, (512, 512))
	rgb = cv2.cvtColor(image_512, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0  # (512, 512, 3)
	rgb = np.transpose(rgb, (2, 0, 1))  # (3, 512, 512)

	# attention_maps must be supplied by the Stage 0 pipeline as a (25, 512, 512) array
	# (see the Stage 0 inference code for generating these)
	attention_maps = np.asarray(attention_maps, dtype=np.float32)
	input_28ch = np.concatenate([rgb, attention_maps], axis=0)  # (28, 512, 512)
	input_tensor = input_28ch[np.newaxis]  # (1, 28, 512, 512)

	# Run inference
	input_name = sess.get_inputs()[0].name
	heatmaps = sess.run(None, {input_name: input_tensor})[0]  # (1, 25, 256, 256)

	# Decode landmarks from heatmap peaks
	landmarks = []
	for i in range(25):
	    hm = heatmaps[0, i]
	    y, x = np.unravel_index(np.argmax(hm), hm.shape)
	    # Scale from heatmap (256) to image (512) coordinates
	    landmarks.append((int(x) * 2, int(y) * 2))
	```
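The usage snippet returns coordinates in the 512×512 resized frame. Because the resize does not preserve aspect ratio, mapping back to the original radiograph needs a separate scale factor per axis. A minimal sketch follows; the helper name and the example image size are hypothetical.

```python
def to_original_coords(landmarks_512, orig_width, orig_height, model_size=512):
    """Map (x, y) points from the model_size x model_size frame to original pixels."""
    sx = orig_width / model_size    # horizontal scale factor
    sy = orig_height / model_size   # vertical scale factor
    return [(x * sx, y * sy) for x, y in landmarks_512]

# Hypothetical original radiograph of 2000 x 2400 pixels (width x height).
points = to_original_coords([(256, 128), (0, 512)], orig_width=2000, orig_height=2400)
print(points)  # [(1000.0, 600.0), (0.0, 2400.0)]
```

Multiplying the result by the scanner's pixel spacing (mm per pixel), where known, gives positions in millimetres.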

	## Training Data

	Aggregated from three public sources (1,502 total images):

	| Source | Images | Landmarks | Scanner(s) |
	|---|---|---|---|
	| [ISBI 2015](https://www-o.ntust.edu.tw/~cweiwang/ISBI2015/challenge1/) | 400 | 19 | Soredex CRANEX |
	| [Aariz/CEPHA29](https://doi.org/10.1038/s41597-025-05542-3) | 1,000 | 29 | 7+ device types |
	| DentalCepha | 102 | 19 | Mixed |

	Split: 1,201 train / 150 validation / 151 test (stratified by source, seed=42).
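A source-stratified split with a fixed seed can be sketched as follows. This is an illustration of the general technique only: the 80/10/10 fractions, the rounding rule, the source keys, and the file naming are assumptions, and the sketch does not reproduce the exact 1,201/150/151 partition used here.

```python
import random

def stratified_split(items_by_source, fractions=(0.8, 0.1, 0.1), seed=42):
    """Shuffle each source independently and cut it into train/val/test parts."""
    rng = random.Random(seed)
    train, val, test = [], [], []
    for source in sorted(items_by_source):      # fixed order keeps the split reproducible
        items = list(items_by_source[source])
        rng.shuffle(items)
        n_train = round(len(items) * fractions[0])
        n_val = round(len(items) * fractions[1])
        train += items[:n_train]
        val += items[n_train:n_train + n_val]
        test += items[n_train + n_val:]         # remainder goes to test
    return train, val, test

# Placeholder image IDs with the per-source counts from the table above.
sources = {"ISBI2015": 400, "Aariz": 1000, "DentalCepha": 102}
items = {name: [f"{name}_{i:04d}" for i in range(n)] for name, n in sources.items()}
train, val, test = stratified_split(items)
print(len(train), len(val), len(test))
```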

	## Citation

	```bibtex
	@article{mohapatra2025cephtrace,
	title={CephTrace: Anatomy-Guided Spatial Attention Priors for
	Sub-Millimeter Cephalometric Landmark Detection},
	author={Mohapatra, Sidhartha and Mohanty, Pallavi},
	journal={arXiv preprint arXiv:2605.03358},
	year={2026},
	url={https://arxiv.org/abs/2605.03358}
	}
	```

	## Links

	| Resource | URL |
	|---|---|
	| Paper | [arXiv:2605.03358](https://arxiv.org/abs/2605.03358) |
	| Code | [github.com/sidwiz/cephtrace-research](https://github.com/sidwiz/cephtrace-research) |
	| Data & Weights | [Zenodo DOI 10.5281/zenodo.20032162](https://doi.org/10.5281/zenodo.20032162) |
	| Website | [cephtrace.com](https://cephtrace.com) |

	## License

	This work is licensed under [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/). Commercial use requires a separate license — contact research@cephtrace.com.

	Three U.S. provisional patent applications are pending (#64/037,246; #64/037,252; #64/039,042).

	## Limitations

	- Trained on 2D lateral cephalograms only; not validated on 3D CBCT or PA cephalograms.
	- Phase 0A requires a visible soft-tissue profile; accuracy may degrade on severely overexposed or cropped images.
	- Basion (Ba) accuracy degrades slightly with priors due to low Phase 0E confidence (σ=22).
	- Without priors, cross-source generalization is poor (22–37 mm MRE in leave-one-source-out (LOSO) experiments); Phase 0's anatomical analysis provides scanner-invariant features.