---
license: cc-by-nc-nd-4.0
library_name: pytorch
tags:
  - medical-imaging
  - 3d-cnn
  - ultrasound
  - focused-ultrasound
  - transcranial-ultrasound
  - reproduction
datasets:
  - vinkle-srivastav/TFUScapes
language:
  - en
---

# DeepTFUS: base (run-1 reproduction)

*A reproduction attempt of DeepTFUS, proposed by [Srivastav et al. (arXiv:2505.12998)](https://arxiv.org/abs/2505.12998).*

This is the from-scratch baseline: 50 epochs on the paper recipe
(weighted-MSE + λ·gradient-L1, no focal-position aux), `base_width=16`
(3.4 M params), `pure-bf16`, `batch=4` at 256³ resolution. Given a 3D
head CT and a transducer placement, the model predicts the resulting
in-skull pressure field in <1 s on an H100 (≈ 50× faster than the k-Wave
physics simulator the dataset was generated from).
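
As a rough sketch of the objective named above (weighted-MSE + λ·gradient-L1), one plausible reading is the form below. The per-voxel weights $w_i$, the normalization by the voxel count $N$, and the use of a spatial gradient $\nabla$ of the pressure field are assumptions made for illustration; the paper and the model code are the authority on the exact definition.

$$
\mathcal{L} = \frac{1}{N}\sum_{i=1}^{N} w_i\,(\hat{p}_i - p_i)^2 \;+\; \lambda\,\frac{1}{N}\sum_{i=1}^{N} \lVert \nabla \hat{p}_i - \nabla p_i \rVert_1
$$

Here $\hat{p}$ is the predicted pressure field and $p$ is the simulated reference.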

⭐ Partial reproduction: matches the paper on `relative_l2`, but not on
`focal_position_error_mm` (~2× worse). `max_pressure_error` is mixed: the
mean is within the paper's std, the median is higher. This gap motivated
the 5 fine-tune variants in this model collection.

## Test results (n = 597 held-out CT × placement combinations)

| metric | paper | base (this model) | reproduced? |
|---|---:|---:|---|
| `relative_l2` mean ± std | 0.414 ± 0.086 | **0.384 ± 0.078** | ✅ Yes (slightly beats paper) |
| `relative_l2` median | 0.394 | **0.369** | ✅ |
| `focal_position_error_mm` mean ± std | 2.89 ± 2.14 | 6.49 ± 4.58 | ❌ No (~2.25× worse mean) |
| `focal_position_error_mm` median | 2.45 | 5.15 | ❌ |
| `max_pressure_error` mean ± std | 0.199 ± 0.158 | 0.225 ± 0.116 | ✅ Yes (within paper's std) |
| `max_pressure_error` median | 0.166 | 0.217 | (slightly above paper) |
| `focal_pressure_error` median | not reported | 0.528 | n/a |
| `focal_iou_fwhm` median | not reported | 0.143 | n/a |
| `inference_latency_s` (b=1, H100) | 11.4 (RTX 4090) | 0.233 | 49× faster (different HW) |
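
This card does not define the metrics in the table, so the sketch below shows one common way the first two could be computed. Both definitions are assumptions: `relative_l2` is taken as the L2 norm of the error divided by the L2 norm of the reference, and `focal_position_error_mm` as the Euclidean distance between the peak-pressure voxels, scaled by voxel spacing. The function names and the `spacing_mm` parameter are hypothetical; the model code repository has the actual evaluation code.

```python
import math


def relative_l2(pred, target):
    """Assumed definition: ||pred - target||_2 / ||target||_2 over flattened voxels."""
    num = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)))
    den = math.sqrt(sum(t ** 2 for t in target))
    return num / den


def focal_position_error_mm(pred_peak, target_peak, spacing_mm):
    """Assumed definition: distance between peak-pressure voxel indices, in mm.

    pred_peak / target_peak are (i, j, k) voxel indices of the pressure maximum;
    spacing_mm is the (hypothetical) voxel size along each axis.
    """
    pred_mm = [i * s for i, s in zip(pred_peak, spacing_mm)]
    target_mm = [i * s for i, s in zip(target_peak, spacing_mm)]
    return math.dist(pred_mm, target_mm)
```

With these definitions, a prediction identical to the reference scores a `relative_l2` of 0, and an all-zero prediction scores 1.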

## Other variants and discussion

See the [Collection](https://huggingface.co/collections/masonwang025/deeptfus-reproduction-6a03e39286a09470b960511f)
for the 5 fine-tune variants built from this base ckpt, and the
[project page](https://masonjwang.com/projects/reproducing-deeptfus)
for the full reproduction story, interactive viewer, and discussion of
trade-offs.

## Usage

```python
from huggingface_hub import hf_hub_download
import torch

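# weights_only=False unpickles arbitrary Python objects, so only load
# checkpoints from a source you trust.
ckpt = torch.load(
    hf_hub_download("masonwang025/deeptfus-base", "ckpt_best.pt"),
    map_location="cpu", weights_only=False,
)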
# ckpt['model']  : state_dict for the model defined in masonwang025/deeptfus repo
# ckpt['config'] : training config (architecture knobs + train hyperparams)
# ckpt['epoch']  : 43 (best by val_rel_l2)
```

Model code: [github.com/masonwang025/deeptfus](https://github.com/masonwang025/deeptfus).

## Citation & License

Paper: Srivastav et al., [arXiv:2505.12998](https://arxiv.org/abs/2505.12998), 2025.

License: [CC-BY-NC-ND-4.0](https://creativecommons.org/licenses/by-nc-nd/4.0/),
matching the TFUScapes dataset license.