derollins committed on
Commit
a59724d
·
verified ·
1 Parent(s): ff669bc

Update README.md

Files changed (1)
  1. README.md +79 -47
README.md CHANGED
@@ -15,10 +15,16 @@ tags:
15
  ---
16
  # afMLevel-background-unet
17
 
18
- This U‑Net model predicts tilt, scanner drift, and other large‑scale imaging artifacts present in Atomic Force Microscopy (AFM) height maps.
19
  It outputs a **background** image, the same size and scale as the raw AFM image, which can be subtracted (via the accompanying [afMLevel](https://github.com/mayatek1/afMLevel)
20
  code) to produce a levelled height map.
21
 
 
 
 
 
 
 
22
  ## Model Details
23
 
24
  This model is part of the [afMLevel](https://github.com/mayatek1/afMLevel) project.
@@ -29,16 +35,22 @@ This model is a 7‑layer **U‑Net** architecture implemented in **PyTorch**, t
29
 
30
  The afMLevel repository includes tools for:
31
 
 
32
  - running inference,
33
- - subtracting the predicted background,
34
- - integrating the model into AFM anasis workflows.
35
- ---
 
 
 
36
  - **Developed by:** Maya Tekchandani
37
  - **Maintained by:** Dr Daniel E. Rollins, Dr George R. Heath
38
  - **Principal Investigator:** Dr George R. Heath
39
- - **Affiliation:** University of Leeds
40
- - **Funded by [optional]:** [More Information Needed]
41
- - **Shared by:** [Heath-AFMab](https://heath-afm-lab.github.io/)
 
 
42
  - **Model type:** U‑Net regression model for AFM background estimation
43
  - **License:** BSD‑3‑Clause
44
  - **Finetuned from model:** None (trained from scratch)
@@ -46,31 +58,29 @@ The afMLevel repository includes tools for:
46
  ### Model Sources
47
 
48
  - **Repository:** https://github.com/mayatek1/afMLevel
49
- - **Paper:** In preparation
50
  - **Demo:** [Demonstration notebooks](https://github.com/mayatek1/afMLevel/tree/main/notebooks)
51
 
52
  ## Uses
53
 
54
- This model is designed for used within the [afMLevel](https://github.com/mayatek1/afMLevel/) `background_model` module.
55
 
56
  ### Direct Use
57
 
58
- The [afMLevel](https://github.com/mayatek1/afMLevel/) model aplication package operates on **NumPy arrays**, so raw AFM files must first be loaded using an external reader such as [playnano](https://github.com/derollins/playNano), [AFMReader](https://github.com/AFM-SPM/AFMReader), or a custom loader. Once loaded, afMLevel handles inference and outputs either the predicted background or the final levelled image.
59
 
60
- The model has been primarily tested on **biological AFM data**. It may generalise to other sample types with similar imaging characteristics.
61
 
62
  ### Downstream Use
63
 
64
- - Integration into [playnano](https://github.com/derollins/playNano), enabling endtoend reading and levelling of **highspeed AFM videos**,
65
- - The `afMLevel` package works as plugin for `playnano` for easy integration,
66
- - Preprocessing for segmentation, particle detection, or other AFM analysis tools.
67
 
68
  ### Out‑of‑Scope Use
69
 
70
  This model is **not** intended for:
71
 
72
  - prediction of physical or mechanical properties,
73
- - denoising heavily corrupted AFM scans outside the training distrution,
74
  - interpretation of AFM contact mechanics,
75
  - specialised AFM modes (KPFM, MFM, FMM, etc.) without validation,
76
  - non‑biological samples without performance verification.
@@ -79,32 +89,37 @@ This model is **not** intended for:
79
 
80
  - The model was trained on a specific dataset of real AFM height maps; performance may degrade for very different imaging modes, scan sizes, or materials.
81
  - Extremely noisy scans or those containing jump‑to‑contact instabilities may produce inaccurate background predictions.
82
- - Users should visually inspect levelled outputs before scientific interpretation.
 
 
83
 
84
  ### Recommendations
85
 
86
- - Manually verify a subset of levelled images.
87
- - Avoid applying the model to imaging modes it was not trained on.
88
 
89
  ## How to Get Started with the Model
90
 
91
- Use the model through the [afMLevel](https://github.com/mayatek1/afMLevel) repository, which handles background prediction, subtraction, and output generation. Demonstration notebooks are available
92
- [within the GitHUb repository](https://github.com/mayatek1/afMLevel/tree/main/notebooks).
 
 
 
93
 
94
  ## Training Details
95
 
96
- The model was trained from scratch on real AFM topography data using PyTorch.
97
 
98
  ### Training Data
99
 
100
- This model was trained on a **non‑public dataset of 2,001 real AFM height‑map images**.
 
101
  To increase dataset size and improve generalization, images were augmented using:
102
 
103
  - reflection along the y‑axis,
104
  - rotation by 180°.
105
 
106
- This produced **6,003 training images**.
107
- A **60:40 train‑validation split** was used.
108
 
109
  ### Training Procedure
110
 
@@ -113,56 +128,71 @@ A **60:40 train‑validation split** was used.
113
  - **Optimizer:** Adam
114
  - **Learning rate:** 0.0005
115
  - **Objective:** pixel‑wise continuous regression
 
 
 
 
116
  - **Hardware:** trained with GPU acceleration
117
- - Loss curves were monitored to assess convergence.
 
 
 
118
 
119
  #### Preprocessing
120
 
121
- [More Information Needed]
 
 
 
 
 
 
122
 
123
  #### Training Hyperparameters
124
 
125
- - **Training regime:** [More Information Needed]
126
 
127
  #### Speeds, Sizes, Times
128
 
129
- [More Information Needed]
 
130
 
131
  ## Evaluation
132
 
133
- The performance of the background model was evaluated indirectly through its impact on automated levelling. The main metric used was **Mean Squared Error (MSE)** between the auto‑levelled output and manually levelled ground‑truth images. Visual inspection was also carried out by the developers. Full evaluation results will be provided in the accompanying paper (in preparation).
 
 
 
 
134
 
135
- ### Testing Data
 
 
136
 
137
  Evaluation was performed on a held‑out set of real AFM height maps spanning a wide range of:
138
 
139
- - biological sale types,
140
  - imaging conditions,
141
  - noise levels,
142
  - numbers of surface planes,
143
  - scan artefacts (e.g., streaks, line noise).
144
 
145
- *A dataset link will be added when appropriate.*
146
-
147
- ### Metrics
148
 
149
- - **Primary metric:** MSE between auto‑levelled and manually levelled images
150
- - **Distribution analysis:** comparing mean vs median MSE
151
- - **Success‑rate metric:** proportion of images with MSE < 0.1 (empirical well‑levelled threshold)
 
152
 
153
  ### Results
154
 
155
- Initial internal testing indicates that the background model supports reliable automated levelling across a broad range of AFM images. Full quantitative and statistical analyses will be included in the companion paper (in preparation).
 
 
156
 
157
  ## Citation
158
 
159
- Paper in praration
160
-
161
- **BibTeX:**
162
- [More Inrmation Needed]
163
-
164
- **APA:**
165
- [More Information Needed]
166
 
167
  ## Model Card Authors
168
 
@@ -172,6 +202,8 @@ Paper in praration
172
 
173
  ## Contact
174
 
175
- For questions or issues, please contact:
176
- **George R. Heath, University of Leeds**
 
 
177
  Email: G.R.Heath@leeds.ac.uk
 
15
  ---
16
  # afMLevel-background-unet
17
 
18
+ This U‑Net model predicts tilt, z-scanner drift, and other large‑scale imaging artifacts present in Atomic Force Microscopy (AFM) height maps.
19
  It outputs a **background** image, the same size and scale as the raw AFM image, which can be subtracted (via the accompanying [afMLevel](https://github.com/mayatek1/afMLevel)
20
  code) to produce a levelled height map.
21
 
22
+ > **Note:** afMLevel includes a second model, **afMLevel-mask-unet**,
23
+ > which predicts a feature map instead of generating the noise background directly. The feature map supports traditional
24
+ > auto-levelling routines, in which line and plane fits are applied only to the background
25
+ > (unmasked, featureless regions). Both models will be described in the accompanying paper; the mask model is available
26
+ > here: [https://huggingface.co/Heath-AFM-Lab/afMLevel-mask-unet](https://huggingface.co/Heath-AFM-Lab/afMLevel-mask-unet)
27
+
28
  ## Model Details
29
 
30
  This model is part of the [afMLevel](https://github.com/mayatek1/afMLevel) project.
 
35
 
36
  The afMLevel repository includes tools for:
37
 
38
+ - image preprocessing and tiling,
39
  - running inference,
40
+ - generating noise background,
41
+ - noise background subtraction,
42
+ - integrating the model into batch-processing pipelines for HS-AFM videos and image sets.
43
+
44
+ ### Model Card Information
45
+
46
  - **Developed by:** Maya Tekchandani
47
  - **Maintained by:** Dr Daniel E. Rollins, Dr George R. Heath
48
  - **Principal Investigator:** Dr George R. Heath
49
+ - **Affiliation:** Department of Physics and Astronomy, University of Leeds, UK
50
+ - **Funded by:**
51
+ - Maya Tekchandani is supported by a studentship funded by the Engineering and Physical Sciences Research Council and the Biotechnology and Biological Sciences Research Council.
52
+ - Dr Daniel E. Rollins and Dr George R. Heath are funded by Engineering and Physical Sciences Research Council grant EP/W034735/1.
53
+ - **Shared by:** [Heath-AFM-Lab](https://heath-afm-lab.github.io/)
54
  - **Model type:** U‑Net regression model for AFM background estimation
55
  - **License:** BSD‑3‑Clause
56
  - **Finetuned from model:** None (trained from scratch)
 
58
  ### Model Sources
59
 
60
  - **Repository:** https://github.com/mayatek1/afMLevel
61
+ - **Paper:** Tekchandani et al., *AFMLevel: Deep learning U-Net Models for levelling Atomic Force Microscopy Images and Movies* (in preparation, 2026)
62
  - **Demo:** [Demonstration notebooks](https://github.com/mayatek1/afMLevel/tree/main/notebooks)
63
 
64
  ## Uses
65
 
66
+ This model is designed for use within the [afMLevel](https://github.com/mayatek1/afMLevel/) `background_model` module.
67
 
68
  ### Direct Use
69
 
70
+ The [afMLevel](https://github.com/mayatek1/afMLevel/) inference code operates on **NumPy arrays**, so raw AFM files must first be loaded using an external reader such as [playnano](https://github.com/derollins/playNano), [AFMReader](https://github.com/AFM-SPM/AFMReader), or a custom loader. Once loaded, the afMLevel package and notebooks handle inference and directly output either the predicted background or the levelled image (the raw image with the predicted background subtracted).
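The subtraction step described above can be sketched with plain NumPy. This is an illustration only: `level_with_background` and the `predict_background` callable are hypothetical names standing in for the afMLevel inference step, not the package's actual API, and the test image is synthetic.

```python
import numpy as np


def level_with_background(raw, predict_background):
    """Subtract a predicted background from a raw AFM height map.

    `predict_background` is a hypothetical callable standing in for the
    afMLevel inference step; it is NOT the package's real API.
    """
    raw = np.asarray(raw, dtype=np.float64)
    background = np.asarray(predict_background(raw), dtype=np.float64)
    if background.shape != raw.shape:
        raise ValueError("background must have the same shape as the raw image")
    return raw - background, background


# Synthetic example: a flat surface with one raised feature, plus a linear tilt.
yy, xx = np.mgrid[0:64, 0:64]
tilt = 0.05 * xx + 0.02 * yy
surface = np.zeros((64, 64))
surface[20:30, 20:30] = 1.0
raw = surface + tilt

# Stand-in predictor that simply returns the known tilt.
levelled, background = level_with_background(raw, lambda img: tilt)
```

With a perfect background estimate, `levelled` recovers the flat surface and its feature; with the real model, the predicted background should be inspected before the levelled image is used.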
71
 
72
+ The model has been primarily tested on **biological AFM data** (membranes, proteins, DNA origami, lattices, fibres) and is best suited to that context, though it may generalise to other sample types with similar imaging characteristics.
73
 
74
  ### Downstream Use
75
 
76
+ - Integration into [playnano](https://github.com/derollins/playNano), enabling end-to-end reading and levelling of **high-speed AFM (HS-AFM) movies**; the `afMLevel` package works as a plugin for `playnano` for easy integration (use processing step `level_ml_bg`).
 
 
77
 
78
  ### Out‑of‑Scope Use
79
 
80
  This model is **not** intended for:
81
 
82
  - prediction of physical or mechanical properties,
83
+ - denoising heavily corrupted AFM scans outside the training distribution,
84
  - interpretation of AFM contact mechanics,
85
  - specialised AFM modes (KPFM, MFM, FMM, etc.) without validation,
86
  - non‑biological samples without performance verification.
 
89
 
90
  - The model was trained on a specific dataset of real AFM height maps; performance may degrade for very different imaging modes, scan sizes, or materials.
91
  - Extremely noisy scans or those containing jump‑to‑contact instabilities may produce inaccurate background predictions.
92
+ - The model may occasionally identify horizontal sample features as part of the background, causing them to be subtracted from the levelled image.
93
+ - Users should visually inspect a subset of the predicted backgrounds to confirm that they contain no sample features, since any such features alter local pixel heights in the levelled image.
94
+ - The levelled outputs should also be inspected visually before scientific interpretation.
95
 
96
  ### Recommendations
97
 
98
+ - Manually verify a subset of predicted backgrounds and levelled images.
99
+ - Avoid applying the model to imaging modes it was not trained on without validation.
100
 
101
  ## How to Get Started with the Model
102
 
103
+ Use the model through the [afMLevel](https://github.com/mayatek1/afMLevel) repository, which handles inference, noise background generation and levelling. Demonstration notebooks are available [within the GitHub repository](https://github.com/mayatek1/afMLevel/tree/main/notebooks).
104
+
105
+ ## Inference Speed
106
+
107
+ Per-frame processing time scales with image resolution in discrete steps rather than continuously: processing time is dominated by the inference step, and the number of inference steps increases as the image resolution crosses 256-pixel thresholds and additional tiles are generated. For a range of resolutions that map to the same number of tiles, per-frame processing time remains approximately constant, producing plateaus. The background model is slower than classical non-ML levelling routines but is not prohibitive for research pipelines at typical AFM and HS-AFM resolutions. Full timing benchmarks are provided in the accompanying paper (in preparation, check the [GitHub](https://github.com/mayatek1/afMLevel) for updates).
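The plateau behaviour can be illustrated with a small sketch of how the tile count might grow with resolution. The formula is an assumption made for illustration (one tile per 256-pixel step along each axis); `n_tiles` is a hypothetical helper, and the actual tiling logic in afMLevel may differ, particularly for images smaller than 256 pixels.

```python
import math

TILE = 256  # tile edge length stated in the model card


def n_tiles(height, width, tile=TILE):
    """Assumed tile count: one tile per 256-pixel step along each axis."""
    return math.ceil(height / tile) * math.ceil(width / tile)


for size in (128, 256, 257, 400, 512, 513):
    print(size, n_tiles(size, size))
# 128 1
# 256 1
# 257 4
# 400 4
# 512 4
# 513 9
```

Under this assumption, every square image from 257 to 512 pixels maps to four tiles, which is consistent with roughly constant per-frame time across that range.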
108
 
109
  ## Training Details
110
 
111
+ The model was trained from scratch on real AFM topography data using the PyTorch framework.
112
 
113
  ### Training Data
114
 
115
+ This model was trained on a **dataset of 2,001 real AFM height‑map images** from several AFM labs, spanning a range of biological sample types (membranes, proteins, DNA origami, lattices, fibres), AFM instruments (JPK, Bruker/RIBM, Asylum), and image features (blobs, holes, fibres, strong line noise, multiple planes).
116
+
117
  To increase dataset size and improve generalization, images were augmented using:
118
 
119
  - reflection along the y‑axis,
120
  - rotation by 180°.
121
 
122
+ This produced **6,003 training images**. An **80:20 train‑validation split** was used.
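The two augmentations and the resulting image count can be sketched as follows. The reading of "reflection along the y‑axis" as a left-right mirror (`np.fliplr`) is an assumption, and `augment` is a hypothetical helper rather than the training code.

```python
import numpy as np


def augment(image):
    """Return the original image plus its two augmented copies."""
    image = np.asarray(image)
    mirrored = np.fliplr(image)   # assumption: mirror about the vertical (y) axis
    rotated = np.rot90(image, 2)  # rotation by 180 degrees
    return [image, mirrored, rotated]


n_original = 2001
copies_per_image = len(augment(np.zeros((4, 4))))
n_augmented = n_original * copies_per_image  # 2001 * 3 = 6003
```

Each original image contributes itself and two transformed copies, which accounts for the 6,003 images quoted above.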
 
123
 
124
  ### Training Procedure
125
 
 
128
  - **Optimizer:** Adam
129
  - **Learning rate:** 0.0005
130
  - **Objective:** pixel‑wise continuous regression
131
+ - **Activation function:** ReLU
132
+ - **Batch normalisation:** applied after each convolutional layer
133
+ - **Dropout:** none (p = 0.0)
134
+ - **Batch size:** 32
135
  - **Hardware:** trained with GPU acceleration
136
+ - **Training images:** 6,003
137
+ - **Train/validation split:** 80:20 (random)
138
+ - **Loss function:** Mean Squared Error (MSE)
139
+ - Loss‑curve diagnostics were used to monitor convergence.
140
 
141
  #### Preprocessing
142
 
143
+ Input images are preprocessed in the same way as the training data:
144
+
145
+ 1. An initial 1st-order plane fit is applied in x and y.
146
+ 2. Min-max normalisation to [0, 1].
147
+ 3. Images larger than 256 × 256 are pixel-split into 256 × 256 tiles for inference.
148
+
149
+ For the **background model**, tiling uses a pixel-split method that preserves all pixel values: multiple 256 × 256 images are generated by taking alternating pixels.
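The preprocessing steps above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the afMLevel implementation: the plane fit is interpreted as a single least-squares plane in x and y, the pixel-split stride is assumed to be the image size divided by 256, and all function names are hypothetical.

```python
import numpy as np

TILE = 256  # tile edge length stated in the model card


def plane_level(image):
    """Subtract a least-squares first-order plane a*x + b*y + c."""
    image = np.asarray(image, dtype=np.float64)
    h, w = image.shape
    yy, xx = np.mgrid[0:h, 0:w]
    design = np.column_stack([xx.ravel(), yy.ravel(), np.ones(h * w)])
    coeffs, *_ = np.linalg.lstsq(design, image.ravel(), rcond=None)
    return image - (design @ coeffs).reshape(h, w)


def min_max_normalise(image):
    """Rescale values to [0, 1]; a constant image maps to zeros."""
    lo, hi = image.min(), image.max()
    if hi == lo:
        return np.zeros_like(image)
    return (image - lo) / (hi - lo)


def pixel_split(image, step):
    """Split into step*step sub-images by taking every `step`-th pixel."""
    return [image[i::step, j::step] for i in range(step) for j in range(step)]


def pixel_merge(tiles, step):
    """Inverse of pixel_split for images whose sides are divisible by `step`."""
    h, w = tiles[0].shape
    out = np.empty((h * step, w * step), dtype=tiles[0].dtype)
    for k, tile in enumerate(tiles):
        i, j = divmod(k, step)
        out[i::step, j::step] = tile
    return out


# Synthetic 512 x 512 image: linear tilt plus noise.
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:512, 0:512]
raw = 0.01 * xx - 0.02 * yy + rng.normal(scale=0.1, size=(512, 512))

prepared = min_max_normalise(plane_level(raw))
tiles = pixel_split(prepared, step=512 // TILE)  # four 256 x 256 tiles
restored = pixel_merge(tiles, step=512 // TILE)
```

Because each tile takes alternating pixels instead of a cropped region, every tile spans the full field of view at reduced sampling, and interleaving the tiles restores the original array exactly.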
150
 
151
  #### Training Hyperparameters
152
 
153
+ - **Training regime:** fp32
154
 
155
  #### Speeds, Sizes, Times
156
 
157
+ - **Model file size:** 982 MiB
158
+ - **Training epochs:** 59
159
 
160
  ## Evaluation
161
 
162
+ The performance of the ML‑generated predicted backgrounds was evaluated indirectly through their impact on levelling.
163
+ The core quantitative metric used for assessment was the **Mean Squared Error** (MSE) between the afMLevel predicted-background-subtracted
164
+ output and a manually levelled ground‑truth image. The results were also assessed visually by the developers.
165
+
166
+ Full details will be given in the accompanying paper (in preparation; check the [GitHub](https://github.com/mayatek1/afMLevel) for updates).
167
 
168
+ ### Testing Data & Metrics
169
+
170
+ #### Testing Data
171
 
172
  Evaluation was performed on a held‑out set of real AFM height maps spanning a wide range of:
173
 
174
+ - biological sample types,
175
  - imaging conditions,
176
  - noise levels,
177
  - numbers of surface planes,
178
  - scan artefacts (e.g., streaks, line noise).
179
 
180
+ #### Metrics
 
 
181
 
182
+ - **Primary metric:** MSE between auto‑levelled and manually levelled images.
183
+ - **Distribution analysis:** mean vs. median MSE; a large difference between the two indicates that failed levelling produces pronounced artefacts.
184
+ - **Success‑rate:** proportion of images below an MSE threshold of 0.1, selected as a conservative boundary for "well‑levelled" outputs.
185
+ - **Visual inspection score:** percentage of images judged well-levelled by developer inspection, used as a complementary subjective metric.
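The quantitative metrics listed above can be computed along these lines. This is a sketch with made-up example scores, not the evaluation code or results from the paper; `mse` and `summarise` are hypothetical helper names, and only the 0.1 threshold comes from the model card.

```python
import numpy as np

MSE_THRESHOLD = 0.1  # "well-levelled" threshold stated in the model card


def mse(levelled, reference):
    """Mean squared error between a levelled image and its ground truth."""
    levelled = np.asarray(levelled, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    return float(np.mean((levelled - reference) ** 2))


def summarise(mse_values, threshold=MSE_THRESHOLD):
    """Mean, median, and success rate over per-image MSE values."""
    values = np.asarray(mse_values, dtype=np.float64)
    return {
        "mean": float(values.mean()),
        "median": float(np.median(values)),
        "success_rate": float(np.mean(values < threshold)),
    }


# Illustrative (invented) scores: four good results and one failure.
scores = [0.01, 0.02, 0.03, 0.04, 2.0]
summary = summarise(scores)
```

In this invented example the single failure pulls the mean (0.42) far above the median (0.03), which is the kind of gap the distribution analysis is meant to reveal, while the success rate is 0.8.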
186
 
187
  ### Results
188
 
189
+ Initial internal testing indicates that the ML‑generated predicted backgrounds enable reliable automated levelling across a broad range of AFM images,
190
+ including scans with varied noise levels and multiple height planes. Quantitative results and statistical analyses will be provided in the accompanying
191
+ paper (in preparation, check the [GitHub](https://github.com/mayatek1/afMLevel) for updates).
192
 
193
  ## Citation
194
 
195
+ Tekchandani et al., *AFMLevel: Deep learning U-Net Models for levelling Atomic Force Microscopy Images and Movies* (in preparation, 2026)
 
 
 
 
 
 
196
 
197
  ## Model Card Authors
198
 
 
202
 
203
  ## Contact
204
 
205
+ For questions or issues, please open a GitHub issue at https://github.com/mayatek1/afMLevel or contact:
206
+
207
+ **George R. Heath - University of Leeds**
208
+
209
  Email: G.R.Heath@leeds.ac.uk