@@ -15,10 +15,16 @@ tags:
 ---
 # afMLevel-background-unet
-This U‑Net model predicts tilt, scanner drift, and other large‑scale imaging artifacts present in Atomic Force Microscopy (AFM) height maps.
 It outputs a **background** image, the same size and scale as the raw AFM image, which can be subtracted (via the accompanying [afMLevel](https://github.com/mayatek1/afMLevel)
 code) to produce a levelled height map.
 ## Model Details
 This model is part of the [afMLevel](https://github.com/mayatek1/afMLevel) project.
@@ -29,16 +35,22 @@ This model is a 7‑layer **U‑Net** architecture implemented in **PyTorch**, t
 The afMLevel repository includes tools for:
 - running inference,
-- subtracting the predicted background,
-- integrating the model into AFM anasis workflows.
----
 - **Developed by:** Maya Tekchandani
 - **Maintained by:** Dr Daniel E. Rollins, Dr George R. Heath
 - **Principal Investigator:** Dr George R. Heath
-- **Affiliation:** University of Leeds
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by:** [Heath-AFMab](https://heath-afm-lab.github.io/)
 - **Model type:** U‑Net regression model for AFM background estimation
 - **License:** BSD‑3‑Clause
 - **Finetuned from model:** None (trained from scratch)
@@ -46,31 +58,29 @@ The afMLevel repository includes tools for:
 ### Model Sources
 - **Repository:** https://github.com/mayatek1/afMLevel
-- **Paper:** In preparation
 - **Demo:** [Demonstration notebooks](https://github.com/mayatek1/afMLevel/tree/main/notebooks)
 ## Uses
-This model is designed for used within the [afMLevel](https://github.com/mayatek1/afMLevel/) `background_model` module.
 ### Direct Use
-The [afMLevel](https://github.com/mayatek1/afMLevel/) model aplication package operates on **NumPy arrays**, so raw AFM files must first be loaded using an external reader such as [playnano](https://github.com/derollins/playNano), [AFMReader](https://github.com/AFM-SPM/AFMReader), or a custom loader. Once loaded, afMLevel handles inference and outputs either the predicted background or the final levelled image.
-The model has been primarily tested on **biological AFM data**. It may generalise to other sample types with similar imaging characteristics.
 ### Downstream Use
-- Integration into [playnano](https://github.com/derollins/playNano), enabling end‑to‑end reading and levelling of **high‑speed AFM videos**,
-  - The `afMLevel` package works as plugin for `playnano` for easy integration,
-- Preprocessing for segmentation, particle detection, or other AFM analysis tools.
 ### Out‑of‑Scope Use
 This model is **not** intended for:
 - prediction of physical or mechanical properties,
-- denoising heavily corrupted AFM scans outside the training distrution,
 - interpretation of AFM contact mechanics,
 - specialised AFM modes (KPFM, MFM, FMM, etc.) without validation,
 - non‑biological samples without performance verification.
@@ -79,32 +89,37 @@ This model is **not** intended for:
 - The model was trained on a specific dataset of real AFM height maps; performance may degrade for very different imaging modes, scan sizes, or materials.
 - Extremely noisy scans or those containing jump‑to‑contact instabilities may produce inaccurate background predictions.
-- Users should visually inspect levelled outputs before scientific interpretation.
 ### Recommendations
-- Manually verify a subset of levelled images.
-- Avoid applying the model to imaging modes it was not trained on.
 ## How to Get Started with the Model
-Use the model through the [afMLevel](https://github.com/mayatek1/afMLevel) repository, which handles background prediction, subtraction, and output generation. Demonstration notebooks are available
-[within the GitHUb repository](https://github.com/mayatek1/afMLevel/tree/main/notebooks).
 ## Training Details
-The model was trained from scratch on real AFM topography data using PyTorch.
 ### Training Data
-This model was trained on a **non‑public dataset of 2,001 real AFM height‑map images**.
 To increase dataset size and improve generalization, images were augmented using:
 - reflection along the y‑axis,
 - rotation by 180°.
-This produced **6,003 training images**.
-A **60:40 train‑validation split** was used.
 ### Training Procedure
@@ -113,56 +128,71 @@ A **60:40 train‑validation split** was used.
 - **Optimizer:** Adam
 - **Learning rate:** 0.0005
 - **Objective:** pixel‑wise continuous regression
 - **Hardware:** trained with GPU acceleration
-- Loss curves were monitored to assess convergence.
 #### Preprocessing
-[More Information Needed]
 #### Training Hyperparameters
-- **Training regime:** [More Information Needed]
 #### Speeds, Sizes, Times
-[More Information Needed]
 ## Evaluation
-The performance of the background model was evaluated indirectly through its impact on automated levelling. The main metric used was **Mean Squared Error (MSE)** between the auto‑levelled output and manually levelled ground‑truth images. Visual inspection was also carried out by the developers. Full evaluation results will be provided in the accompanying paper (in preparation).
-### Testing Data
 Evaluation was performed on a held‑out set of real AFM height maps spanning a wide range of:
-- biological sale types,
 - imaging conditions,
 - noise levels,
 - numbers of surface planes,
 - scan artefacts (e.g., streaks, line noise).
-*A dataset link will be added when appropriate.*
-### Metrics
-- **Primary metric:** MSE between auto‑levelled and manually levelled images
-- **Distribution analysis:** comparing mean vs median MSE
-- **Success‑rate metric:** proportion of images with MSE < 0.1 (empirical “well‑levelled” threshold)
 ### Results
-Initial internal testing indicates that the background model supports reliable automated levelling across a broad range of AFM images. Full quantitative and statistical analyses will be included in the companion paper (in preparation).
 ## Citation
-Paper in praration
-**BibTeX:**
-[More Inrmation Needed]
-**APA:**
-[More Information Needed]
 ## Model Card Authors
@@ -172,6 +202,8 @@ Paper in praration
 ## Contact
-For questions or issues, please contact:
-**George R. Heath, University of Leeds**
 Email: G.R.Heath@leeds.ac.uk

 ---
 # afMLevel-background-unet
+This U‑Net model predicts tilt, z scanner drift, and other large‑scale imaging artifacts present in Atomic Force Microscopy (AFM) height maps.
 It outputs a **background** image, the same size and scale as the raw AFM image, which can be subtracted (via the accompanying [afMLevel](https://github.com/mayatek1/afMLevel)
 code) to produce a levelled height map.
+> **Note:** afMLevel includes a second model, **afMLevel-mask-unet**, which predicts a feature map
+> rather than generating the noise background directly. The feature map can be used in traditional
+> auto-levelling routines, where line and plane fits are applied only to the background (unmasked,
+> featureless regions). Both models will be described in the accompanying paper; the mask model is available
+> here: [https://huggingface.co/Heath-AFM-Lab/afMLevel-mask-unet](https://huggingface.co/Heath-AFM-Lab/afMLevel-mask-unet)
 ## Model Details
 This model is part of the [afMLevel](https://github.com/mayatek1/afMLevel) project.
 The afMLevel repository includes tools for:
+- image preprocessing and tiling,
 - running inference,
+- generating noise background,
+- noise background subtraction,
+- integrating the model into batch-processing pipelines for HS-AFM videos and image sets.
+### Model Card Information
 - **Developed by:** Maya Tekchandani
 - **Maintained by:** Dr Daniel E. Rollins, Dr George R. Heath
 - **Principal Investigator:** Dr George R. Heath
+- **Affiliation:** Department of Physics and Astronomy, University of Leeds, UK
+- **Funded by:**
+  - Maya Tekchandani is supported by a studentship funded by the Engineering and Physical Sciences Research Council and the Biotechnology and Biological Sciences Research Council.
+  - Dr Daniel E. Rollins and Dr George R. Heath are funded by Engineering and Physical Sciences Research Council grant EP/W034735/1.
+- **Shared by:** [Heath-AFM-Lab](https://heath-afm-lab.github.io/)
 - **Model type:** U‑Net regression model for AFM background estimation
 - **License:** BSD‑3‑Clause
 - **Finetuned from model:** None (trained from scratch)
 ### Model Sources
 - **Repository:** https://github.com/mayatek1/afMLevel
+- **Paper:** Tekchandani et al., *AFMLevel: Deep learning U-Net Models for levelling Atomic Force Microscopy Images and Movies* (in preparation, 2026)
 - **Demo:** [Demonstration notebooks](https://github.com/mayatek1/afMLevel/tree/main/notebooks)
 ## Uses
+This model is designed for use within the [afMLevel](https://github.com/mayatek1/afMLevel/) `background_model` module.
 ### Direct Use
+The [afMLevel](https://github.com/mayatek1/afMLevel/) inference code operates on **NumPy arrays**, so raw AFM files must first be loaded using an external reader such as [playnano](https://github.com/derollins/playNano), [AFMReader](https://github.com/AFM-SPM/AFMReader), or a custom loader. Once loaded, the afMLevel package and notebooks handle inference and can output either the predicted background or the levelled image (the raw image with the predicted background subtracted).
+The model has been primarily tested on **biological AFM data** (membranes, proteins, DNA origami, lattices, fibres) and is best suited to that context, though it may generalise to other sample types with similar imaging characteristics.
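The subtraction step described above can be sketched in NumPy. This is a minimal illustration, not the afMLevel API: `subtract_background` and `row_median_background` are hypothetical names, and the row-median function is a classical stand-in used only so the example runs without the trained U-Net.

```python
import numpy as np


def subtract_background(raw, predict_background):
    """Return (levelled, background) for a 2-D height map.

    `predict_background` is any callable mapping an image to a background
    of the same shape. It is a hypothetical placeholder for model inference.
    """
    raw = np.asarray(raw, dtype=float)
    background = np.asarray(predict_background(raw), dtype=float)
    if background.shape != raw.shape:
        raise ValueError("background must have the same shape as the raw image")
    return raw - background, background


def row_median_background(img):
    """Classical stand-in, NOT the U-Net: each row's median, repeated across the row."""
    return np.repeat(np.median(img, axis=1, keepdims=True), img.shape[1], axis=1)


if __name__ == "__main__":
    raw = np.array([[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]])
    levelled, background = subtract_background(raw, row_median_background)
    print(background)
    print(levelled)
```

Passing the predictor in as a callable keeps the sketch independent of how the real model is loaded or called.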
 ### Downstream Use
+- Integration into [playnano](https://github.com/derollins/playNano), enabling end-to-end reading and levelling of **high-speed AFM (HS-AFM) movies**; the `afMLevel` package works as a plugin for `playnano` for easy integration (use processing step `level_ml_bg`).
 ### Out‑of‑Scope Use
 This model is **not** intended for:
 - prediction of physical or mechanical properties,
+- denoising heavily corrupted AFM scans outside the training distribution,
 - interpretation of AFM contact mechanics,
 - specialised AFM modes (KPFM, MFM, FMM, etc.) without validation,
 - non‑biological samples without performance verification.
 - The model was trained on a specific dataset of real AFM height maps; performance may degrade for very different imaging modes, scan sizes, or materials.
 - Extremely noisy scans or those containing jump‑to‑contact instabilities may produce inaccurate background predictions.
+- The model may occasionally identify horizontal sample features as part of the background, causing them to be subtracted from the levelled image and altering local pixel height values.
+- Users should visually inspect a subset of the predicted backgrounds to confirm that they contain no sample features.
+- The levelled outputs should also be inspected visually before scientific interpretation.
 ### Recommendations
+- Manually verify a subset of predicted backgrounds and levelled images.
+- Avoid applying the model to imaging modes it was not trained on without validation.
 ## How to Get Started with the Model
+Use the model through the [afMLevel](https://github.com/mayatek1/afMLevel) repository, which handles inference, noise background generation and levelling. Demonstration notebooks are available [within the GitHub repository](https://github.com/mayatek1/afMLevel/tree/main/notebooks).
+## Inference Speed
+Per-frame processing time scales with image resolution in discrete steps rather than continuously: processing time is dominated by the inference step, and the number of inference steps increases as the image resolution crosses 256-pixel thresholds and additional tiles are generated. For a range of resolutions that map to the same number of tiles, per-frame processing time remains approximately constant, producing plateaus. The background model is slower than classical non-ML levelling routines but is not prohibitive for research pipelines at typical AFM and HS-AFM resolutions. Full timing benchmarks are provided in the accompanying paper (in preparation, check the [GitHub](https://github.com/mayatek1/afMLevel) for updates).
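The plateau behaviour can be illustrated with a small tile-count estimate. This sketch assumes that pixel-split tiling uses a stride of ceil(n / 256) along each axis and produces one tile per pair of row and column offsets; that rule and the function name `estimated_tile_count` are illustrative assumptions, not taken from the afMLevel code.

```python
import math

TILE = 256  # tile edge length stated in the model card


def estimated_tile_count(height: int, width: int, tile: int = TILE) -> int:
    """Estimate how many tiles (inference passes) an image needs.

    Assumption: each axis is subsampled with stride ceil(n / tile), and one
    tile is produced per (row offset, column offset) pair.
    """
    stride_y = math.ceil(height / tile)
    stride_x = math.ceil(width / tile)
    return stride_y * stride_x


if __name__ == "__main__":
    for n in (128, 256, 257, 400, 512, 513, 768):
        print(f"{n} x {n}: {estimated_tile_count(n, n)} tile(s)")
```

Under this assumption, every square image from 257 to 512 pixels wide maps to the same tile count, which is consistent with the plateaus described above.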
 ## Training Details
+The model was trained from scratch on real AFM topography data using the PyTorch framework.
 ### Training Data
+This model was trained on a **dataset of 2,001 real AFM height‑map images** from several AFM labs, spanning a range of biological sample types (membranes, proteins, DNA origami, lattices, fibres), AFM instruments (JPK, Bruker/RIBM, Asylum), and image features (blobs, holes, fibres, strong line noise, multiple planes).
 To increase dataset size and improve generalization, images were augmented using:
 - reflection along the y‑axis,
 - rotation by 180°.
+This produced **6,003 training images**. An **80:20 train‑validation split** was used.
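The augmentation arithmetic can be checked with a short NumPy sketch. It assumes that "reflection along the y-axis" means mirroring about the vertical axis (`np.fliplr`); the function name `augment` is hypothetical and the actual training code may differ.

```python
import numpy as np


def augment(img: np.ndarray) -> list:
    """Return the original image plus its two augmented copies."""
    reflected = np.fliplr(img)  # assumption: mirror about the vertical (y) axis
    rotated = np.rot90(img, 2)  # rotation by 180 degrees
    return [img, reflected, rotated]


if __name__ == "__main__":
    demo = np.arange(6).reshape(2, 3)
    for copy in augment(demo):
        print(copy)
    # Three copies per source image: 2,001 originals give 6,003 images.
    print(2001 * len(augment(demo)))
```

For pixel-wise regression, the same transform would also need to be applied to each image's target so that input and target stay aligned.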
 ### Training Procedure
 - **Optimizer:** Adam
 - **Learning rate:** 0.0005
 - **Objective:** pixel‑wise continuous regression
+- **Activation function:** ReLU
+- **Batch normalisation:** applied after each convolutional layer
+- **Dropout:** none (p = 0.0)
+- **Batch size:** 32
 - **Hardware:** trained with GPU acceleration
+- **Training images:** 6,003
+- **Train/validation split:** 80:20 (random)
+- **Loss function:** Mean Squared Error (MSE)
+- Loss‑curve diagnostics were used to monitor convergence.
 #### Preprocessing
+Input images are preprocessed in the same way as the training data:
+1. An initial 1st-order plane fit is applied in x and y.
+2. Min-max normalisation to [0, 1].
+3. Images larger than 256 × 256 are pixel-split into 256 × 256 tiles for inference.
+For the **background model**, a pixel-split method was employed for tiling to preserve all pixel values. Multiple 256 × 256 images were generated by taking alternating pixels.
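The three preprocessing steps can be sketched in NumPy as follows. This is an illustrative reading of the description, not the afMLevel implementation: the function names are hypothetical, the plane fit is an ordinary least-squares fit, and the tiling sketch only handles image dimensions that are exact multiples of 256.

```python
import numpy as np

TILE = 256  # tile edge length stated in the model card


def plane_level(img: np.ndarray) -> np.ndarray:
    """Subtract a least-squares 1st-order plane z = a*x + b*y + c."""
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    design = np.column_stack([xx.ravel(), yy.ravel(), np.ones(h * w)])
    coeffs, *_ = np.linalg.lstsq(design, img.ravel(), rcond=None)
    return img - (design @ coeffs).reshape(h, w)


def min_max_normalise(img: np.ndarray) -> np.ndarray:
    """Rescale to [0, 1]; a constant image maps to all zeros."""
    lo, hi = float(img.min()), float(img.max())
    if hi == lo:
        return np.zeros_like(img, dtype=float)
    return (img - lo) / (hi - lo)


def pixel_split(img: np.ndarray, tile: int = TILE) -> list:
    """Split into tiles by taking every s-th pixel from each offset.

    Assumption: both dimensions are exact multiples of `tile`; how other
    sizes are handled is not specified here.
    """
    h, w = img.shape
    if h % tile or w % tile:
        raise ValueError("sketch only handles multiples of the tile size")
    sy, sx = h // tile, w // tile
    return [img[i::sy, j::sx] for i in range(sy) for j in range(sx)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    yy, xx = np.mgrid[0:512, 0:512]
    raw = 0.01 * xx - 0.02 * yy + rng.normal(size=(512, 512))
    tiles = pixel_split(min_max_normalise(plane_level(raw)))
    print(len(tiles), tiles[0].shape)
```

Because each tile takes alternating pixels rather than a cropped region, every tile covers the full field of view at reduced sampling and no pixel value is discarded.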
 #### Training Hyperparameters
+- **Training regime:** fp32
 #### Speeds, Sizes, Times
+- **Model file size:** 982 MiB
+- **Training epochs:** 59
 ## Evaluation
+The performance of the ML‑generated predicted backgrounds was evaluated indirectly through their impact on levelling.
+The core quantitative metric used for assessment was the **Mean Squared Error** (MSE) between the afMLevel predicted-background-subtracted
+output and a manually levelled ground‑truth image. The results were also assessed visually by the developers.
+Full details will be given in the accompanying paper (in preparation; check the [GitHub](https://github.com/mayatek1/afMLevel) for updates).
+### Testing Data & Metrics
+#### Testing Data
 Evaluation was performed on a held‑out set of real AFM height maps spanning a wide range of:
+- biological sample types,
 - imaging conditions,
 - noise levels,
 - numbers of surface planes,
 - scan artefacts (e.g., streaks, line noise).
+#### Metrics
+- **Primary metric:** MSE between auto‑levelled and manually levelled images.
+- **Distribution analysis:** mean vs. median MSE; a large difference between the two indicates that failed levelling produces pronounced artefacts.
+- **Success‑rate:** proportion of images below an MSE threshold of 0.1, selected as a conservative boundary for "well‑levelled" outputs.
+- **Visual inspection score:** percentage of images judged well-levelled by developer inspection, used as a complementary subjective metric.
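The quantitative metrics above can be computed along these lines. This is a minimal sketch with hypothetical function names (`image_mse`, `summarise`); it assumes the levelled and ground-truth images are on the same scale, and the example MSE values are made up purely to show the mean-versus-median effect.

```python
import numpy as np

MSE_THRESHOLD = 0.1  # "well-levelled" boundary stated in the model card


def image_mse(levelled, reference) -> float:
    """Mean squared error between two images of the same shape."""
    levelled = np.asarray(levelled, dtype=float)
    reference = np.asarray(reference, dtype=float)
    if levelled.shape != reference.shape:
        raise ValueError("images must have the same shape")
    return float(np.mean((levelled - reference) ** 2))


def summarise(mse_values, threshold: float = MSE_THRESHOLD) -> dict:
    """Mean, median, and success rate (fraction with MSE below threshold)."""
    values = np.asarray(mse_values, dtype=float)
    return {
        "mean": float(values.mean()),
        "median": float(np.median(values)),
        "success_rate": float(np.mean(values < threshold)),
    }


if __name__ == "__main__":
    # Illustrative values only: three good results and one failed levelling.
    print(summarise([0.01, 0.02, 0.03, 2.0]))
```

A single failed image pulls the mean far above the median, which is the gap the distribution analysis is designed to expose.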
 ### Results
+Initial internal testing indicates that the ML‑generated predicted backgrounds enable reliable automated levelling across a broad range of AFM images,
+including scans with varied noise levels and multiple height planes. Quantitative results and statistical analyses will be provided in the accompanying
+paper (in preparation, check the [GitHub](https://github.com/mayatek1/afMLevel) for updates).
 ## Citation
+Tekchandani et al., *AFMLevel: Deep learning U-Net Models for levelling Atomic Force Microscopy Images and Movies* (in preparation, 2026)
 ## Model Card Authors
 ## Contact
+For questions or issues, please open a GitHub issue at https://github.com/mayatek1/afMLevel or contact:
+**George R. Heath - University of Leeds**
 Email: G.R.Heath@leeds.ac.uk