| --- |
| license: bsd-3-clause |
| pipeline_tag: image-to-image |
| tags: |
| - AFM |
| - physics |
| - biology |
| - atomic-force-microscopy |
| - microscopy |
| - image-processing |
| - unet |
| - surface-analysis |
| - chemistry |
| - nanoscience |
| --- |
| # afMLevel-background-unet |
|
|
| This U‑Net model predicts tilt, z scanner drift, and other large‑scale imaging artifacts present in Atomic Force Microscopy (AFM) height maps. |
| It outputs a **background** image, the same size and scale as the raw AFM image, which can be subtracted (via the accompanying [afMLevel](https://github.com/mayatek1/afMLevel) |
| code) to produce a levelled height map. |
|
|
| > **Note:** afMLevel includes a second model, **afMLevel-mask-unet**, |
| > which predicts a feature map rather than generating the noise background directly. The map can be used in traditional |
| > auto-levelling routines, where line and plane fits are applied only to the background (unmasked, featureless) regions. |
| > Both models will be described in the accompanying paper. The mask model is available |
| > here: [https://huggingface.co/Heath-AFM-Lab/afMLevel-mask-unet](https://huggingface.co/Heath-AFM-Lab/afMLevel-mask-unet) |
|
|
| ## Model Details |
|
|
| This model is part of the [afMLevel](https://github.com/mayatek1/afMLevel) project. |
|
|
| ### Model Description |
|
|
| This model is a 7‑layer **U‑Net** architecture implemented in **PyTorch**, trained to perform image‑to‑image regression for background prediction in AFM height maps. The network was trained on **256 × 256‑pixel images** and therefore expects inputs of this size at inference time. |
|
|
| The afMLevel repository includes tools for: |
|
|
| - image preprocessing and tiling, |
| - running inference, |
| - generating noise background, |
| - noise background subtraction, |
| - integrating the model into batch-processing pipelines for HS-AFM videos and image sets. |
|
|
| ### Model Card Information |
|
|
| - **Developed by:** Maya Tekchandani |
| - **Maintained by:** Dr Daniel E. Rollins, Dr George R. Heath |
| - **Principal Investigator:** Dr George R. Heath |
| - **Affiliation:** Department of Physics and Astronomy, University of Leeds, UK |
| - **Funded by:** |
| - Maya Tekchandani is supported by a studentship funded by the Engineering and Physical Sciences Research Council and the Biotechnology and Biological Sciences Research Council. |
| - Dr Daniel E. Rollins and Dr George R. Heath are funded by Engineering and Physical Sciences Research Council grant EP/W034735/1. |
| - **Shared by:** [Heath-AFM-Lab](https://heath-afm-lab.github.io/) |
| - **Model type:** U‑Net regression model for AFM background estimation |
| - **License:** BSD‑3‑Clause |
| - **Finetuned from model:** None (trained from scratch) |
|
|
| ### Model Sources |
|
|
| - **Repository:** https://github.com/mayatek1/afMLevel |
| - **Paper:** Tekchandani et al., *AFMLevel: Deep learning U-Net Models for levelling Atomic Force Microscopy Images and Movies* (in preparation, 2026) |
| - **Demo:** [Demonstration notebooks](https://github.com/mayatek1/afMLevel/tree/main/notebooks) |
|
|
| ## Uses |
|
|
| This model is designed for use within the [afMLevel](https://github.com/mayatek1/afMLevel/) `background_model` module. |
|
|
| ### Direct Use |
|
|
| The [afMLevel](https://github.com/mayatek1/afMLevel/) inference code operates on **NumPy arrays**, so raw AFM files must first be loaded using an external reader such as [playnano](https://github.com/derollins/playNano), [AFMReader](https://github.com/AFM-SPM/AFMReader), or a custom loader. Once loaded, the afMLevel package and notebooks handle inference and output of either the predicted background or the levelled image (with the predicted background subtracted) directly. |
|
|
| The model has been primarily tested on **biological AFM data** (membranes, proteins, DNA origami, lattices, fibres) and is best suited to that context, though it may generalise to other sample types with similar imaging characteristics. |
|
|
| ### Downstream Use |
|
|
| - Integration into [playnano](https://github.com/derollins/playNano), enabling end-to-end reading and levelling of **high-speed AFM (HS-AFM) movies**; the `afMLevel` package works as a plugin for `playnano` for easy integration (use processing step `level_ml_bg`). |
|
|
| ### Out‑of‑Scope Use |
|
|
| This model is **not** intended for: |
|
|
| - prediction of physical or mechanical properties, |
| - denoising heavily corrupted AFM scans outside the training distribution, |
| - interpretation of AFM contact mechanics, |
| - specialised AFM modes (KPFM, MFM, FMM, etc.) without validation, |
| - non‑biological samples without performance verification. |
|
|
| ## Bias, Risks, and Limitations |
|
|
| - The model was trained on a specific dataset of real AFM height maps; performance may degrade for very different imaging modes, scan sizes, or materials. |
| - Extremely noisy scans or those containing jump‑to‑contact instabilities may produce inaccurate background predictions. |
| - The model may occasionally identify horizontal sample features as part of the background, causing them to be subtracted from the levelled image and altering local pixel height values. |
| - Users should visually inspect a subset of the predicted backgrounds to confirm that they contain no sample features. |
| - The levelled outputs should also be inspected visually before scientific interpretation. |
|
|
| ### Recommendations |
|
|
| - Manually verify a subset of predicted backgrounds and levelled images. |
| - Avoid applying the model to imaging modes it was not trained on without validation. |
|
|
| ## How to Get Started with the Model |
|
|
| Use the model through the [afMLevel](https://github.com/mayatek1/afMLevel) repository, which handles inference, noise background generation and levelling. Demonstration notebooks are available [within the GitHub repository](https://github.com/mayatek1/afMLevel/tree/main/notebooks). |
|
|
| ## Inference Speed |
|
|
| Per-frame processing time scales with image resolution in discrete steps rather than continuously: processing time is dominated by the inference step, and the number of inference steps increases as the image resolution crosses 256-pixel thresholds and additional tiles are generated. For a range of resolutions that map to the same number of tiles, per-frame processing time remains approximately constant, producing plateaus. The background model is slower than classical non-ML levelling routines but is not prohibitive for research pipelines at typical AFM and HS-AFM resolutions. Full timing benchmarks are provided in the accompanying paper (in preparation, check the [GitHub](https://github.com/mayatek1/afMLevel) for updates). |
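The stepwise scaling described above can be illustrated with a short sketch. It assumes that the pixel-split stride along each axis is the image size divided by 256, rounded up, and that each resulting tile needs one inference pass. The helper `n_tiles` is hypothetical and is not part of the afMLevel API; the actual tiling logic may differ.

```python
import math

TILE = 256  # tile edge length the network expects (from this model card)


def n_tiles(height: int, width: int, tile: int = TILE) -> int:
    """Hypothetical tile count: one inference pass per pixel-split tile.

    Assumption: the stride along each axis is ceil(size / tile), and a
    stride of s along both axes yields s * s tiles.
    """
    stride_y = math.ceil(height / tile)
    stride_x = math.ceil(width / tile)
    return stride_y * stride_x


# Square images of increasing resolution: the count only changes when a
# multiple of 256 is crossed, which is what produces the timing plateaus.
for side in (128, 256, 257, 400, 512, 513, 768):
    print(f"{side:4d} x {side:<4d} -> {n_tiles(side, side)} tile(s)")
```

Under this assumption every resolution from 257 to 512 pixels per side maps to the same number of tiles, so per-frame time would stay roughly flat across that range.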
|
|
| ## Training Details |
|
|
| The model was trained from scratch on real AFM topography data using the PyTorch framework. |
|
|
| ### Training Data |
|
|
| This model was trained on a **dataset of 2,001 real AFM height‑map images** from several AFM labs, spanning a range of biological sample types (membranes, proteins, DNA origami, lattices, fibres), AFM instruments (JPK, Bruker/RIBM, Asylum), and image features (blobs, holes, fibres, strong line noise, multiple planes). |
|
|
| To increase dataset size and improve generalization, images were augmented using: |
|
|
| - reflection along the y‑axis, |
| - rotation by 180°. |
|
|
| This produced **6,003 training images** (the 2,001 originals plus two augmented copies of each). An **80:20 train‑validation split** was used. |
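The two augmentations can be sketched with NumPy as below. This is a minimal illustration, not the afMLevel training code: it assumes that "reflection along the y-axis" means a left-right mirror (`np.fliplr`), and the helper name `augment` and the random placeholder images are hypothetical.

```python
import numpy as np


def augment(image: np.ndarray) -> list:
    """Return the original image plus its two augmented copies."""
    mirrored = np.fliplr(image)     # assumed meaning of "reflection along the y-axis"
    rotated = np.rot90(image, k=2)  # rotation by 180 degrees
    return [image, mirrored, rotated]


# Placeholder data standing in for real AFM height maps.
rng = np.random.default_rng(0)
originals = [rng.random((256, 256)) for _ in range(4)]

# Each original contributes three images, so the dataset triples in size.
augmented = [copy for img in originals for copy in augment(img)]
print(len(originals), "->", len(augmented))
```

Applied to 2,001 originals, the same tripling gives the 6,003 images quoted above.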
|
|
| ### Training Procedure |
|
|
| - **Architecture:** 7‑layer U‑Net with large convolutional filters (9×9) |
| - **Framework:** PyTorch |
| - **Optimizer:** Adam |
| - **Learning rate:** 0.0005 |
| - **Objective:** pixel‑wise continuous regression |
| - **Activation function:** ReLU |
| - **Batch normalisation:** applied after each convolutional layer |
| - **Dropout:** none (p = 0.0) |
| - **Batch size:** 32 |
| - **Hardware:** trained with GPU acceleration |
| - **Training images:** 6,003 |
| - **Train/validation split:** 80:20 (random) |
| - **Loss function:** Mean Squared Error (MSE) |
| - Loss‑curve diagnostics were used to monitor convergence. |
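The settings listed above map onto PyTorch roughly as follows. This is a sketch under stated assumptions, not the afMLevel implementation: the two-layer stand-in network, the channel widths, and the helper `conv_block` are hypothetical, and the 64 × 64 random tensors are used only to keep the example light (the model itself uses 256 × 256 tiles).

```python
import torch
from torch import nn


def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """9x9 convolution -> batch norm -> ReLU; padding=4 keeps the spatial size."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=9, padding=4),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


# Stand-in network: NOT the 7-layer afMLevel U-Net, just enough to run one step.
model = nn.Sequential(conv_block(1, 8), nn.Conv2d(8, 1, kernel_size=1))

optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)  # learning rate 0.0005
loss_fn = nn.MSELoss()                                     # pixel-wise regression loss

# One optimisation step on random placeholder tensors (batch size 32).
inputs = torch.rand(32, 1, 64, 64)
targets = torch.rand(32, 1, 64, 64)
optimizer.zero_grad()
loss = loss_fn(model(inputs), targets)
loss.backward()
optimizer.step()
print(f"MSE loss for this batch: {loss.item():.4f}")
```

With a 9 × 9 kernel, a padding of 4 keeps the output the same size as the input, which an image-to-image regression needs unless the network crops or resizes elsewhere.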
|
|
| #### Preprocessing |
|
|
| At inference, input images are preprocessed in the same way as the training data: |
|
|
| 1. An initial 1st-order plane fit is applied in x and y. |
| 2. Min-max normalisation to [0, 1]. |
| 3. Images larger than 256 × 256 are pixel-split into 256 × 256 tiles for inference. |
|
|
| For the **background model**, a pixel-split method was employed for tiling to preserve all pixel values. Multiple 256 × 256 images were generated by taking alternating pixels. |
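The three preprocessing steps can be sketched in NumPy as below. The function names (`subtract_plane`, `min_max`, `pixel_split`) are hypothetical and do not come from the afMLevel package. The sketch also assumes that the plane fit is a least-squares fit that is subtracted from the image, and that "taking alternating pixels" means strided slicing from each offset.

```python
import numpy as np


def subtract_plane(img: np.ndarray) -> np.ndarray:
    """Least-squares fit of z = a*x + b*y + c, then subtract the fitted plane."""
    h, w = img.shape
    y, x = np.mgrid[0:h, 0:w]
    A = np.column_stack([x.ravel(), y.ravel(), np.ones(h * w)])
    coeffs, *_ = np.linalg.lstsq(A, img.ravel(), rcond=None)
    return img - (A @ coeffs).reshape(h, w)


def min_max(img: np.ndarray) -> np.ndarray:
    """Rescale to [0, 1]; a completely flat image maps to all zeros."""
    span = img.max() - img.min()
    return (img - img.min()) / span if span > 0 else np.zeros_like(img)


def pixel_split(img: np.ndarray, stride: int) -> list:
    """Take every `stride`-th pixel from each offset, so no pixel value is discarded."""
    return [img[i::stride, j::stride] for i in range(stride) for j in range(stride)]


# Synthetic 512 x 512 tilted image as placeholder input.
yy, xx = np.mgrid[0:512, 0:512]
raw = 0.01 * xx + 0.02 * yy + np.random.default_rng(0).normal(0, 0.1, (512, 512))

tiles = pixel_split(min_max(subtract_plane(raw)), stride=2)
print(len(tiles), tiles[0].shape)
```

Unlike cropping, each pixel-split tile still spans the full field of view at reduced sampling, and the original image can be rebuilt exactly by interleaving the tiles.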
|
|
| #### Training Hyperparameters |
|
|
| - **Training regime:** fp32 |
|
|
| #### Speeds, Sizes, Times |
|
|
| - **Model file size:** 982 MiB |
| - **Training epochs:** 59 |
|
|
| ## Evaluation |
|
|
| The performance of the ML‑generated predicted backgrounds was evaluated indirectly through their impact on levelling. |
| The core quantitative metric used for assessment was the **Mean Squared Error** (MSE) between the afMLevel predicted-background-subtracted |
| output and a manually levelled ground‑truth image. The results were also assessed visually by the developers. |
|
|
| Full details will be given in the accompanying paper (in preparation, check the [GitHub](https://github.com/mayatek1/afMLevel) for updates). |
|
|
| ### Testing Data & Metrics |
|
|
| #### Testing Data |
|
|
| Evaluation was performed on a held‑out set of real AFM height maps spanning a wide range of: |
|
|
| - biological sample types, |
| - imaging conditions, |
| - noise levels, |
| - numbers of surface planes, |
| - scan artefacts (e.g., streaks, line noise). |
|
|
| #### Metrics |
|
|
| - **Primary metric:** MSE between auto‑levelled and manually levelled images. |
| - **Distribution analysis:** mean vs. median MSE; a large difference between the two indicates that failed levelling produces pronounced artefacts. |
| - **Success rate:** proportion of images below an MSE threshold of 0.1, selected as a conservative boundary for "well‑levelled" outputs. |
| - **Visual inspection score:** percentage of images judged well-levelled by developer inspection, used as a complementary subjective metric. |
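The quantitative metrics above can be computed as in the following sketch. The helper names `mse` and `summarise` are hypothetical, and the per-image scores are made-up numbers chosen only to show how one failed levelling separates the mean from the median. They are not results from the model.

```python
import numpy as np


def mse(a: np.ndarray, b: np.ndarray) -> float:
    """Pixel-wise mean squared error between two images of equal shape."""
    return float(np.mean((a - b) ** 2))


def summarise(scores, threshold: float = 0.1) -> dict:
    """Mean, median and success rate (fraction of images with MSE below threshold)."""
    scores = np.asarray(scores, dtype=float)
    return {
        "mean": float(scores.mean()),
        "median": float(np.median(scores)),
        "success_rate": float(np.mean(scores < threshold)),
    }


# Made-up per-image MSE values: four good results and one failed levelling.
example_scores = [0.01, 0.02, 0.03, 0.04, 0.90]
print(summarise(example_scores))
```

In this toy case the single failure pulls the mean to about 0.2 while the median stays at 0.03, which is the kind of gap the distribution analysis is meant to expose.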
|
|
| ### Results |
|
|
| Initial internal testing indicates that the ML‑generated predicted backgrounds enable reliable automated levelling across a broad range of AFM images, |
| including scans with varied noise levels and multiple height planes. Quantitative results and statistical analyses will be provided in the accompanying |
| paper (in preparation, check the [GitHub](https://github.com/mayatek1/afMLevel) for updates). |
|
|
| ## Citation |
|
|
| Tekchandani et al., *AFMLevel: Deep learning U-Net Models for levelling Atomic Force Microscopy Images and Movies* (in preparation, 2026) |
|
|
| ## Model Card Authors |
|
|
| - **Maya Tekchandani** |
| - **Dr Daniel E. Rollins** |
| - **Dr George R. Heath** |
|
|
| ## Contact |
|
|
| For questions or issues, please open a GitHub issue at https://github.com/mayatek1/afMLevel or contact: |
|
|
| **George R. Heath - University of Leeds** |
|
|
| Email: G.R.Heath@leeds.ac.uk |