afMLevel-background-unet
This U‑Net model predicts tilt, z scanner drift, and other large‑scale imaging artifacts present in Atomic Force Microscopy (AFM) height maps. It outputs a background image, the same size and scale as the raw AFM image, which can be subtracted (via the accompanying afMLevel code) to produce a levelled height map.
Note: afMLevel includes a second model, afMLevel-mask-unet. Rather than directly generating the noise background, it predicts a feature map for use in traditional auto-levelling routines, where line and plane fits are applied to the background (the unmasked, featureless regions). Both models will be described in the accompanying paper; the mask model is available at https://huggingface.co/Heath-AFM-Lab/afMLevel-mask-unet
Model Details
This model is part of the afMLevel project.
Model Description
This model is a 7‑layer U‑Net architecture implemented in PyTorch, trained to perform image‑to‑image regression for background prediction in AFM height maps. The network was trained on 256 × 256‑pixel images and therefore expects inputs of this size at inference time.
The afMLevel repository includes tools for:
- image preprocessing and tiling,
- running inference,
- generating the noise background,
- noise background subtraction,
- integrating the model into batch-processing pipelines for HS-AFM videos and image sets.
Model Card Information
- Developed by: Maya Tekchandani
- Maintained by: Dr Daniel E. Rollins, Dr George R. Heath
- Principal Investigator: Dr George R. Heath
- Affiliation: Department of Physics and Astronomy, University of Leeds, UK
- Funded by:
- Maya Tekchandani is supported by a studentship funded by the Engineering and Physical Sciences Research Council and the Biotechnology and Biological Sciences Research Council.
- Dr Daniel E. Rollins and Dr George R. Heath are funded by Engineering and Physical Sciences Research Council grant EP/W034735/1.
- Shared by: Heath-AFM-Lab
- Model type: U‑Net regression model for AFM background estimation
- License: BSD‑3‑Clause
- Finetuned from model: None (trained from scratch)
Model Sources
- Repository: https://github.com/mayatek1/afMLevel
- Paper: Tekchandani et al., AFMLevel: Deep learning U-Net Models for levelling Atomic Force Microscopy Images and Movies (in preparation, 2026)
- Demo: demonstration notebooks in the GitHub repository
Uses
This model is designed for use within the afMLevel `background_model` module.
Direct Use
The afMLevel inference code operates on NumPy arrays, so raw AFM files must first be loaded using an external reader such as playnano, AFMReader, or a custom loader. Once loaded, the afMLevel package and notebooks handle inference and output of either the predicted background or the levelled image (with the predicted background subtracted) directly.
The model has been primarily tested on biological AFM data (membranes, proteins, DNA origami, lattices, fibres) and is best suited to that context, though it may generalise to other sample types with similar imaging characteristics.
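As a minimal illustration of this workflow, the sketch below loads a pre-exported height map with NumPy; the file name and the `predict_background` entry point are hypothetical stand-ins, and the afMLevel notebooks show the supported loading paths.

```python
# Minimal sketch: preparing a raw AFM height map as a NumPy array for afMLevel.
# np.load on a pre-exported array stands in for a real reader such as playnano
# or AFMReader. `predict_background` is a hypothetical entry point; see the
# afMLevel notebooks for the actual API.
import numpy as np

height_map = np.load("scan.npy").astype(np.float32)  # 2-D array of heights
assert height_map.ndim == 2, "afMLevel expects a single-channel height map"

# Hypothetical afMLevel call, shown only to illustrate the intended flow:
# background = predict_background(height_map)
# levelled = height_map - background
```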
Downstream Use
- Integration into playnano, enabling end-to-end reading and levelling of high-speed AFM (HS-AFM) movies; the `afMLevel` package works as a plugin for `playnano` for easy integration (use the processing step `level_ml_bg`).
Out‑of‑Scope Use
This model is not intended for:
- prediction of physical or mechanical properties,
- denoising heavily corrupted AFM scans outside the training distribution,
- interpretation of AFM contact mechanics,
- specialised AFM modes (KPFM, MFM, FMM, etc.) without validation,
- non‑biological samples without performance verification.
Bias, Risks, and Limitations
- The model was trained on a specific dataset of real AFM height maps; performance may degrade for very different imaging modes, scan sizes, or materials.
- Extremely noisy scans or those containing jump‑to‑contact instabilities may produce inaccurate background predictions.
- The model may occasionally identify horizontal sample features as part of the background, causing them to be subtracted from the levelled image and distorting local pixel height values.
- Users should visually inspect a subset of the predicted backgrounds to confirm that sample features are not present.
- The levelled outputs should also be inspected visually before scientific interpretation.
Recommendations
- Manually verify a subset of predicted backgrounds and levelled images.
- Avoid applying the model to imaging modes it was not trained on without validation.
How to Get Started with the Model
Use the model through the afMLevel repository, which handles inference, noise background generation and levelling. Demonstration notebooks are available within the GitHub repository.
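For direct PyTorch use, a minimal inference sketch is shown below. It assumes the checkpoint deserialises to a callable `nn.Module` via `torch.load` and that the input is a single preprocessed 256 × 256 tile; the file names are illustrative, and the notebooks document the supported loading path.

```python
import numpy as np
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Assumption: the checkpoint unpickles to a full model object. Recent PyTorch
# versions may additionally require weights_only=False for pickled modules.
model = torch.load("afMLevel-background-unet.pt", map_location=device)
model.eval()

img = np.load("tile_256x256.npy").astype(np.float32)  # one 256 x 256 tile

# Match the training preprocessing: min-max normalise to [0, 1]
# (the full pipeline also applies an initial plane fit; see Preprocessing).
norm = (img - img.min()) / (img.max() - img.min())

with torch.no_grad():
    x = torch.from_numpy(norm)[None, None].to(device)  # shape (1, 1, 256, 256)
    background = model(x).squeeze(0).squeeze(0).cpu().numpy()

levelled = norm - background  # levelled tile in normalised units
```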
Inference Speed
Per-frame processing time scales with image resolution in discrete steps rather than continuously: runtime is dominated by inference, and the number of inference passes grows each time the image resolution crosses a 256-pixel threshold and additional tiles are generated. Across the range of resolutions that map to the same tile count, per-frame processing time remains approximately constant, producing plateaus. The background model is slower than classical non-ML levelling routines but remains practical for research pipelines at typical AFM and HS-AFM resolutions. Full timing benchmarks are provided in the accompanying paper (in preparation; check the GitHub repository for updates).
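As an illustration of the plateau behaviour, the snippet below assumes the tile count follows ceil(N/256)² for an N × N image, consistent with the 256-pixel thresholds described above (the repository's exact tiling rule may differ):

```python
import math

def n_tiles(n: int) -> int:
    """Assumed tile count for an N x N image: ceil(N/256) tiles per axis."""
    f = math.ceil(n / 256)
    return f * f

for n in (256, 300, 512, 513, 768, 1024):
    print(n, n_tiles(n))
# 256 -> 1; 300 and 512 -> 4; 513 and 768 -> 9; 1024 -> 16
# (tile count, and hence runtime, plateaus between thresholds)
```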
Training Details
The model was trained from scratch on real AFM topography data using the PyTorch framework.
Training Data
This model was trained on a dataset of 2,001 real AFM height‑map images from several AFM labs, spanning a range of biological sample types (membranes, proteins, DNA origami, lattices, fibres), AFM instruments (JPK, Bruker/RIBM, Asylum), and image features (blobs, holes, fibres, strong line noise, multiple planes).
To increase dataset size and improve generalization, images were augmented using:
- reflection along the y‑axis,
- rotation by 180°.
This produced 6,003 training images. An 80:20 train‑validation split was used.
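A minimal NumPy sketch of these two augmentations (axis conventions and implementation details in the repository may differ):

```python
import numpy as np

def augment(img: np.ndarray) -> list[np.ndarray]:
    """Return the original image plus the two augmented variants."""
    reflected = np.flip(img, axis=1)  # reflection along the y-axis (mirror left-right)
    rotated = np.rot90(img, k=2)      # rotation by 180 degrees
    return [img, reflected, rotated]

# 2,001 source images x 3 variants each = 6,003 training images.
```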
Training Procedure
- Architecture: 7‑layer U‑Net with large convolutional filters (9×9)
- Framework: PyTorch
- Optimizer: Adam
- Learning rate: 0.0005
- Objective: pixel‑wise continuous regression
- Activation function: ReLU
- Batch normalisation: applied after each convolutional layer
- Dropout: none (p = 0.0)
- Batch size: 32
- Hardware: trained with GPU acceleration
- Training images: 6,003
- Train/validation split: 80:20 (random)
- Loss function: Mean Squared Error (MSE)
- Loss‑curve diagnostics were used to monitor convergence.
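The skeletal training loop below reflects the hyperparameters listed above. The single conv block is only a stand-in for the repository's 7-layer U-Net (9 × 9 filters, batch normalisation, ReLU), and the tensors are dummy data in place of preprocessed tiles:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Stand-in for the 7-layer U-Net, using the stated design choices:
# 9x9 convolutions, batch normalisation after each conv layer, ReLU.
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=9, padding=4),
    nn.BatchNorm2d(16),
    nn.ReLU(),
    nn.Conv2d(16, 1, kernel_size=9, padding=4),
)

# Dummy data in place of preprocessed 256 x 256 tiles and target backgrounds.
inputs = torch.rand(64, 1, 256, 256)
targets = torch.rand(64, 1, 256, 256)
loader = DataLoader(TensorDataset(inputs, targets), batch_size=32, shuffle=True)

optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)  # Adam, lr = 0.0005
criterion = nn.MSELoss()                                   # pixel-wise regression

model.train()
for epoch in range(59):  # 59 epochs, fp32 throughout
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
```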
Preprocessing
Input images were preprocessed identically to the training data:
- An initial first-order plane fit is applied in x and y.
- Min-max normalisation to [0, 1].
- Images larger than 256 × 256 are pixel-split into 256 × 256 tiles for inference.
For the background model, a pixel-split tiling method is used to preserve all pixel values: multiple 256 × 256 tiles are generated by taking alternating pixels.
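A minimal sketch of these preprocessing steps for a 512 × 512 input, which pixel-splits with factor 2 into four 256 × 256 tiles (function names are illustrative, not the repository's API):

```python
import numpy as np

def plane_level(img: np.ndarray) -> np.ndarray:
    """Subtract a least-squares first-order plane fit in x and y."""
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    A = np.column_stack([xx.ravel(), yy.ravel(), np.ones(h * w)])
    coeffs, *_ = np.linalg.lstsq(A, img.ravel(), rcond=None)
    return img - (A @ coeffs).reshape(h, w)

def minmax(img: np.ndarray) -> np.ndarray:
    """Min-max normalise heights to [0, 1]."""
    return (img - img.min()) / (img.max() - img.min())

def pixel_split(img: np.ndarray, factor: int) -> list[np.ndarray]:
    """Split into factor**2 tiles of alternating pixels, so every pixel of
    the original image appears in exactly one tile."""
    return [img[i::factor, j::factor] for i in range(factor) for j in range(factor)]

scan = np.random.rand(512, 512)                           # dummy raw height map
tiles = pixel_split(minmax(plane_level(scan)), factor=2)  # four 256 x 256 tiles
```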
Training Hyperparameters
- Training regime: fp32
Speeds, Sizes, Times
- Model file size: 982 MiB
- Training epochs: 59
Evaluation
The performance of the ML‑generated predicted backgrounds was evaluated indirectly through their impact on levelling. The core quantitative metric used for assessment was the Mean Squared Error (MSE) between the afMLevel predicted-background-subtracted output and a manually levelled ground‑truth image. The results were also assessed visually by the developers.
Full details will be provided in the accompanying paper (in preparation; check the GitHub repository for updates).
Testing Data & Metrics
Testing Data
Evaluation was performed on a held‑out set of real AFM height maps spanning a wide range of:
- biological sample types,
- imaging conditions,
- noise levels,
- numbers of surface planes,
- scan artefacts (e.g., streaks, line noise).
Metrics
- Primary metric: MSE between auto‑levelled and manually levelled images.
- Distribution analysis: mean vs. median MSE; a large gap between the two indicates a skewed error distribution in which a minority of failed levellings produce pronounced artefacts.
- Success‑rate: proportion of images below an MSE threshold of 0.1, selected as a conservative boundary for "well‑levelled" outputs.
- Visual inspection score: percentage of images judged well-levelled by developer inspection, used as a complementary subjective metric.
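A sketch of how these metrics could be computed, using dummy data in place of the held-out test pairs (the paper's exact evaluation code may differ):

```python
import numpy as np

rng = np.random.default_rng(0)
# Dummy stand-ins for (auto-levelled, manually levelled) test image pairs.
test_pairs = [(rng.random((256, 256)), rng.random((256, 256))) for _ in range(20)]

def levelling_mse(auto: np.ndarray, manual: np.ndarray) -> float:
    """MSE between an auto-levelled image and its manually levelled ground truth."""
    return float(np.mean((auto - manual) ** 2))

errors = np.array([levelling_mse(a, m) for a, m in test_pairs])

mean_mse = errors.mean()              # pulled upward by badly failed levellings
median_mse = np.median(errors)        # robust to the failure tail
success_rate = np.mean(errors < 0.1)  # fraction below the 0.1 MSE threshold
```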
Results
Initial internal testing indicates that the ML‑generated predicted backgrounds enable reliable automated levelling across a broad range of AFM images, including scans with varied noise levels and multiple height planes. Quantitative results and statistical analyses will be provided in the accompanying paper (in preparation; check the GitHub repository for updates).
Citation
Tekchandani et al., AFMLevel: Deep learning U-Net Models for levelling Atomic Force Microscopy Images and Movies (in preparation, 2026)
Model Card Authors
- Maya Tekchandani
- Dr Daniel E. Rollins
- Dr George R. Heath
Contact
For questions or issues, please open a GitHub issue at https://github.com/mayatek1/afMLevel or contact:
George R. Heath - University of Leeds
Email: G.R.Heath@leeds.ac.uk