---
license: bsd-3-clause
pipeline_tag: image-segmentation
tags:
- AFM
- physics
- biology
- atomic-force-microscopy
- microscopy
- image-processing
- unet
- surface-analysis
- chemistry
- nanoscience
---
# afMLevel-mask-unet
This U‑Net model masks features in Atomic Force Microscopy (AFM) height maps.
It outputs a **probability mask** the same size as the raw AFM image; the accompanying Python package, [afMLevel](https://github.com/mayatek1/afMLevel),
then applies a threshold (typically 0.5) to produce a **binary mask**. This mask can then be used in automated
levelling routines, as implemented in the `level_mask_ml()` function in afMLevel, to produce levelled images.
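The thresholding step described above can be illustrated with a few lines of NumPy. This is a minimal sketch, not the afMLevel implementation: the function name `binarise_mask` is hypothetical, and whether pixels exactly at the threshold count as features is an assumption.

```python
import numpy as np


def binarise_mask(prob_mask: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Turn a probability mask with values in [0, 1] into a boolean mask.

    Hypothetical helper for illustration. Treating pixels exactly at the
    threshold as features (>=) is an assumption.
    """
    return prob_mask >= threshold


# Toy 2 x 2 "probability mask" standing in for a model output.
prob = np.array([[0.1, 0.7], [0.5, 0.49]])
mask = binarise_mask(prob)
```

Lowering the threshold masks more pixels (a more conservative exclusion of features), while raising it keeps more of the image available to the levelling fit.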
> **Note:** afMLevel includes a second model, **afMLevel-background-unet**,
> that predicts the noise background, without requiring a masking step. Both
> models will be described in the accompanying paper; the background model is available
> here: [https://huggingface.co/Heath-AFM-Lab/afMLevel-background-unet](https://huggingface.co/Heath-AFM-Lab/afMLevel-background-unet)
## Model Details
This model is part of the [afMLevel](https://github.com/mayatek1/afMLevel) project.
### Model Description
The **afMLevel** mask model is a 7‑layer **U‑Net** architecture implemented in **PyTorch**, designed to segment
features in AFM height map images so that they can be excluded during traditional levelling routines (plane fits and median line
subtraction). The network was trained on **256 × 256‑pixel** inputs, and therefore expects images of this size at inference time.
To apply the model and use it to level AFM images, the **afMLevel** repository includes tools for:
- image preprocessing and resizing,
- running inference,
- generating binary masks,
- applying the masks in levelling routines,
- integrating the model into batch-processing pipelines for HS-AFM videos and image sets.
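To make concrete what "excluding features during levelling" means, the sketch below fits a plane and per-line medians to the unmasked (background) pixels only, then subtracts them from the whole image. It is an illustration under stated assumptions rather than the afMLevel code: the function names `masked_plane_level` and `masked_median_lines` are hypothetical, and the mask is assumed to be `True` on features.

```python
import numpy as np


def masked_plane_level(height: np.ndarray, feature_mask: np.ndarray) -> np.ndarray:
    """Fit z = a*x + b*y + c to background pixels and subtract it everywhere."""
    rows, cols = np.indices(height.shape)
    background = ~feature_mask
    # Design matrix built from background pixel coordinates only.
    design = np.column_stack(
        [cols[background], rows[background], np.ones(background.sum())]
    )
    coeffs, *_ = np.linalg.lstsq(design, height[background], rcond=None)
    plane = coeffs[0] * cols + coeffs[1] * rows + coeffs[2]
    return height - plane


def masked_median_lines(height: np.ndarray, feature_mask: np.ndarray) -> np.ndarray:
    """Subtract from each scan line the median of its background pixels."""
    background_only = np.where(feature_mask, np.nan, height)
    medians = np.nanmedian(background_only, axis=1, keepdims=True)
    # A fully masked line has no background median (NumPy warns and returns
    # NaN); such lines are left unchanged here.
    medians = np.nan_to_num(medians, nan=0.0)
    return height - medians


# Synthetic example: a tilted surface carrying one raised 4 x 4 feature.
rows, cols = np.indices((32, 32))
surface = 0.05 * cols + 0.02 * rows + 1.0
surface[10:14, 10:14] += 5.0
mask = np.zeros((32, 32), dtype=bool)
mask[10:14, 10:14] = True

levelled = masked_median_lines(masked_plane_level(surface, mask), mask)
```

Because the feature is excluded from both fits, the background is flattened to zero and the feature keeps its full 5-unit height instead of dragging the plane upwards.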
### Model Card Information
- **Developed by:** Maya Tekchandani
- **Maintained by:** Dr Daniel E. Rollins, Dr George R. Heath
- **Principal Investigator:** Dr George R. Heath
- **Affiliation:** Department of Physics and Astronomy, University of Leeds, UK
- **Funded by:**
  - Maya Tekchandani is supported by a studentship funded by the Engineering and Physical Sciences Research Council and the
    Biotechnology and Biological Sciences Research Council.
  - Dr Daniel E. Rollins and Dr George R. Heath are funded by Engineering and Physical Sciences Research Council grant EP/W034735/1.
- **Shared by:** [Heath-AFM-Lab](https://heath-afm-lab.github.io/)
- **Model type:** U‑Net segmentation model for AFM image feature masking
- **License:** BSD‑3‑Clause
- **Finetuned from model:** None (trained from scratch)
### Model Sources
- **Repository:** https://github.com/mayatek1/afMLevel
- **Paper:** Tekchandani et al., *AFMLevel: Deep learning U-Net Models for
levelling Atomic Force Microscopy Images and Movies* (in preparation, 2026)
- **Demo:** [Demonstration notebooks](https://github.com/mayatek1/afMLevel/tree/main/notebooks)
## Uses
This model is designed for use within the [afMLevel](https://github.com/mayatek1/afMLevel/) `mask_model` module.
### Direct Use
The [afMLevel](https://github.com/mayatek1/afMLevel/) inference code operates on **NumPy arrays**, so raw AFM files must first be
loaded using an external reader such as [playnano](https://github.com/derollins/playNano), [AFMReader](https://github.com/AFM-SPM/AFMReader), or a custom
loader. Once loaded, the afMLevel package and notebooks handle inference and output of either
the predicted mask or the levelled image directly.
The model has been primarily tested on **biological AFM data** (membranes, proteins, DNA origami,
lattices, fibres) and is best suited to that context, though it may generalise to other sample types
with similar imaging characteristics.
### Downstream Use
- Integration into [playnano](https://github.com/derollins/playNano), enabling
end-to-end reading and levelling of **high-speed AFM (HS-AFM) movies**;
the `afMLevel` package works as a plugin for `playnano` for easy integration (use processing step `level_ml_mask`).
- Preprocessing for segmentation, particle detection, or other AFM analysis tools.
### Out‑of‑Scope Use
This model is **not** intended for:
- predicting physical or mechanical properties of samples,
- interpreting AFM contact mechanics,
- use with specialised AFM modes (KPFM, MFM, FMM, etc.) without validation,
- non-biological samples, without first validating performance on representative images.
## Bias, Risks, and Limitations
- The model was trained on a specific dataset of real AFM height maps; performance may degrade for very different imaging modes, scan sizes, or materials.
- Extremely noisy scans or those containing jump‑to‑contact instabilities may produce inaccurate mask predictions.
- The mask model cannot simultaneously threshold both high features (blobs) and
low features (holes); images containing both may not be levelled optimally. In such cases the
[**background model**](https://huggingface.co/Heath-AFM-Lab/afMLevel-background-unet) is recommended instead.
- Users should visually inspect masks and levelled outputs before scientific interpretation.
### Recommendations
- Manually verify a representative subset of levelled images.
- For images with complex topographies (e.g. three or more distinct height
planes, or both holes and blobs), consider using the background model
([`MLBackground`](https://huggingface.co/Heath-AFM-Lab/afMLevel-background-unet)) as an alternative or complement.
- Avoid applying the model to imaging modes it was not trained on without validation.
## How to Get Started with the Model
Use the model through the [afMLevel](https://github.com/mayatek1/afMLevel) repository, which handles inference,
binary mask generation and levelling. Demonstration notebooks are available
[within the GitHub repository](https://github.com/mayatek1/afMLevel/tree/main/notebooks).
## Training Details
The model was trained from scratch on real AFM topography data using the PyTorch framework.
### Training Data
This model was trained on a **dataset of 2,001 real AFM height‑map images** from several AFM labs, spanning
a range of biological sample types (membranes, proteins, DNA origami, lattices, fibres), AFM instruments (JPK,
Bruker/RIBM, Asylum), and image features (blobs, holes, fibres, strong line noise, multiple planes).
To increase dataset size and improve generalisation, images were augmented using:
- reflection along the y-axis,
- rotation by 180°,
- synthetic line-noise artefacts.
This produced **10,005 training images** for the mask model.
An **80:20 train‑validation split** was used.
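The three augmentations listed above can be sketched in NumPy as follows. This is an assumption-laden illustration, not the training code: `augment` and `noise_scale` are hypothetical names, "reflection along the y-axis" is interpreted as a left-right flip, and the synthetic line noise is modelled as one random height offset per scan line. How the variants were combined to reach 10,005 images is not specified here and is not reproduced.

```python
import numpy as np


def augment(image: np.ndarray, rng: np.random.Generator, noise_scale: float = 0.1) -> dict:
    """Return illustrative augmented variants of one height map."""
    # Reflection about the vertical axis (assumed meaning of "along the y-axis").
    flipped = np.fliplr(image)
    # Rotation by 180 degrees.
    rotated = np.rot90(image, k=2)
    # Synthetic line noise: one random offset per row (an assumed noise model).
    offsets = rng.normal(0.0, noise_scale, size=(image.shape[0], 1))
    line_noise = image + offsets
    return {"flip_y": flipped, "rot180": rotated, "line_noise": line_noise}


image = np.arange(12.0).reshape(3, 4)
variants = augment(image, np.random.default_rng(0))
```

In a segmentation setting the geometric transforms would normally be applied to the label mask as well, whereas additive line noise leaves the labels unchanged.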
Ground-truth labels were derived by applying Otsu thresholding to manually levelled versions of the images.
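For readers unfamiliar with Otsu thresholding, the sketch below shows the idea: choose the height threshold that maximises the between-class variance of the two resulting pixel groups. It is a plain NumPy illustration, not the labelling code used for training. The name `otsu_threshold` and the 256-bin histogram are assumptions, as is labelling pixels above the threshold as features. Library implementations such as `skimage.filters.threshold_otsu` would normally be preferred.

```python
import numpy as np


def otsu_threshold(image: np.ndarray, nbins: int = 256) -> float:
    """Return the threshold that maximises between-class variance."""
    counts, edges = np.histogram(image.ravel(), bins=nbins)
    centres = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(counts)              # pixels at or below each candidate bin
    w1 = w0[-1] - w0                    # pixels above each candidate bin
    s0 = np.cumsum(counts * centres)    # running sum of heights
    mu0 = s0 / np.maximum(w0, 1)        # mean of the lower class
    mu1 = (s0[-1] - s0) / np.maximum(w1, 1)  # mean of the upper class
    between = w0 * w1 * (mu0 - mu1) ** 2
    return float(centres[np.argmax(between)])


# Toy "manually levelled" image: flat background with one raised block.
levelled = np.zeros((16, 16))
levelled[4:8, 4:8] = 1.0

t = otsu_threshold(levelled)
labels = levelled > t  # assumed convention: True marks features above background
```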
### Training Procedure
- **Architecture:** 7‑layer U‑Net with large convolutional filters (7×7)
- **Framework:** PyTorch
- **Optimizer:** Adam
- **Learning rate:** 0.0005
- **Objective:** classification per pixel
- **Activation function:** ReLU
- **Batch normalisation:** applied after each convolutional layer
- **Dropout:** none (p = 0.0)
- **Batch size:** 32
- **Hardware:** trained using GPU acceleration
- **Training images:** 10,005 (mask model)
- **Train/validation split:** 80:20 (random)
- **Loss function:** Binary Cross-Entropy (BCE) and Dice loss, combined as a single loss (BCE + Dice)
- **Monitoring:** loss‑curve diagnostics were used to monitor convergence
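The combined BCE + Dice objective listed above can be written out as follows. The sketch uses NumPy purely to show the arithmetic. Actual training would use a differentiable PyTorch equivalent. The unweighted sum follows the description above, but the soft-Dice formulation, the smoothing term `eps` and the function name `bce_dice_loss` are assumptions.

```python
import numpy as np


def bce_dice_loss(prob: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Binary cross-entropy plus soft Dice loss for one predicted mask."""
    prob = np.clip(prob, eps, 1.0 - eps)  # avoid log(0)
    bce = -np.mean(target * np.log(prob) + (1.0 - target) * np.log(1.0 - prob))
    intersection = np.sum(prob * target)
    dice = 1.0 - (2.0 * intersection + eps) / (np.sum(prob) + np.sum(target) + eps)
    return float(bce + dice)


target = np.array([[0.0, 1.0], [1.0, 0.0]])
loss_perfect = bce_dice_loss(target, target)                 # prediction equals label
loss_uncertain = bce_dice_loss(np.full((2, 2), 0.5), target)  # uninformative prediction
```

BCE penalises each pixel independently, while the Dice term rewards overlap with the labelled region as a whole, which helps when features occupy only a small fraction of the image.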
#### Preprocessing
Training images were preprocessed as follows; inputs at inference time should be preprocessed identically:
1. An initial 1st-order plane fit applied in x and y.
2. Min-max normalisation to [0, 1].
3. Resizing to 256 × 256.
For the **mask model**, nearest-neighbour interpolation was used for resizing.
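The three preprocessing steps can be sketched in NumPy as below. This is an illustrative approximation, not the afMLevel preprocessing code: the function name `preprocess` is hypothetical, and the index mapping used for nearest-neighbour resizing is one of several common conventions.

```python
import numpy as np


def preprocess(height: np.ndarray, size: int = 256) -> np.ndarray:
    """Plane-level, min-max normalise and resize one height map."""
    height = np.asarray(height, dtype=np.float64)
    rows, cols = np.indices(height.shape)

    # 1. First-order plane fit in x and y, subtracted from the image.
    design = np.column_stack([cols.ravel(), rows.ravel(), np.ones(height.size)])
    coeffs, *_ = np.linalg.lstsq(design, height.ravel(), rcond=None)
    flat = height - (design @ coeffs).reshape(height.shape)

    # 2. Min-max normalisation to [0, 1] (a constant image maps to zeros).
    span = flat.max() - flat.min()
    norm = (flat - flat.min()) / span if span > 0 else np.zeros_like(flat)

    # 3. Nearest-neighbour resize: each output pixel copies one source pixel.
    r_idx = (np.arange(size) * height.shape[0] / size).astype(int)
    c_idx = (np.arange(size) * height.shape[1] / size).astype(int)
    return norm[np.ix_(r_idx, c_idx)]


# Synthetic 100 x 120 scan: a tilt plus a gentle ripple.
rows, cols = np.indices((100, 120))
raw = 0.3 * cols - 0.1 * rows + np.sin(cols / 7.0)
model_input = preprocess(raw)
```

Nearest-neighbour resizing copies existing pixel values rather than blending them, so no intermediate heights are introduced at feature edges.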
#### Training Hyperparameters
- **Training regime:** fp32
#### Speeds, Sizes, Times
- **Model file size:** 598 MiB
- **Training epochs:** 69
## Evaluation
The performance of the ML‑generated masks was evaluated indirectly through their impact on
automated levelling. The core quantitative metric used for assessment was the **Mean Squared Error** (MSE)
between the output of the afMLevel auto-levelling routines, using ML model generated masks, and a manually
levelled ground‑truth image. The results were also assessed visually by the developers.
Full details will be given in the accompanying paper (in preparation; check the [GitHub](https://github.com/mayatek1/afMLevel) repository for updates).
### Testing Data & Metrics
#### Testing Data
Evaluation was performed on a held‑out set of real AFM height‑map images spanning a wide range of:
- biological samples,
- imaging conditions,
- noise levels,
- numbers of surface planes,
- scan artefacts (e.g., line noise, streaks).
#### Metrics
- **Primary metric:** MSE between auto‑levelled and manually levelled images.
- **Distribution analysis:** mean vs. median MSE; a large difference between
the two indicates that failed levelling produces pronounced artefacts.
- **Success rate:** proportion of images below an MSE threshold of 0.1, selected as
a conservative boundary for “well‑levelled” outputs.
- **Visual inspection score:** percentage of images judged well-levelled by
developer inspection, used as a complementary subjective metric.
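The quantitative metrics above are simple to compute once auto-levelled and manually levelled image pairs are available. The sketch below is a hedged illustration: `levelling_metrics` is a hypothetical name, and it assumes both images in each pair share the same shape and height scale, so that the 0.1 threshold is meaningful.

```python
import numpy as np


def levelling_metrics(auto_images, manual_images, threshold: float = 0.1) -> dict:
    """Per-image MSE plus the summary statistics described above."""
    mse = np.array(
        [np.mean((a - m) ** 2) for a, m in zip(auto_images, manual_images)]
    )
    return {
        "mse": mse,
        "mean_mse": float(mse.mean()),
        "median_mse": float(np.median(mse)),
        # Fraction of images counted as well-levelled.
        "success_rate": float(np.mean(mse < threshold)),
    }


# Toy data: four flat references and auto-levelled outputs with known offsets.
manual = [np.zeros((4, 4)) for _ in range(4)]
auto = [np.zeros((4, 4)) + offset for offset in (0.0, 0.1, 0.2, 1.0)]
metrics = levelling_metrics(auto, manual)
```

In this toy case a single badly levelled image pulls the mean MSE (0.2625) far above the median (0.025), which is the kind of gap the distribution analysis is meant to expose.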
### Results
Initial internal testing indicates that the ML‑generated masks enable reliable automated
levelling across a broad range of AFM images, including scans with varied noise levels and
multiple height planes. Quantitative results and statistical analyses will be provided in
the accompanying paper (in preparation; check the [GitHub](https://github.com/mayatek1/afMLevel) repository for updates).
## Citation
Tekchandani et al., *AFMLevel: Deep learning U-Net Models for levelling Atomic Force Microscopy Images and Movies* (in preparation, 2026)
## Model Card Authors
- **Maya Tekchandani**
- **Dr Daniel E. Rollins**
- **Dr George R. Heath**
## Contact
For questions or issues, please open a GitHub issue at
https://github.com/mayatek1/afMLevel or contact:
**George R. Heath, University of Leeds**
Email: G.R.Heath@leeds.ac.uk