	---
	license: bsd-3-clause
	pipeline_tag: image-to-image
	tags:
	- AFM
	- physics
	- biology
	- atomic-force-microscopy
	- microscopy
	- image-processing
	- unet
	- surface-analysis
	- chemistry
	- nanoscience
	---
	# afMLevel-background-unet

	This U‑Net model predicts tilt, z‑scanner drift, and other large‑scale imaging artefacts present in Atomic Force Microscopy (AFM) height maps.
	It outputs a background image, the same size and scale as the raw AFM image, which can be subtracted (via the accompanying [afMLevel](https://github.com/mayatek1/afMLevel)
	code) to produce a levelled height map.

	> Note: afMLevel includes a second model, afMLevel-mask-unet, which predicts a feature map
	> rather than generating the noise background directly. The feature map can be used in traditional
	> auto-levelling routines, where line and plane fits are applied to the background (unmasked, featureless regions).
	> Both models will be described in the accompanying paper. The mask model is available
	> here: [https://huggingface.co/Heath-AFM-Lab/afMLevel-mask-unet](https://huggingface.co/Heath-AFM-Lab/afMLevel-mask-unet)

	## Model Details

	This model is part of the [afMLevel](https://github.com/mayatek1/afMLevel) project.

	### Model Description

	This model is a 7‑layer U‑Net architecture implemented in PyTorch, trained to perform image‑to‑image regression for background prediction in AFM height maps. The network was trained on 256 × 256‑pixel images and therefore expects inputs of this size at inference time.

	The afMLevel repository includes tools for:

	- image preprocessing and tiling,
	- running inference,
	- generating noise background,
	- noise background subtraction,
	- integrating the model into batch-processing pipelines for HS-AFM videos and image sets.

	### Model Card Information

	- Developed by: Maya Tekchandani
	- Maintained by: Dr Daniel E. Rollins, Dr George R. Heath
	- Principal Investigator: Dr George R. Heath
	- Affiliation: Department of Physics and Astronomy, University of Leeds, UK
	- Funded by:
	  - Maya Tekchandani is supported by a studentship funded by the Engineering and Physical Sciences Research Council and the Biotechnology and Biological Sciences Research Council.
	  - Dr Daniel E. Rollins and Dr George R. Heath are funded by Engineering and Physical Sciences Research Council grant EP/W034735/1.
	- Shared by: [Heath-AFM-Lab](https://heath-afm-lab.github.io/)
	- Model type: U‑Net regression model for AFM background estimation
	- License: BSD‑3‑Clause
	- Finetuned from model: None (trained from scratch)

	### Model Sources

	- Repository: https://github.com/mayatek1/afMLevel
	- Paper: Tekchandani et al., AFMLevel: Deep learning U-Net Models for levelling Atomic Force Microscopy Images and Movies (in preparation, 2026)
	- Demo: [Demonstration notebooks](https://github.com/mayatek1/afMLevel/tree/main/notebooks)

	## Uses

	This model is designed for use within the [afMLevel](https://github.com/mayatek1/afMLevel/) `background_model` module.

	### Direct Use

	The [afMLevel](https://github.com/mayatek1/afMLevel/) inference code operates on NumPy arrays, so raw AFM files must first be loaded using an external reader such as [playnano](https://github.com/derollins/playNano), [AFMReader](https://github.com/AFM-SPM/AFMReader), or a custom loader. Once loaded, the afMLevel package and notebooks handle inference and output of either the predicted background or the levelled image (with the predicted background subtracted) directly.
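
The subtraction step described above can be illustrated with plain NumPy. This is a minimal sketch on synthetic data, not the afMLevel API: the function name `level_image` and the stand-in arrays are hypothetical, and in practice the background would come from the model rather than from a known tilt.

```python
import numpy as np


def level_image(raw: np.ndarray, background: np.ndarray) -> np.ndarray:
    """Subtract a predicted background from a raw height map (hypothetical helper)."""
    if raw.shape != background.shape:
        raise ValueError("background must have the same shape as the raw image")
    return raw - background


# Synthetic stand-in data: a flat surface with one raised feature, plus a tilt.
yy, xx = np.mgrid[0:256, 0:256]
features = np.zeros((256, 256))
features[100:120, 100:120] = 5.0   # raised feature, arbitrary height units
tilt = 0.02 * xx + 0.01 * yy       # the large-scale background to remove
raw = features + tilt

# Here the "predicted" background is simply the known tilt (assumption for the demo).
levelled = level_image(raw, background=tilt)
```

Because the background has the same size and scale as the raw image, levelling reduces to an element-wise subtraction once the prediction is available.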

	The model has been primarily tested on biological AFM data (membranes, proteins, DNA origami, lattices, fibres) and is best suited to that context, though it may generalise to other sample types with similar imaging characteristics.

	### Downstream Use

	- Integration into [playnano](https://github.com/derollins/playNano), enabling end-to-end reading and levelling of high-speed AFM (HS-AFM) movies. The `afMLevel` package works as a `playnano` plugin; use the processing step `level_ml_bg`.

	### Out‑of‑Scope Use

	This model is not intended for:

	- prediction of physical or mechanical properties,
	- denoising heavily corrupted AFM scans outside the training distribution,
	- interpretation of AFM contact mechanics,
	- specialised AFM modes (KPFM, MFM, FMM, etc.) without validation,
	- non‑biological samples without performance verification.

	## Bias, Risks, and Limitations

	- The model was trained on a specific dataset of real AFM height maps; performance may degrade for very different imaging modes, scan sizes, or materials.
	- Extremely noisy scans or those containing jump‑to‑contact instabilities may produce inaccurate background predictions.
	- The model may occasionally identify horizontal sample features as part of the background, causing them to be subtracted from the levelled image and altering local pixel height values.
	- Users should visually inspect a subset of the predicted backgrounds to confirm that they contain no sample features.
	- The levelled outputs should also be inspected visually before scientific interpretation.

	### Recommendations

	- Manually verify a subset of predicted backgrounds and levelled images.
	- Avoid applying the model to imaging modes it was not trained on without validation.

	## How to Get Started with the Model

	Use the model through the [afMLevel](https://github.com/mayatek1/afMLevel) repository, which handles inference, noise background generation and levelling. Demonstration notebooks are available [within the GitHub repository](https://github.com/mayatek1/afMLevel/tree/main/notebooks).

	## Inference Speed

	Per-frame processing time scales with image resolution in discrete steps rather than continuously: processing time is dominated by the inference step, and the number of inference steps increases as the image resolution crosses 256-pixel thresholds and additional tiles are generated. For a range of resolutions that map to the same number of tiles, per-frame processing time remains approximately constant, producing plateaus. The background model is slower than classical non-ML levelling routines but is not prohibitive for research pipelines at typical AFM and HS-AFM resolutions. Full timing benchmarks are provided in the accompanying paper (in preparation, check the [GitHub](https://github.com/mayatek1/afMLevel) for updates).
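
The stepwise behaviour can be illustrated with a short calculation. This sketch assumes that the number of tiles is the product of the per-axis counts `ceil(height / 256)` and `ceil(width / 256)`; that rule and the function name `estimated_tile_count` are assumptions made for illustration, not taken from the afMLevel code.

```python
import math

TILE = 256  # the model's fixed input size in pixels


def estimated_tile_count(height: int, width: int, tile: int = TILE) -> int:
    """Estimate how many tile-sized inputs an image needs (assumed rule)."""
    return math.ceil(height / tile) * math.ceil(width / tile)


# Tile counts for a few square image sizes.
for size in (128, 256, 257, 400, 512, 513, 768):
    print(size, estimated_tile_count(size, size))
```

Under this assumption, every square image from 257 to 512 pixels maps to four tiles, which is the kind of plateau described above.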

	## Training Details

	The model was trained from scratch on real AFM topography data using the PyTorch framework.

	### Training Data

	This model was trained on a dataset of 2,001 real AFM height‑map images from several AFM labs, spanning a range of biological sample types (membranes, proteins, DNA origami, lattices, fibres), AFM instruments (JPK, Bruker/RIBM, Asylum), and image features (blobs, holes, fibres, strong line noise, multiple planes).

	To increase dataset size and improve generalisation, images were augmented using:

	- reflection along the y‑axis,
	- rotation by 180°.

	Together with the originals, this produced 6,003 images (3 × 2,001). An 80:20 train‑validation split was used.
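
The two augmentations could be reproduced with NumPy along these lines. This is a sketch under assumptions: "reflection along the y‑axis" is interpreted here as reversing the row order (`np.flipud`), the function name `augment` is hypothetical, and during training the same transform would presumably be applied to each image's target background as well.

```python
import numpy as np


def augment(image: np.ndarray) -> list:
    """Return the original image plus its two augmented copies."""
    reflected = np.flipud(image)   # assumed meaning of "reflection along the y-axis"
    rotated = np.rot90(image, 2)   # rotation by 180 degrees
    return [image, reflected, rotated]


rng = np.random.default_rng(0)
image = rng.random((256, 256))     # random stand-in for an AFM height map
variants = augment(image)

# Three variants per original image: 2,001 originals give 6,003 images.
n_total = 2001 * len(variants)
```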

	### Training Procedure

	- Architecture: 7‑layer U‑Net with large convolutional filters (9×9)
	- Framework: PyTorch
	- Optimizer: Adam
	- Learning rate: 0.0005
	- Objective: pixel‑wise continuous regression
	- Activation function: ReLU
	- Batch normalisation: applied after each convolutional layer
	- Dropout: none (p = 0.0)
	- Batch size: 32
	- Hardware: trained with GPU acceleration
	- Training images: 6,003
	- Train/validation split: 80:20 (random)
	- Loss function: Mean Squared Error (MSE)
	- Loss‑curve diagnostics were used to monitor convergence.
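
A few of these settings can be shown in a small PyTorch sketch: a 9×9 convolution followed by batch normalisation and ReLU, the Adam optimiser with a learning rate of 0.0005, and an MSE loss. The network below is a tiny stand-in, not the 7‑layer U‑Net; the channel counts, the padding, the single-channel input, and the helper name `conv_block` are assumptions for illustration, and the batch is reduced from 32 to 2 to keep the example light.

```python
import torch
from torch import nn


def conv_block(in_channels: int, out_channels: int) -> nn.Sequential:
    """9x9 convolution -> batch normalisation -> ReLU, preserving height and width."""
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size=9, padding=4),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(),
    )


# Tiny stand-in network (NOT the actual U-Net): one block plus a 1x1 output layer.
model = nn.Sequential(conv_block(1, 8), nn.Conv2d(8, 1, kernel_size=1))

optimizer = torch.optim.Adam(model.parameters(), lr=0.0005)
loss_fn = nn.MSELoss()

# Random stand-in batch: 2 single-channel 256 x 256 images and target backgrounds.
torch.manual_seed(0)
images = torch.rand(2, 1, 256, 256)
targets = torch.rand(2, 1, 256, 256)

# One optimisation step of pixel-wise regression.
model.train()
optimizer.zero_grad()
predictions = model(images)
loss = loss_fn(predictions, targets)
loss.backward()
optimizer.step()
```

With `padding=4`, a 9×9 kernel keeps the output the same size as the input, which image‑to‑image regression requires at the final layer.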

	#### Preprocessing

	At inference, input images are preprocessed in the same way as the training data:

	1. An initial first-order plane fit is applied in x and y.
	2. Min-max normalisation to [0, 1] is applied.
	3. Images larger than 256 × 256 are pixel-split into 256 × 256 tiles for inference.
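
The first two steps could be sketched in NumPy as follows. This is an illustrative implementation under assumptions, not the afMLevel code: the plane is fitted by ordinary least squares over all pixels, and the function names `subtract_plane` and `min_max_normalise` are hypothetical.

```python
import numpy as np


def subtract_plane(image: np.ndarray) -> np.ndarray:
    """Fit z = a*x + b*y + c by least squares and subtract the fitted plane."""
    height, width = image.shape
    yy, xx = np.mgrid[0:height, 0:width]
    design = np.column_stack([xx.ravel(), yy.ravel(), np.ones(height * width)])
    coeffs, *_ = np.linalg.lstsq(design, image.ravel(), rcond=None)
    return image - (design @ coeffs).reshape(height, width)


def min_max_normalise(image: np.ndarray) -> np.ndarray:
    """Rescale values to [0, 1]; a constant image maps to all zeros."""
    lowest, highest = image.min(), image.max()
    if highest == lowest:
        return np.zeros_like(image, dtype=float)
    return (image - lowest) / (highest - lowest)


# Synthetic stand-in: a tilted plane with random roughness on top.
yy, xx = np.mgrid[0:64, 0:64]
tilted = 0.5 * xx + 0.2 * yy + 3.0
rng = np.random.default_rng(0)
raw = tilted + rng.random((64, 64))

prepared = min_max_normalise(subtract_plane(raw))
```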

	Tiling for the background model uses a pixel-split method, which preserves all pixel values: multiple 256 × 256 images are generated by taking alternating pixels from the larger image.
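
One way to take alternating pixels is strided slicing. The sketch below assumes the image dimensions are exact multiples of the stride (for example 512 × 512 with a stride of 2) and does not handle padding; the function names `pixel_split` and `pixel_merge` are hypothetical and the afMLevel implementation may differ.

```python
import numpy as np


def pixel_split(image: np.ndarray, stride: int) -> list:
    """Split an image into stride*stride tiles by taking every stride-th pixel."""
    return [image[i::stride, j::stride] for i in range(stride) for j in range(stride)]


def pixel_merge(tiles: list, stride: int) -> np.ndarray:
    """Interleave tiles from pixel_split back into a full-size image."""
    tile_h, tile_w = tiles[0].shape
    merged = np.empty((tile_h * stride, tile_w * stride), dtype=tiles[0].dtype)
    for index, tile in enumerate(tiles):
        i, j = divmod(index, stride)
        merged[i::stride, j::stride] = tile
    return merged


rng = np.random.default_rng(0)
image = rng.random((512, 512))           # random stand-in for a 512 x 512 scan
tiles = pixel_split(image, stride=2)     # four 256 x 256 tiles
restored = pixel_merge(tiles, stride=2)  # identical to the original image
```

Each tile covers the full field of view at lower sampling density, so large‑scale trends such as tilt remain visible in every tile, and merging restores every original pixel.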

	#### Training Hyperparameters

	- Training regime: fp32

	#### Speeds, Sizes, Times

	- Model file size: 982 MiB
	- Training epochs: 59

	## Evaluation

	The performance of the ML‑generated predicted backgrounds was evaluated indirectly, through their impact on levelling.
	The core quantitative metric was the Mean Squared Error (MSE) between the afMLevel output (the raw image with the predicted background subtracted)
	and a manually levelled ground‑truth image. The results were also assessed visually by the developers.

	Full details will be given in the accompanying paper (in preparation; check the [GitHub](https://github.com/mayatek1/afMLevel) for updates).

	### Testing Data & Metrics

	#### Testing Data

	Evaluation was performed on a held‑out set of real AFM height maps spanning a wide range of:

	- biological sample types,
	- imaging conditions,
	- noise levels,
	- numbers of surface planes,
	- scan artefacts (e.g., streaks, line noise).

	#### Metrics

	- Primary metric: MSE between auto‑levelled and manually levelled images.
	- Distribution analysis: mean vs. median MSE; a large difference between the two indicates that failed levelling produces pronounced artefacts.
	- Success rate: proportion of images below an MSE threshold of 0.1, selected as a conservative boundary for "well‑levelled" outputs.
	- Visual inspection score: percentage of images judged well-levelled by developer inspection, used as a complementary subjective metric.
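
The quantitative metrics above could be computed as in the following sketch. The per-image MSE values are made up for illustration and are not results from the model; the function names `mse` and `summarise` are hypothetical.

```python
import numpy as np


def mse(levelled: np.ndarray, reference: np.ndarray) -> float:
    """Mean squared error between an auto-levelled and a manually levelled image."""
    difference = np.asarray(levelled, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.mean(difference ** 2))


def summarise(mse_values, threshold: float = 0.1) -> dict:
    """Mean, median, and success rate (fraction of images with MSE below threshold)."""
    values = np.asarray(mse_values, dtype=float)
    return {
        "mean": float(values.mean()),
        "median": float(np.median(values)),
        "success_rate": float(np.mean(values < threshold)),
    }


# Illustrative values only: three well-levelled images and one failure.
example_mse = [0.01, 0.02, 0.03, 0.50]
summary = summarise(example_mse)
```

In this made-up example the single failed image pulls the mean (0.14) well above the median (0.025), which is the gap the distribution analysis looks for.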

	### Results

	Initial internal testing indicates that the ML‑generated predicted backgrounds enable reliable automated levelling across a broad range of AFM images,
	including scans with varied noise levels and multiple height planes. Quantitative results and statistical analyses will be provided in the accompanying
	paper (in preparation, check the [GitHub](https://github.com/mayatek1/afMLevel) for updates).

	## Citation

	Tekchandani et al., AFMLevel: Deep learning U-Net Models for levelling Atomic Force Microscopy Images and Movies (in preparation, 2026)

	## Model Card Authors

	- Maya Tekchandani
	- Dr Daniel E. Rollins
	- Dr George R. Heath

	## Contact

	For questions or issues, please open a GitHub issue at https://github.com/mayatek1/afMLevel or contact:

	George R. Heath - University of Leeds

	Email: G.R.Heath@leeds.ac.uk