t0m-R committed on
Commit 91a5362 · 1 Parent(s): 253a48a

Upload ViT-B/16 STM artifact detection model

Files changed (3):
1. README.md (+86 -3)
2. config.json (+16 -0)
3. pytorch_model.bin (+3 -0)
README.md CHANGED
(Previously the file contained only the front matter `license: mit`.)
---
license: apache-2.0
language: en
tags:
- image-classification
- vision-transformer
- pytorch
- stm
- materials-science
- nffa-di
base_model:
- google/vit-base-patch16-224-in21k
pipeline_tag: image-classification
---

# Vision Transformer for STM Multi-Tip Artifact Detection

This is a fine-tuned **Vision Transformer (ViT-B/16)** model for classifying Scanning Tunneling Microscopy (STM) images. It is designed to detect the presence of **multi-tip artifacts**, a common distortion that results in duplicated signals and complicates data interpretation.

This model was developed as part of the **NFFA-DI (Nano Foundries and Fine Analysis Digital Infrastructure)** project, funded by the European Union's NextGenerationEU program.

## Model Description

The model is a `ViT-B/16` pre-trained on ImageNet-21k. It was fine-tuned to classify an STM image as either `Artifact-Free` or `Multi-Tip Artifact`.

A key feature of this model is its use of a **Fast Fourier Transform (FFT)** based preprocessing method. The model's input is not a standard image but a 3-channel tensor composed of:

1. The grayscale STM image.
2. The **amplitude** of the image's Fourier transform.
3. The **phase** of the image's Fourier transform.

This approach significantly improves the model's ability to identify the subtle patterns characteristic of multi-tip artifacts.

## How to Use

The following Python code shows how to load the model and run inference on a preprocessed image.

```python
import torch
from transformers import AutoModelForImageClassification

# Load the model from the Hub
model_name = "YourUsername/vit-stm-artifact-fft"  # Replace with your repo name
model = AutoModelForImageClassification.from_pretrained(model_name)
model.eval()

# NOTE: This model requires a custom FFT-based preprocessing function;
# standard image processors will not produce the expected input.
# The tensor must have shape (1, 3, 224, 224). See the "Preprocessing"
# section for details.
# preprocessed_image = your_custom_fft_preprocessing_function("path/to/your/stm_image.tiff")
preprocessed_image = torch.randn(1, 3, 224, 224)  # placeholder; replace with real preprocessed input

# Run inference
with torch.no_grad():
    logits = model(preprocessed_image).logits

predicted_label_id = logits.argmax(-1).item()
predicted_label = model.config.id2label[predicted_label_id]
print(f"Predicted Label: {predicted_label}")
```

## Preprocessing

**This model will not work with standard image preprocessing.** The input must be a 3-channel tensor representing the grayscale image, the FFT amplitude, and the FFT phase. Please refer to the original paper for the exact implementation details. The core steps are:

* Load the image as grayscale and resize it to 224x224.
* Apply a 2D Fast Fourier Transform (`numpy.fft.fft2`).
* Compute the amplitude (`np.abs`) and phase (`np.angle`).
* Normalize and stack the three channels into a single tensor.
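
The steps above can be sketched as follows. This is a minimal illustration, not the released pipeline: the function name `fft_preprocess`, the log-compression of the amplitude, and the per-channel min-max scaling are assumptions; consult the paper for the exact normalization.

```python
import numpy as np
import torch
from PIL import Image

def fft_preprocess(path: str, size: int = 224) -> torch.Tensor:
    """Build a 3-channel (grayscale, FFT amplitude, FFT phase) input tensor.

    Sketch only: log-compression and min-max scaling are illustrative
    assumptions, not the paper's exact normalization.
    """
    gray = np.asarray(
        Image.open(path).convert("L").resize((size, size)), dtype=np.float32
    )
    fft = np.fft.fft2(gray)
    amplitude = np.log1p(np.abs(fft))  # log tames the large dynamic range (assumption)
    phase = np.angle(fft)

    def minmax(x):
        return (x - x.min()) / (x.max() - x.min() + 1e-8)

    # Stack to (3, size, size), then add a batch dimension
    stacked = np.stack([minmax(gray), minmax(amplitude), minmax(phase)])
    return torch.from_numpy(stacked).unsqueeze(0)  # shape (1, 3, size, size)
```

The resulting tensor can be passed directly to the model as shown in the "How to Use" section.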

## Training Data

The model was fine-tuned on a synthetic dataset generated from experimental STM images recorded at CNR-IOM, Trieste. Artifact-free images were transformed into synthetic multi-tip images by summing the clean image with translated and intensity-scaled versions of itself.
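
The augmentation described above can be sketched in a few lines. The circular shift (`np.roll`) and the final rescaling are illustrative assumptions about how the translated copies are combined; the paper's implementation may differ.

```python
import numpy as np

def make_multi_tip(clean: np.ndarray, shifts, scales) -> np.ndarray:
    """Synthesize a multi-tip image from an artifact-free one.

    Sketch of the described augmentation: sum the clean image with
    translated, intensity-scaled copies of itself. The circular shift
    and final rescaling are illustrative assumptions.
    """
    out = clean.astype(np.float32).copy()
    for (dy, dx), scale in zip(shifts, scales):
        out += scale * np.roll(clean, (dy, dx), axis=(0, 1))
    return out / out.max()  # rescale to [0, 1] (assumption)
```

For example, a single ghost copy shifted by a few pixels at 40% intensity already produces the characteristic doubled surface features.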

## Citation

If you use this model in your research, please cite the original work:

```bibtex
@article{rodani2024enhancing,
  title={Enhancing Multi-Tip Artifact Detection in STM Images Using Fourier Transform and Vision Transformers},
  author={Rodani, Tommaso and Ansuini, Alessio and Cazziga, Alberto},
  journal={Accepted at the 1st Machine Learning for Life and Material Sciences Workshop at ICML},
  year={2024}
}
```
config.json ADDED
{
  "_name_or_path": "google/vit-base-patch16-224-in21k",
  "architectures": [
    "ViTForImageClassification"
  ],
  "model_type": "vit",
  "num_labels": 2,
  "id2label": {
    "0": "Artifact-Free",
    "1": "Multi-Tip Artifact"
  },
  "label2id": {
    "Artifact-Free": 0,
    "Multi-Tip Artifact": 1
  }
}
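
One detail worth noting: when `config.json` is parsed directly with a JSON parser (rather than through `model.config`, which converts the keys), the `id2label` keys are strings, so a predicted class index must be converted before lookup. A minimal illustration:

```python
import json

# Parsed straight from config.json, id2label keys are strings,
# so convert the predicted class index to a string before indexing.
raw_config = json.loads(
    '{"id2label": {"0": "Artifact-Free", "1": "Multi-Tip Artifact"}}'
)
predicted_id = 1
label = raw_config["id2label"][str(predicted_id)]
print(label)  # Multi-Tip Artifact
```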
pytorch_model.bin ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:4d3aaaf677542934b42ab898915c555d07337b4a904bd533eb6f50720a92f8d3
size 343264618