Commit a20e54a by t0m-R (parent: ac2dc19)

Upload Vit-B/8 SEM scale classification model

Files changed:
- README.md +81 -0
- config.json +47 -0
- model.safetensors +3 -0
README.md
ADDED
---
license: apache-2.0
language: en
tags:
- image-classification
- vision-transformer
- pytorch
- sem
- materials-science
- nffa-di
base_model: timm/vit_base_patch8_224.augreg2_in21k_ft_in1k
pipeline_tag: image-classification
---

# Vision Transformer for SEM Image Scale Classification

This is a fine-tuned **Vision Transformer (ViT-B/8)** model for classifying the magnification scale of Scanning Electron Microscopy (SEM) images—**pico, nano, or micro**—directly from pixel data.

The model addresses the challenge of unreliable scale information in large SEM archives, where metadata extraction is often hindered by proprietary file formats or error-prone Optical Character Recognition (OCR).

This model was developed as part of the **NFFA-DI (Nano Foundries and Fine Analysis Digital Infrastructure)** project, funded by the European Union's NextGenerationEU program.

## Model Description

The model is based on the `timm/vit_base_patch8_224.augreg2_in21k_ft_in1k` checkpoint and has been fine-tuned for a 3-class image classification task on SEM images. The three scale categories are:

1. **Pico**: images where the pixel size is at the atomic or sub-nanometer scale (less than 1 nm).
2. **Nano**: images where the pixel size is in the nanometer range (1 nm to 1,000 nm, i.e. 1 µm).
3. **Micro**: images where the pixel size is in the micrometer range (greater than 1 µm).
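The three categories above partition pixel size by order of magnitude. A minimal sketch of that thresholding (a hypothetical helper for illustration, not part of the model, which predicts the class from pixels alone):

```python
def scale_class(pixel_size_m: float) -> str:
    """Map a pixel size in meters to one of the three scale classes.

    Thresholds follow the class definitions above: below 1 nm is pico,
    1 nm up to 1 µm is nano, above 1 µm is micro.
    """
    if pixel_size_m < 1e-9:
        return "pico"
    elif pixel_size_m <= 1e-6:
        return "nano"
    else:
        return "micro"

print(scale_class(5e-10))  # 0.5 nm pixel -> pico
print(scale_class(2e-8))   # 20 nm pixel  -> nano
print(scale_class(3e-6))   # 3 µm pixel   -> micro
```

The model's value is that it recovers this label when the pixel size itself is unavailable or untrusted.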
## Model Performance

The model achieves **91.7% accuracy** on a held-out test set. Notably, most misclassifications occur at the transitional nano-micro boundary, which indicates that the model is learning physically meaningful feature representations related to the magnification level.

## How to Use

The following Python code shows how to load the model and its processor from the Hub and use it to classify a local SEM image.

```python
from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image
import torch

# Load the model and image processor from the Hub
model_name = "t0m-R/vit-sem-scale-classifier"
image_processor = AutoImageProcessor.from_pretrained(model_name)
model = AutoModelForImageClassification.from_pretrained(model_name)

# Load and preprocess the image
image_path = "path/to/your/sem_image.png"
try:
    image = Image.open(image_path).convert("RGB")

    # Prepare the image for the model
    inputs = image_processor(images=image, return_tensors="pt")

    # Run inference
    with torch.no_grad():
        logits = model(**inputs).logits

    predicted_label_id = logits.argmax(-1).item()
    predicted_label = model.config.id2label[predicted_label_id]

    print(f"Predicted Scale: {predicted_label}")

except FileNotFoundError:
    print(f"Error: The file at {image_path} was not found.")
```
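If class probabilities are wanted rather than just the top prediction, a softmax over the logits is a straightforward extension. A sketch with a dummy logits tensor standing in for `model(**inputs).logits` from the example above:

```python
import torch

# Hypothetical logits for one image over the three classes (pico, nano, micro);
# in practice these come from model(**inputs).logits as shown above.
logits = torch.tensor([[0.2, 2.1, 0.4]])
probs = torch.softmax(logits, dim=-1)

labels = ["pico", "nano", "micro"]
for label, p in zip(labels, probs[0].tolist()):
    print(f"{label}: {p:.3f}")
```

The probabilities can be useful for flagging low-confidence images near the nano-micro boundary, where most misclassifications occur.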
## Training Data

This model was fine-tuned on a custom dataset of 17,700 Scanning Electron Microscopy (SEM) images, curated specifically for this project.
The images were selected to create a balanced dataset for the task of scale classification: an equal one-third split of images corresponding to the pico, nano, and micro scales (5,900 images per class).

The 17,700 images were then divided into:

- Training set: 12,000 images
- Validation set: 3,000 images
- Test set: 2,700 images

**Note on Availability**: This dataset is not publicly available at the moment but is planned for publication at a later stage. Please check this model card for future updates on data access.
config.json
ADDED
{
  "architecture": "vit_base_patch8_224",
  "architectures": [
    "TimmWrapperForImageClassification"
  ],
  "do_pooling": true,
  "dtype": "float32",
  "global_pool": "token",
  "initializer_range": 0.02,
  "label_names": [
    "pico",
    "nano",
    "micro"
  ],
  "model_args": null,
  "model_type": "timm_wrapper",
  "num_classes": 3,
  "num_features": 768,
  "pretrained_cfg": {
    "classifier": "head",
    "crop_mode": "center",
    "crop_pct": 0.9,
    "custom_load": false,
    "first_conv": "patch_embed.proj",
    "fixed_input_size": true,
    "input_size": [3, 224, 224],
    "interpolation": "bicubic",
    "mean": [0.5, 0.5, 0.5],
    "pool_size": null,
    "std": [0.5, 0.5, 0.5],
    "tag": "augreg2_in21k_ft_in1k"
  },
  "problem_type": "single_label_classification",
  "transformers_version": "4.56.0"
}
model.safetensors
ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:cae76191450cf0c7b6c4f177e44433046a2dc4a69fd1eecefb28d70a3dd77826
size 343254828