| --- |
| library_name: transformers |
| license: apache-2.0 |
| base_model: google/vit-base-patch16-224-in21k |
| tags: |
| - image-classification |
| - generated_from_trainer |
| datasets: |
| - imagefolder |
| metrics: |
| - accuracy |
| model-index: |
| - name: vit-base-rocks |
| results: |
| - task: |
| name: Image Classification |
| type: image-classification |
| dataset: |
| name: rocks |
| type: imagefolder |
| config: default |
| split: validation |
| args: default |
| metrics: |
| - name: Accuracy |
| type: accuracy |
| value: 0.7777777777777778 |
| --- |
| |
|
|
| # vit-base-rocks |
|
|
| This model is a fine-tuned version of [google/vit-base-patch16-224-in21k](https://huggingface.co/google/vit-base-patch16-224-in21k) on the rocks dataset. |
| It achieves the following results on the evaluation set: |
| - Loss: 0.7099 |
| - Accuracy: 0.7778 |
|
|
| ## Model description |
|
|
| This model is a fine-tuned version of Google's vit-base-patch16-224-in21k designed to identify geological hand samples. |
|
|
| ## Intended uses & limitations |
|
|
| The ViT is currently fine-tuned on 10 classes: |
|
|
| ['Andesite', 'Basalt', 'Chalk', 'Dolomite', 'Flint', 'Gneiss', 'Granite', 'Limestone', 'Sandstone', 'Slate'] |
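
The class list above is what the classification head predicts over. A minimal sketch of how such a list is usually wired into a ViT classifier with `transformers` follows; it assumes the label indices follow the alphabetical order shown (the card does not confirm the index order), and `build_model` is a hypothetical helper name, not part of this repository.

```python
# Class names copied from the model card; the index order is an assumption.
CLASS_NAMES = [
    "Andesite", "Basalt", "Chalk", "Dolomite", "Flint",
    "Gneiss", "Granite", "Limestone", "Sandstone", "Slate",
]

# Mappings between integer class ids and human-readable labels.
id2label = {i: name for i, name in enumerate(CLASS_NAMES)}
label2id = {name: i for i, name in id2label.items()}


def build_model(checkpoint="google/vit-base-patch16-224-in21k"):
    """Load the base ViT with a fresh 10-way classification head (hypothetical helper)."""
    # Imported lazily so the mappings above work without transformers installed.
    from transformers import ViTForImageClassification

    return ViTForImageClassification.from_pretrained(
        checkpoint,
        num_labels=len(CLASS_NAMES),
        id2label=id2label,
        label2id=label2id,
    )
```

Passing both mappings stores the label names in the model config, so predictions can be reported as rock names rather than integer ids.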
|
|
| Future iterations of the model will feature an expanded breadth of rock categories. |
|
|
|
|
| ## Training and evaluation data |
|
|
| The model reaches 77.8% accuracy on the validation split across the 10 rock classes, with some confusion between limestone and the other carbonate rocks. |
|
|
|  |
|
|
|
|
| ## Training procedure |
|
|
| 495 images of geological hand samples were selected and split 80:20 into training and test/validation sets. |
|
|
| Classes were roughly equally represented across the 495 samples. |
|
|
| ### Training hyperparameters |
|
|
| The following hyperparameters were used during training: |
| - learning_rate: 0.0002 |
| - train_batch_size: 64 |
| - eval_batch_size: 8 |
| - seed: 42 |
| - optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments) |
| - lr_scheduler_type: linear |
| - num_epochs: 25 |
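
For readers who want to reproduce a similar run, the hyperparameters listed above can be expressed as `TrainingArguments` keyword arguments. This is a sketch, not the original training script: it assumes a single device (so `train_batch_size` equals the per-device batch size), and both `make_training_args` and the `output_dir` value are hypothetical.

```python
# Hyperparameters copied from the list above, keyed by TrainingArguments names.
# Assumption: one device, so train_batch_size == per_device_train_batch_size.
TRAINING_KWARGS = dict(
    learning_rate=2e-4,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=25,
)


def make_training_args(output_dir="vit-base-rocks"):
    """Build TrainingArguments from the card's values (hypothetical helper)."""
    # Imported lazily so the dictionary above can be inspected without transformers.
    from transformers import TrainingArguments

    # output_dir is a placeholder path, not taken from the model card.
    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```

Settings the card does not list, such as the evaluation interval, are left at their defaults here and would need to be set separately.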
|
|
| ### Training results |
|
|
| | Training Loss | Epoch | Step | Validation Loss | Accuracy | |
| |:-------------:|:-------:|:----:|:---------------:|:--------:| |
| | 2.0408 | 1.4286 | 10 | 1.7371 | 0.6111 | |
| | 1.4489 | 2.8571 | 20 | 1.3254 | 0.7407 | |
| | 0.9469 | 4.2857 | 30 | 1.0768 | 0.7407 | |
| | 0.586 | 5.7143 | 40 | 0.9118 | 0.7778 | |
| | 0.3757 | 7.1429 | 50 | 0.9902 | 0.6852 | |
| | 0.2798 | 8.5714 | 60 | 0.8498 | 0.7778 | |
| | 0.2087 | 10.0 | 70 | 0.7939 | 0.7407 | |
| | 0.176 | 11.4286 | 80 | 0.8220 | 0.7222 | |
| | 0.1613 | 12.8571 | 90 | 0.7288 | 0.8148 | |
| | 0.1337 | 14.2857 | 100 | 0.7178 | 0.7963 | |
| | 0.1326 | 15.7143 | 110 | 0.7403 | 0.7778 | |
| | 0.119 | 17.1429 | 120 | 0.7099 | 0.7778 | |
| | 0.1193 | 18.5714 | 130 | 0.7626 | 0.7778 | |
| | 0.1227 | 20.0 | 140 | 0.7125 | 0.7963 | |
| | 0.1102 | 21.4286 | 150 | 0.7493 | 0.7963 | |
| | 0.1134 | 22.8571 | 160 | 0.7396 | 0.7963 | |
| | 0.1173 | 24.2857 | 170 | 0.7187 | 0.7963 | |
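
The table also allows a rough sanity check on dataset sizes. The sketch below uses only the epoch, step, and accuracy columns; the function names are hypothetical, and the result is an inference from rounded values, not a figure stated by the author.

```python
# (epoch, step, accuracy) rows copied from the training results table.
ROWS = [
    (1.4286, 10, 0.6111), (2.8571, 20, 0.7407), (4.2857, 30, 0.7407),
    (5.7143, 40, 0.7778), (7.1429, 50, 0.6852), (8.5714, 60, 0.7778),
    (10.0, 70, 0.7407), (11.4286, 80, 0.7222), (12.8571, 90, 0.8148),
    (14.2857, 100, 0.7963), (15.7143, 110, 0.7778), (17.1429, 120, 0.7778),
    (18.5714, 130, 0.7778), (20.0, 140, 0.7963), (21.4286, 150, 0.7963),
    (22.8571, 160, 0.7963), (24.2857, 170, 0.7187 and 0.7963),
]


def steps_per_epoch(rows):
    """Average step/epoch ratio, rounded to the nearest whole step."""
    return round(sum(step / epoch for epoch, step, _ in rows) / len(rows))


def smallest_eval_set_size(accuracies, max_n=1000, tol=5e-5):
    """Smallest n for which every accuracy is k/n (to 4 decimals) for some integer k."""
    for n in range(1, max_n + 1):
        if all(abs(round(a * n) / n - a) < tol for a in accuracies):
            return n
    return None


print(steps_per_epoch(ROWS))                           # optimizer steps per epoch
print(smallest_eval_set_size([r[2] for r in ROWS]))    # candidate validation set size
```

With these rows the ratio comes out at 7 steps per epoch, which at a batch size of 64 corresponds to between 385 and 448 training images, and the reported accuracies are all consistent with a validation set of 54 images (or a multiple of 54). At that size a single image shifts accuracy by about 1.85 percentage points, so the fluctuations between 0.74 and 0.81 in the later epochs amount to only a handful of images.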
|
|
|
|
| ### Framework versions |
|
|
| - Transformers 4.48.3 |
| - Pytorch 2.6.0 |
| - Datasets 3.3.0 |
| - Tokenizers 0.21.0 |
|
|