---
library_name: transformers
license: apache-2.0
base_model: google/vit-base-patch16-224-in21k
tags:
- image-classification
- generated_from_trainer
datasets:
- imagefolder
metrics:
- accuracy
model-index:
- name: vit-base-rocks
  results:
  - task:
      name: Image Classification
      type: image-classification
    dataset:
      name: rocks
      type: imagefolder
      config: default
      split: validation
      args: default
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.7777777777777778
---

# vit-base-rocks

This model is a fine-tuned version of [google/vit-base-patch16-224-in21k](https://huggingface.co/google/vit-base-patch16-224-in21k) on the rocks dataset.
It achieves the following results on the evaluation set:
- Loss: 0.7099
- Accuracy: 0.7778

## Model description

This model is a fine-tuned version of Google's vit-base-patch16-224-in21k, designed to identify geological hand samples.

## Intended uses & limitations

The ViT is currently fine-tuned on 10 classes:

['Andesite', 'Basalt', 'Chalk', 'Dolomite', 'Flint', 'Gneiss', 'Granite', 'Limestone', 'Sandstone', 'Slate']

Future iterations of the model will cover a broader range of rock categories.

## Training and evaluation data

The model performs relatively well across the 10 rock classes, with some confusion between limestone and the other carbonates.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/67b218c9f745d44676c938cb/zZXcIybZLvUtEKpb8Lk8u.png)

## Training procedure

495 images of geological hand samples were selected and divided using an 80:20 train-test/validation split. Classes were roughly equally represented across the 495 samples.

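
The 80:20 split described above can be sketched in a few lines. The card does not say how the split was drawn or how the held-out 20% was divided between test and validation, so this is only an illustration: it assumes a per-class (stratified) shuffle, reuses the seed of 42 listed in the hyperparameters, and builds a hypothetical list of placeholder file names rather than the real dataset.

```python
import random
from collections import defaultdict

CLASSES = ["Andesite", "Basalt", "Chalk", "Dolomite", "Flint",
           "Gneiss", "Granite", "Limestone", "Sandstone", "Slate"]


def stratified_split(samples, train_fraction=0.8, seed=42):
    """Split (path, label) pairs so every class keeps roughly the same ratio."""
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append((path, label))

    rng = random.Random(seed)
    train, held_out = [], []
    for label in sorted(by_label):  # sorted for a reproducible class order
        items = by_label[label]
        rng.shuffle(items)
        cut = round(len(items) * train_fraction)
        train.extend(items[:cut])
        held_out.extend(items[cut:])
    return train, held_out


# Hypothetical file list: 495 placeholder paths spread evenly over the 10 classes.
samples = [
    (f"{CLASSES[i % 10]}/img_{i:03d}.jpg", CLASSES[i % 10])
    for i in range(495)
]
train, held_out = stratified_split(samples)
print(len(train), len(held_out))
```

Splitting per class rather than over the whole pool keeps the roughly equal class representation in both partitions, which matters with only about 50 images per class.
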
### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 64
- eval_batch_size: 8
- seed: 42
- optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 25

### Training results

| Training Loss | Epoch   | Step | Validation Loss | Accuracy |
|:-------------:|:-------:|:----:|:---------------:|:--------:|
| 2.0408        | 1.4286  | 10   | 1.7371          | 0.6111   |
| 1.4489        | 2.8571  | 20   | 1.3254          | 0.7407   |
| 0.9469        | 4.2857  | 30   | 1.0768          | 0.7407   |
| 0.586         | 5.7143  | 40   | 0.9118          | 0.7778   |
| 0.3757        | 7.1429  | 50   | 0.9902          | 0.6852   |
| 0.2798        | 8.5714  | 60   | 0.8498          | 0.7778   |
| 0.2087        | 10.0    | 70   | 0.7939          | 0.7407   |
| 0.176         | 11.4286 | 80   | 0.8220          | 0.7222   |
| 0.1613        | 12.8571 | 90   | 0.7288          | 0.8148   |
| 0.1337        | 14.2857 | 100  | 0.7178          | 0.7963   |
| 0.1326        | 15.7143 | 110  | 0.7403          | 0.7778   |
| 0.119         | 17.1429 | 120  | 0.7099          | 0.7778   |
| 0.1193        | 18.5714 | 130  | 0.7626          | 0.7778   |
| 0.1227        | 20.0    | 140  | 0.7125          | 0.7963   |
| 0.1102        | 21.4286 | 150  | 0.7493          | 0.7963   |
| 0.1134        | 22.8571 | 160  | 0.7396          | 0.7963   |
| 0.1173        | 24.2857 | 170  | 0.7187          | 0.7963   |

### Framework versions

- Transformers 4.48.3
- Pytorch 2.6.0
- Datasets 3.3.0
- Tokenizers 0.21.0
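
As a quick consistency check, the Epoch and Step columns of the training results table can be compared with the stated dataset size and batch size. The sketch below assumes a single device, no gradient accumulation, and a partial final batch that is kept; none of these is stated in the card.

```python
import math

total_images = 495      # from the training procedure section
train_fraction = 0.8    # the 80 in the 80:20 split
train_batch_size = 64   # from the hyperparameter list

# In the results table, step 70 coincides with epoch 10.0.
steps_per_epoch = 70 / 10.0

# Steps per epoch expected from the dataset size, keeping the last partial batch.
train_images = round(total_images * train_fraction)
expected_steps = math.ceil(train_images / train_batch_size)

# Epoch value expected at the first logged step (step 10).
first_logged_epoch = round(10 / expected_steps, 4)

print(train_images, expected_steps, steps_per_epoch, first_logged_epoch)
```

Under these assumptions the roughly 396 training images give 7 steps per epoch, which matches the table: step 10 falls at epoch 1.4286 and step 70 at epoch 10.0.
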