maceythm committed on
Commit e4c6682 · verified · 1 Parent(s): 7538403

Update README.md

Files changed (1):
1. README.md +23 -16
README.md CHANGED
@@ -30,33 +30,43 @@ model-index:
      type: accuracy
      value: 0.9796296296296296
---
-
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
# vit-90-animals
-
- This model is a fine-tuned version of [google/vit-base-patch16-224](https://huggingface.co/google/vit-base-patch16-224) on the iamsouravbanerjee/animal-image-dataset-90-different-animals dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.0840
- - Accuracy: 0.9796
## Model description

- More information needed

## Intended uses & limitations

- More information needed

## Training and evaluation data

- More information needed

## Training procedure

### Training hyperparameters
-
The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 16
@@ -67,7 +77,6 @@ The following hyperparameters were used during training:
- num_epochs: 5

### Training results
-
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 1.2021 | 1.0 | 270 | 0.3500 | 0.9611 |
@@ -76,9 +85,7 @@ The following hyperparameters were used during training:
| 0.1706 | 4.0 | 1080 | 0.1409 | 0.9685 |
| 0.1678 | 5.0 | 1350 | 0.1373 | 0.9667 |

-
### Framework versions
-
- Transformers 4.50.0
- Pytorch 2.6.0+cu124
- Datasets 3.4.1
 
      type: accuracy
      value: 0.9796296296296296
---
+ ___

# vit-90-animals
+ ___

## Model description
+ This model is a fine-tuned Vision Transformer, based on [google/vit-base-patch16-224](https://huggingface.co/google/vit-base-patch16-224) and trained on the [animal image dataset](https://www.kaggle.com/datasets/iamsouravbanerjee/animal-image-dataset-90-different-animals) from Kaggle to classify images into 90 different animal species. It achieves high accuracy on unseen data and was trained with supervised learning. The model can be used for general-purpose image classification in the animal domain and serves as a comparison baseline for zero-shot classification models such as CLIP.

+ The model achieves the following results on the evaluation set:
+ - Loss: 0.0840
+ - Accuracy: 0.9796

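The long-form accuracy in the metadata (0.9796296296296296) is consistent with the 80/10/10 split described in the data section: 10% of the 5,400 images is 540 test examples, and the reported value corresponds to exactly 529 of them classified correctly. A quick arithmetic check (the 540-image test split is inferred, not stated next to the metric):

```python
# Sanity check: does 0.9796296296296296 correspond to a whole number of
# correct predictions on a 540-image test split (10% of 5,400)?
total_images = 5400
test_split = total_images // 10   # 540 images, per the 80/10/10 split
correct = 529                     # implied count of correct predictions

accuracy = correct / test_split
assert abs(accuracy - 0.9796296296296296) < 1e-12
print(f"{correct}/{test_split} = {accuracy:.4f}")
```
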
## Intended uses & limitations
+ ### Intended uses
+ - Animal image classification (educational, demo, prototyping)
+ - Benchmarking against zero-shot classification models
+ - Use in Gradio interfaces or image analysis tools

+ ### Limitations
+ - The model is limited to the 90 animal classes it was trained on
+ - It may not generalize well to image domains outside of its training distribution
+ - Performance can degrade with poor image quality or occlusions

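The uses above can be exercised through the standard `transformers` image-classification pipeline. A minimal sketch, assuming the model is published under the committer's namespace as `maceythm/vit-90-animals` (the repo id is an assumption, not stated in this card):

```python
def classify(image_path: str, top_k: int = 5):
    """Return the top-k predicted animal species for one image.

    Creating the pipeline downloads the model weights on first use,
    so the import and construction are kept inside the function.
    """
    from transformers import pipeline

    # Hypothetical repo id, inferred from the committer and model name.
    classifier = pipeline("image-classification", model="maceythm/vit-90-animals")
    return classifier(image_path, top_k=top_k)

# Usage: classify("fox.jpg") returns a list of {"label": ..., "score": ...} dicts.
```

The same callable drops directly into a Gradio `Interface` or any batch image-analysis script, per the intended uses listed above.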
## Training and evaluation data
+ The model was trained on a dataset containing 5,400 animal images categorized into 90 distinct classes. The dataset was obtained from Kaggle and, according to its creator, originally sourced from Google Images. The training/validation/test split was 80/10/10, and the label distribution is relatively balanced across classes.
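An 80/10/10 split of the 5,400 images works out to 4,320/540/540 examples. A minimal sketch of such a deterministic split (the file names and seed are illustrative, not taken from the card):

```python
import random

def split_dataset(items, seed=42):
    """Shuffle deterministically, then split 80/10/10 into train/val/test."""
    items = sorted(items)        # fix the order before shuffling
    rng = random.Random(seed)    # seeded RNG makes the split reproducible
    rng.shuffle(items)
    n = len(items)
    n_train = int(n * 0.8)
    n_val = int(n * 0.1)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test

# 5,400 images -> 4,320 train / 540 val / 540 test
train, val, test = split_dataset([f"img_{i}.jpg" for i in range(5400)])
print(len(train), len(val), len(test))  # 4320 540 540
```
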

+ Evaluation was conducted on the test split and compared to results from a zero-shot model (*openai/clip-vit-large-patch14*) using the same label set.
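The zero-shot baseline can be reproduced with the `transformers` zero-shot image-classification pipeline, passing the dataset's class names as candidate labels. A sketch under that assumption (the label list shown is an illustrative subset, not the full 90-class set):

```python
def zero_shot_baseline(image_path, labels):
    """Score one image against the animal class names with CLIP, zero-shot."""
    from transformers import pipeline  # heavy import, kept local

    clf = pipeline(
        "zero-shot-image-classification",
        model="openai/clip-vit-large-patch14",
    )
    return clf(image_path, candidate_labels=labels)

# Illustrative subset; the full 90-name list comes from the dataset's folders.
labels = ["antelope", "badger", "bat", "bear", "bee"]
# Usage: zero_shot_baseline("fox.jpg", labels) -> list of {"label", "score"} dicts.
```
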

## Training procedure
+ - Base model: *google/vit-base-patch16-224*
+ - Fine-tuning method: supervised training with the Hugging Face Trainer class
+ - Data augmentation: applied during training (e.g., RandomHorizontalFlip, ColorJitter)
+ - Training runs: ~5 epochs each, with and without augmentation
+ - Optimizer: AdamW (default settings)
+ - Evaluation metrics: accuracy, precision, and recall
+ - Best performance (no augmentation): 98.3% test accuracy

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 16

- num_epochs: 5
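The hyperparameters above map directly onto a `TrainingArguments` configuration for the Trainer mentioned in the procedure. A minimal config sketch showing only the values stated in this card; every other argument keeps the Trainer defaults (which include AdamW), and the output directory name is illustrative:

```python
from transformers import TrainingArguments

# Only the hyperparameters listed in the card; everything else stays at
# the Trainer defaults, including the AdamW optimizer.
training_args = TrainingArguments(
    output_dir="vit-90-animals",       # illustrative name
    learning_rate=3e-4,                # learning_rate: 0.0003
    per_device_train_batch_size=16,    # train_batch_size: 16
    num_train_epochs=5,                # num_epochs: 5
)
```
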
 
79
  ### Training results
 
80
  | Training Loss | Epoch | Step | Validation Loss | Accuracy |
81
  |:-------------:|:-----:|:----:|:---------------:|:--------:|
82
  | 1.2021 | 1.0 | 270 | 0.3500 | 0.9611 |
 
85
  | 0.1706 | 4.0 | 1080 | 0.1409 | 0.9685 |
86
  | 0.1678 | 5.0 | 1350 | 0.1373 | 0.9667 |
87
 
 
### Framework versions
- Transformers 4.50.0
- Pytorch 2.6.0+cu124
- Datasets 3.4.1