dennisjooo/emotion_classification

dennisjooo committed on Sep 13, 2023

Commit 9e21d9b · 1 Parent(s): 170a155

Update README.md

Files changed (1)

README.md +9 -5


@@ -36,10 +36,13 @@ model-index:
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-# emotion_classification
 This model is a fine-tuned version of [google/vit-base-patch16-224-in21k](https://huggingface.co/google/vit-base-patch16-224-in21k)
 on the [FastJobs/Visual_Emotional_Analysis](https://huggingface.co/datasets/FastJobs/Visual_Emotional_Analysis) dataset.
 It achieves the following results on the evaluation set:
 - Loss: 1.1031
 - Accuracy: 0.6312
@@ -49,19 +52,20 @@ It achieves the following results on the evaluation set:
 ## Model description
 The Vision Transformer base version trained on ImageNet-21K released by Google.
-Further details can be found on their [repo]((https://huggingface.co/google/vit-base-patch16-224-in21k))
 ## Training and evaluation data
 ### Data Split
-Used a 4:1 ratio for training and development sets and a seed of 42.
 ### Pre-processing Augmentation
 The main pre-processing phase for both training and evaluation includes:
-- Resizing to (224, 224, 3) because it uses ImageNet images to train the original model
-- Normalizing images using a mean and standard deviation of [0.5, 0.5, 0.5]
 Other than the aforementioned pre-processing, the training set was augmented using:
 - Random horizontal & vertical flip

 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
+# Emotion Classification
 This model is a fine-tuned version of [google/vit-base-patch16-224-in21k](https://huggingface.co/google/vit-base-patch16-224-in21k)
 on the [FastJobs/Visual_Emotional_Analysis](https://huggingface.co/datasets/FastJobs/Visual_Emotional_Analysis) dataset.
+In theory, the accuracy for a random guess on this dataset is 0.1429.
 It achieves the following results on the evaluation set:
 - Loss: 1.1031
 - Accuracy: 0.6312
 ## Model description
 The Vision Transformer base version trained on ImageNet-21K released by Google.
+Further details can be found on their [repo](https://huggingface.co/google/vit-base-patch16-224-in21k).
 ## Training and evaluation data
 ### Data Split
+Used a 4:1 ratio for training and development sets and a random seed of 42.
+Also used a seed of 42 for batching the data, completely unrelated lol.
 ### Pre-processing Augmentation
 The main pre-processing phase for both training and evaluation includes:
+- Bilinear interpolation to resize the image to (224, 224, 3) because it uses ImageNet images to train the original model
+- Normalizing images using a mean and standard deviation of [0.5, 0.5, 0.5] just like the original model
 Other than the aforementioned pre-processing, the training set was augmented using:
 - Random horizontal & vertical flip
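
The steps the updated README describes (a seeded 4:1 split, normalization with a mean and standard deviation of 0.5, random flips, and the random-guess baseline) can be sketched in plain Python. This is only an illustration under stated assumptions, not the author's training code: the function names, the flip probability of 0.5, and the nested-list image representation are all hypothetical, and the bilinear resize is left out because it would normally come from an image library. The seven-class count is inferred from the quoted 0.1429 (about 1/7) and is not stated in the card.

```python
import random


def split_indices(n, seed=42):
    """Shuffle indices with a fixed seed, then split them 4:1 (train:dev)."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    cut = n * 4 // 5  # four parts training, one part development
    return indices[:cut], indices[cut:]


def normalize(value, mean=0.5, std=0.5):
    """Map a pixel value in [0, 1] to [-1, 1] using mean 0.5 and std 0.5."""
    return (value - mean) / std


def hflip(image):
    """Mirror each row of a 2D image (list of rows) left to right."""
    return [row[::-1] for row in image]


def vflip(image):
    """Reverse the row order of a 2D image (top to bottom)."""
    return image[::-1]


def random_flip(image, rng, p=0.5):
    """Apply each flip independently with probability p (p is an assumption)."""
    if rng.random() < p:
        image = hflip(image)
    if rng.random() < p:
        image = vflip(image)
    return image


def random_guess_accuracy(num_classes):
    """Expected accuracy of uniform guessing over equally likely classes."""
    return 1 / num_classes


if __name__ == "__main__":
    train, dev = split_indices(10)
    print(len(train), len(dev))
    print(normalize(0.0), normalize(1.0))
    print(round(random_guess_accuracy(7), 4))  # seven classes is an inference
```

With mean and standard deviation both at 0.5, normalization is just a linear rescale of each channel from [0, 1] to [-1, 1], which is why the same values can be shared across all three channels.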