thomashk2001 committed on
Commit 05cc7e5 · verified · 1 Parent(s): d85e27f

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +70 -73
README.md CHANGED
@@ -1,78 +1,75 @@

  ---
- library_name: transformers
- license: apache-2.0
- base_model: google/vit-base-patch16-224-in21k
  tags:
- - generated_from_trainer
- metrics:
- - accuracy
- - precision
- - recall
- - f1
- model-index:
- - name: tom_and_jerry_vit_model
- results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- # tom_and_jerry_vit_model
-
- This model is a fine-tuned version of [google/vit-base-patch16-224-in21k](https://huggingface.co/google/vit-base-patch16-224-in21k) on an unknown dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.1530
- - Accuracy: 0.9562
- - Precision: 0.9526
- - Recall: 0.9587
- - F1: 0.9553
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 0.0002
- - train_batch_size: 64
- - eval_batch_size: 64
- - seed: 42
- - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- - lr_scheduler_type: linear
- - num_epochs: 5
-
- ### Training results
-
- | Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1 |
- |:-------------:|:------:|:----:|:---------------:|:--------:|:---------:|:------:|:------:|
- | 0.8223 | 0.4167 | 25 | 0.4506 | 0.8893 | 0.8939 | 0.8653 | 0.8742 |
- | 0.2676 | 0.8333 | 50 | 0.2195 | 0.9392 | 0.9343 | 0.9376 | 0.9356 |
- | 0.1896 | 1.25 | 75 | 0.1816 | 0.9526 | 0.9490 | 0.9504 | 0.9493 |
- | 0.1085 | 1.6667 | 100 | 0.1940 | 0.9380 | 0.9316 | 0.9381 | 0.9344 |
- | 0.1618 | 2.0833 | 125 | 0.1806 | 0.9477 | 0.9390 | 0.9493 | 0.9434 |
- | 0.0784 | 2.5 | 150 | 0.1582 | 0.9574 | 0.9524 | 0.9570 | 0.9546 |
- | 0.071 | 2.9167 | 175 | 0.1803 | 0.9416 | 0.9364 | 0.9413 | 0.9386 |
- | 0.0533 | 3.3333 | 200 | 0.1539 | 0.9611 | 0.9623 | 0.9600 | 0.9605 |
- | 0.0383 | 3.75 | 225 | 0.1446 | 0.9647 | 0.9654 | 0.9642 | 0.9646 |
- | 0.0264 | 4.1667 | 250 | 0.1619 | 0.9513 | 0.9447 | 0.9546 | 0.9488 |
- | 0.0227 | 4.5833 | 275 | 0.1524 | 0.9550 | 0.9498 | 0.9579 | 0.9531 |
- | 0.0343 | 5.0 | 300 | 0.1530 | 0.9562 | 0.9526 | 0.9587 | 0.9553 |
-
-
- ### Framework versions
-
- - Transformers 4.55.2
- - Pytorch 2.8.0+cu129
- - Datasets 4.0.0
- - Tokenizers 0.21.4
+
  ---
+ language:
+ - "es"
+ pretty_name: "Tom and Jerry Image Classification VIT Model"
  tags:
+ - "vision"
+ - "image-classification"
+ license: "cc0-1.0"
+ task_categories:
+ - "image-classification"
  ---

+ # ViT model fine-tuned for Tom and Jerry image classification
+ ## Base model: 'google/vit-base-patch16-224-in21k'
+ The ViT model was fine-tuned to classify Tom and Jerry images into the following categories:
+ - Tom: Tom is in the image
+ - Jerry: Jerry is in the image
+ - Tom_and_Jerry: both Tom and Jerry are in the image
+ - None: neither character is in the image
+
+ ## Methodology
+ - The model was fine-tuned on the thomashk2001/tom_and_jerry_dataset dataset, which is divided into train, eval, and testing splits.
+ - The splits are stratified, so every label is represented in the same proportion in each split.
+ - Images were preprocessed with ViTImageProcessor from the 'google/vit-base-patch16-224-in21k' checkpoint.
+ - The training arguments were:
+ ```python
+ training_args = TrainingArguments(
+     output_dir="./vit_tom_jerry_mdl",   # Checkpoints and saved model
+     per_device_train_batch_size=64,     # Train batch size
+     per_device_eval_batch_size=64,      # Eval batch size
+     num_train_epochs=5,                 # Number of epochs
+     learning_rate=2e-4,                 # Learning rate
+     eval_strategy="steps",              # Evaluate every eval_steps steps
+     eval_steps=25,                      # How often the model is evaluated
+     save_strategy="steps",              # Save a checkpoint every save_steps steps
+     save_steps=100,
+     save_total_limit=5,                 # Checkpoints kept, including the best model
+     load_best_model_at_end=True,        # Load the best checkpoint at the end
+     logging_dir="./logs",               # Log directory
+     logging_steps=10,                   # Logging interval
+     remove_unused_columns=False,
+     metric_for_best_model="f1",         # Metric used to select the best model
+     greater_is_better=True,             # Higher F1 is better
+ )
+ ```
+ - The model was fine-tuned with the arguments defined above, using early stopping with a patience of 3.
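The patience-3 stopping rule can be sketched in plain Python. This is a minimal illustration of the mechanism, not the actual transformers EarlyStoppingCallback; applied to the eval F1 series from the results table in this card, it shows why training ran through step 300 while the best checkpoint is the one from step 225:

```python
def early_stop_index(scores, patience=3):
    """Index of the last evaluation before stopping: stop once `patience`
    consecutive evaluations fail to improve on the best score so far."""
    best, since_best = float("-inf"), 0
    for i, score in enumerate(scores):
        if score > best:
            best, since_best = score, 0
        else:
            since_best += 1
            if since_best >= patience:
                return i          # training stops after this evaluation
    return len(scores) - 1        # patience never exhausted

# Eval F1 every 25 steps, taken from the results table in this card
f1 = [0.8742, 0.9356, 0.9493, 0.9344, 0.9434, 0.9546,
      0.9386, 0.9605, 0.9646, 0.9488, 0.9531, 0.9553]
steps = [25 * (i + 1) for i in range(len(f1))]

stop_step = steps[early_stop_index(f1)]
best_step = steps[f1.index(max(f1))]
print(stop_step, best_step)  # 300 225
```

Steps 250, 275, and 300 all fail to beat the F1 of 0.9646 from step 225, so patience is exhausted exactly at step 300 and `load_best_model_at_end` restores the step-225 checkpoint.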
+
+ ## Training results
+ | Step | Training Loss | Validation Loss | Accuracy | Precision | Recall | F1     |
+ |------|---------------|-----------------|----------|-----------|--------|--------|
+ | 25   | 0.8223        | 0.4506          | 0.8893   | 0.8939    | 0.8653 | 0.8742 |
+ | 50   | 0.2676        | 0.2195          | 0.9392   | 0.9343    | 0.9376 | 0.9356 |
+ | 75   | 0.1896        | 0.1816          | 0.9526   | 0.9490    | 0.9504 | 0.9493 |
+ | 100  | 0.1085        | 0.1940          | 0.9380   | 0.9316    | 0.9381 | 0.9344 |
+ | 125  | 0.1618        | 0.1806          | 0.9477   | 0.9390    | 0.9493 | 0.9434 |
+ | 150  | 0.0784        | 0.1582          | 0.9574   | 0.9524    | 0.9570 | 0.9546 |
+ | 175  | 0.0710        | 0.1803          | 0.9416   | 0.9364    | 0.9413 | 0.9386 |
+ | 200  | 0.0533        | 0.1539          | 0.9611   | 0.9623    | 0.9600 | 0.9605 |
+ | 225  | 0.0383        | 0.1446          | 0.9647   | 0.9654    | 0.9642 | 0.9646 |
+ | 250  | 0.0264        | 0.1619          | 0.9513   | 0.9447    | 0.9546 | 0.9488 |
+ | 275  | 0.0227        | 0.1524          | 0.9550   | 0.9498    | 0.9579 | 0.9531 |
+ | 300  | 0.0343        | 0.1530          | 0.9562   | 0.9526    | 0.9587 | 0.9553 |
+
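As a sanity check on the numbers above: 300 optimizer steps over 5 epochs is 60 steps per epoch, which at a batch size of 64 puts the training split at roughly 3,840 images. This is a rough estimate assuming a single device, no gradient accumulation, and full batches:

```python
# From the training arguments and results table in this card
total_steps, num_epochs, batch_size = 300, 5, 64

steps_per_epoch = total_steps // num_epochs         # 60
approx_train_images = steps_per_epoch * batch_size  # upper bound: the last batch may be partial
print(steps_per_epoch, approx_train_images)  # 60 3840
```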
+ ## Best model
+ - Step: 225
+ - Training Loss: 0.0383
+ - Validation Loss: 0.1446
+ - Accuracy: 0.9647
+ - Precision: 0.9654
+ - Recall: 0.9642
+ - F1 Score: 0.9646
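A minimal inference sketch using the transformers image-classification pipeline. The hub repo id below is an assumption based on the model name in this card (it is not stated explicitly); adjust it if the actual id differs:

```python
def classify(image_path):
    """Classify one image into the four categories described above.

    Assumes the checkpoint is published as
    'thomashk2001/tom_and_jerry_vit_model' (a guess based on this card's
    model name); change the repo id if it differs.
    """
    from transformers import pipeline  # deferred import: needs transformers + torch
    clf = pipeline("image-classification",
                   model="thomashk2001/tom_and_jerry_vit_model")
    return clf(image_path)  # [{"label": ..., "score": ...}, ...], best first

# The four labels this card describes:
LABELS = ["Tom", "Jerry", "Tom_and_Jerry", "None"]

# Example (downloads the checkpoint on first call):
# print(classify("some_frame.jpg"))
```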