Create REPORT2_Modification1_Training_Results_in_Colab
REPORT2_Modification1_Training_Results_in_Colab
ADDED
|
@@ -0,0 +1,74 @@
|
| 1 |
+
I applied a first, undisclosed modification to the vanilla PiT model without increasing the parameter count or changing the published hyperparameters. The training results ultimately improved, although the modified model lagged slightly behind the unmodified PiT model until epoch 12.
|
| 2 |
+
By epoch 23, my modified PiT model reached a validation accuracy above 95% (95.20%), exceeding the vanilla PiT model's final and highest validation accuracy of 94.75%, which it reached at epoch 25.
|
| 3 |
+
|
| 4 |
+
--- Configuration V2.0 (2D Positional Embeddings) ---
|
| 5 |
+
train_file: /content/sample_data/mnist_train_small.csv
|
| 6 |
+
test_file: /content/sample_data/mnist_test.csv
|
| 7 |
+
image_size: 28
|
| 8 |
+
num_classes: 10
|
| 9 |
+
embed_dim: 128
|
| 10 |
+
num_layers: 6
|
| 11 |
+
num_heads: 8
|
| 12 |
+
mlp_dim: 512
|
| 13 |
+
dropout: 0.1
|
| 14 |
+
batch_size: 128
|
| 15 |
+
epochs: 25
|
| 16 |
+
learning_rate: 0.0001
|
| 17 |
+
device: cuda
|
| 18 |
+
image_height: 28
|
| 19 |
+
image_width: 28
|
| 20 |
+
sequence_length: 784
|
| 21 |
+
------------------------------------------------------
|
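A few of the configuration values above constrain each other. The sketch below checks two of those relationships. It assumes one token per pixel, which is consistent with sequence_length: 784 for a 28 x 28 image but is not stated in the report, and it assumes standard multi-head attention that splits embed_dim evenly across heads. The helper name `derived_shapes` is hypothetical.

```python
# Values copied from the configuration dump above.
config = {
    "image_height": 28,
    "image_width": 28,
    "embed_dim": 128,
    "num_heads": 8,
    "sequence_length": 784,
}


def derived_shapes(cfg):
    """Return (token count, per-head dimension).

    Assumption: one token per pixel and an even split of embed_dim
    across attention heads. Neither is confirmed by the report.
    """
    tokens = cfg["image_height"] * cfg["image_width"]
    if cfg["embed_dim"] % cfg["num_heads"] != 0:
        raise ValueError("embed_dim must be divisible by num_heads")
    head_dim = cfg["embed_dim"] // cfg["num_heads"]
    return tokens, head_dim


tokens, head_dim = derived_shapes(config)
# The derived token count should agree with the logged sequence_length.
consistent = tokens == config["sequence_length"]
print(tokens, head_dim, consistent)
```

Under these assumptions every attention layer operates on 784 tokens, so attention cost is likely to dominate training time even at this small embedding width.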
| 22 |
+
|
| 23 |
+
Data loaded. Training on cuda.
|
| 24 |
+
Training samples: 17999
|
| 25 |
+
Validation samples: 2000
|
| 26 |
+
Test samples: 9999
|
| 27 |
+
|
| 28 |
+
Model V2.0 initialized with 1,198,858 trainable parameters (more efficient!).
|
| 29 |
+
|
| 30 |
+
--- Starting Training (V2 Model) ---
|
| 31 |
+
Epoch 01/25 | Train Loss: 2.2214 | Val Loss: 2.0350 | Val Acc: 25.30%
|
| 32 |
+
-> New best validation accuracy! Saving model state.
|
| 33 |
+
Epoch 02/25 | Train Loss: 1.9045 | Val Loss: 1.6590 | Val Acc: 41.20%
|
| 34 |
+
-> New best validation accuracy! Saving model state.
|
| 35 |
+
Epoch 03/25 | Train Loss: 1.4483 | Val Loss: 1.0200 | Val Acc: 66.15%
|
| 36 |
+
-> New best validation accuracy! Saving model state.
|
| 37 |
+
Epoch 04/25 | Train Loss: 0.9364 | Val Loss: 0.6892 | Val Acc: 78.35%
|
| 38 |
+
-> New best validation accuracy! Saving model state.
|
| 39 |
+
Epoch 05/25 | Train Loss: 0.7182 | Val Loss: 0.5592 | Val Acc: 81.80%
|
| 40 |
+
-> New best validation accuracy! Saving model state.
|
| 41 |
+
Epoch 06/25 | Train Loss: 0.6149 | Val Loss: 0.4834 | Val Acc: 83.95%
|
| 42 |
+
-> New best validation accuracy! Saving model state.
|
| 43 |
+
Epoch 07/25 | Train Loss: 0.5266 | Val Loss: 0.4221 | Val Acc: 85.45%
|
| 44 |
+
-> New best validation accuracy! Saving model state.
|
| 45 |
+
Epoch 08/25 | Train Loss: 0.4720 | Val Loss: 0.3855 | Val Acc: 88.10%
|
| 46 |
+
-> New best validation accuracy! Saving model state.
|
| 47 |
+
Epoch 09/25 | Train Loss: 0.4334 | Val Loss: 0.3243 | Val Acc: 89.50%
|
| 48 |
+
-> New best validation accuracy! Saving model state.
|
| 49 |
+
Epoch 10/25 | Train Loss: 0.3871 | Val Loss: 0.3084 | Val Acc: 90.65%
|
| 50 |
+
-> New best validation accuracy! Saving model state.
|
| 51 |
+
Epoch 11/25 | Train Loss: 0.3581 | Val Loss: 0.2933 | Val Acc: 90.80%
|
| 52 |
+
-> New best validation accuracy! Saving model state.
|
| 53 |
+
Epoch 12/25 | Train Loss: 0.3424 | Val Loss: 0.2722 | Val Acc: 91.80%
|
| 54 |
+
-> New best validation accuracy! Saving model state.
|
| 55 |
+
Epoch 13/25 | Train Loss: 0.3193 | Val Loss: 0.2628 | Val Acc: 92.35%
|
| 56 |
+
-> New best validation accuracy! Saving model state.
|
| 57 |
+
Epoch 14/25 | Train Loss: 0.2926 | Val Loss: 0.2497 | Val Acc: 92.25%
|
| 58 |
+
Epoch 15/25 | Train Loss: 0.2758 | Val Loss: 0.2301 | Val Acc: 93.30%
|
| 59 |
+
-> New best validation accuracy! Saving model state.
|
| 60 |
+
Epoch 16/25 | Train Loss: 0.2637 | Val Loss: 0.2207 | Val Acc: 93.40%
|
| 61 |
+
-> New best validation accuracy! Saving model state.
|
| 62 |
+
Epoch 17/25 | Train Loss: 0.2468 | Val Loss: 0.2166 | Val Acc: 93.25%
|
| 63 |
+
Epoch 18/25 | Train Loss: 0.2368 | Val Loss: 0.1998 | Val Acc: 93.70%
|
| 64 |
+
-> New best validation accuracy! Saving model state.
|
| 65 |
+
Epoch 19/25 | Train Loss: 0.2233 | Val Loss: 0.2028 | Val Acc: 93.70%
|
| 66 |
+
Epoch 20/25 | Train Loss: 0.2101 | Val Loss: 0.1858 | Val Acc: 94.45%
|
| 67 |
+
-> New best validation accuracy! Saving model state.
|
| 68 |
+
Epoch 21/25 | Train Loss: 0.2017 | Val Loss: 0.1762 | Val Acc: 94.55%
|
| 69 |
+
-> New best validation accuracy! Saving model state.
|
| 70 |
+
Epoch 22/25 | Train Loss: 0.1922 | Val Loss: 0.1755 | Val Acc: 94.85%
|
| 71 |
+
-> New best validation accuracy! Saving model state.
|
| 72 |
+
Epoch 23/25 | Train Loss: 0.1786 | Val Loss: 0.1668 | Val Acc: 95.20%
|
| 73 |
+
-> New best validation accuracy! Saving model state.
|
| 74 |
+
Epoch 24/25 | Train Loss: 0.1751 | Val Loss: 0.1628 | Val Acc: 95.10%
|
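The headline figures in the summary can be re-derived from the log itself. The following is a minimal sketch, not the training script: it parses epoch lines in the format shown above, finds the first epoch whose validation accuracy exceeds a threshold, and replays the "New best validation accuracy" decisions. The regular expression and the helper names are hypothetical, and the strict greater-than rule is inferred from the log (epoch 19 ties epoch 18 at 93.70% and triggers no save). Only epochs 18 to 24 are embedded here; pasting the full log into `LOG` analyses every epoch.

```python
import re

# Epoch lines copied verbatim from the log above (epochs 18 to 24 only).
LOG = """\
Epoch 18/25 | Train Loss: 0.2368 | Val Loss: 0.1998 | Val Acc: 93.70%
Epoch 19/25 | Train Loss: 0.2233 | Val Loss: 0.2028 | Val Acc: 93.70%
Epoch 20/25 | Train Loss: 0.2101 | Val Loss: 0.1858 | Val Acc: 94.45%
Epoch 21/25 | Train Loss: 0.2017 | Val Loss: 0.1762 | Val Acc: 94.55%
Epoch 22/25 | Train Loss: 0.1922 | Val Loss: 0.1755 | Val Acc: 94.85%
Epoch 23/25 | Train Loss: 0.1786 | Val Loss: 0.1668 | Val Acc: 95.20%
Epoch 24/25 | Train Loss: 0.1751 | Val Loss: 0.1628 | Val Acc: 95.10%
"""

EPOCH_RE = re.compile(
    r"Epoch (\d+)/\d+ \| Train Loss: ([\d.]+) \| "
    r"Val Loss: ([\d.]+) \| Val Acc: ([\d.]+)%"
)


def parse_log(text):
    """Return a list of (epoch, train_loss, val_loss, val_acc) tuples."""
    return [
        (int(m.group(1)), float(m.group(2)), float(m.group(3)), float(m.group(4)))
        for m in EPOCH_RE.finditer(text)
    ]


def first_epoch_above(rows, threshold):
    """Return the first epoch whose val_acc is strictly above threshold."""
    for epoch, _, _, acc in rows:
        if acc > threshold:
            return epoch
    return None


def new_best_epochs(rows):
    """Replay checkpointing: an epoch counts only if it strictly beats the best so far."""
    best, saved = float("-inf"), []
    for epoch, _, _, acc in rows:
        if acc > best:
            best = acc
            saved.append(epoch)
    return saved


rows = parse_log(LOG)
best_epoch, _, _, best_acc = max(rows, key=lambda r: r[3])
print(first_epoch_above(rows, 95.0), best_epoch, best_acc, new_best_epochs(rows))
```

Passing 94.75 as the threshold, the vanilla model's reported best, gives the epoch at which the modified model first overtakes that figure.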