Hybrid Model Results (HMDB51)
The hybrid model (TimeSformer + RetNet) was also trained on the HMDB51 dataset.
Because of Kaggle's runtime limit, training was interrupted at Epoch 12, so results are reported through Epoch 11. Training will be resumed at a later stage.
Training Results (Epochs 1-11)
| Epoch | Train Loss | Train Acc | Val Loss | Val Acc | F1 |
|---|---|---|---|---|---|
| 1 | 3.9312 | 0.0350 | 3.8099 | 0.0967 | 0.0855 |
| 2 | 3.6330 | 0.1791 | 3.2948 | 0.3654 | 0.3149 |
| 3 | 3.0989 | 0.3691 | 2.6927 | 0.5150 | 0.4579 |
| 4 | 2.6278 | 0.5048 | 2.2879 | 0.5869 | 0.5503 |
| 5 | 2.3198 | 0.5782 | 2.0438 | 0.6255 | 0.5961 |
| 6 | 2.1387 | 0.6194 | 1.9152 | 0.6242 | 0.6074 |
| 7 | 1.9876 | 0.6657 | 1.8369 | 0.6418 | 0.6308 |
| 8 | 1.9140 | 0.6936 | 1.7966 | 0.6359 | 0.6188 |
| 9 | 1.8539 | 0.7041 | 1.7619 | 0.6556 | 0.6426 |
| 10 | 1.8149 | 0.7244 | 1.7523 | 0.6614 | 0.6512 |
| 11 | 1.7270 | 0.7561 | 1.7543 | 0.6556 | 0.6472 |
Best Performance (Current)
- Validation Accuracy: 66.14%
- F1 Score: 0.6512
- Achieved at Epoch 10
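
The best epoch reported above can be recomputed from the table. The following is a minimal sketch, assuming that "best" means highest validation accuracy; the `history` list and its tuple layout are illustrative and not taken from the training code.

```python
# (epoch, val_acc, f1) copied from the training results table.
history = [
    (1, 0.0967, 0.0855),
    (2, 0.3654, 0.3149),
    (3, 0.5150, 0.4579),
    (4, 0.5869, 0.5503),
    (5, 0.6255, 0.5961),
    (6, 0.6242, 0.6074),
    (7, 0.6418, 0.6308),
    (8, 0.6359, 0.6188),
    (9, 0.6556, 0.6426),
    (10, 0.6614, 0.6512),
    (11, 0.6556, 0.6472),
]

# Assumption: the best checkpoint is the epoch with the highest validation accuracy.
best_epoch, best_val_acc, best_f1 = max(history, key=lambda row: row[1])

print(f"Best epoch: {best_epoch}, val acc: {best_val_acc:.2%}, F1: {best_f1:.4f}")
```

On this table, selecting by F1 instead of validation accuracy picks the same epoch.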
Training Status
- Training interrupted at Epoch 12 due to runtime limit
- Model will be resumed from best checkpoint
- Final performance may improve after full training
Efficiency
- Peak GPU memory: ~7.2 GB, about 25% lower than standard TimeSformer
- Faster training per epoch
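
As a rough check on the memory figures, the baseline peak they imply can be worked out. This sketch assumes the ~25% reduction is measured relative to the standard TimeSformer's own peak; the baseline value it prints is derived, not measured.

```python
hybrid_peak_gb = 7.2     # reported peak GPU memory of the hybrid model
relative_saving = 0.25   # reported reduction, assumed relative to the baseline peak

# If hybrid = baseline * (1 - saving), then baseline = hybrid / (1 - saving).
implied_baseline_gb = hybrid_peak_gb / (1.0 - relative_saving)

print(f"Implied TimeSformer peak: ~{implied_baseline_gb:.1f} GB")
```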
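Observations
- Validation accuracy improves through Epoch 10, with small dips at Epochs 6 and 8
- At Epoch 11, validation accuracy slips from 0.6614 to 0.6556 and validation loss stays flat (1.7523 to 1.7543) while training accuracy keeps rising (possible early convergence)
- Lower accuracy than on UCF101 (expected due to dataset complexity)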
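
The plateau noted above can be quantified from the table. The sketch below counts the epochs elapsed since the best validation accuracy and tracks the gap between training and validation accuracy. The helper name `epochs_since_best` is hypothetical and not part of the training code.

```python
# Accuracy columns copied from the training results table (Epochs 1-11).
train_acc = [0.0350, 0.1791, 0.3691, 0.5048, 0.5782, 0.6194,
             0.6657, 0.6936, 0.7041, 0.7244, 0.7561]
val_acc = [0.0967, 0.3654, 0.5150, 0.5869, 0.6255, 0.6242,
           0.6418, 0.6359, 0.6556, 0.6614, 0.6556]


def epochs_since_best(values):
    """Return how many epochs have passed since the maximum value was reached."""
    best_index = max(range(len(values)), key=values.__getitem__)
    return len(values) - 1 - best_index


# Train-minus-validation accuracy per epoch; a widening gap can hint at overfitting.
gaps = [round(t - v, 4) for t, v in zip(train_acc, val_acc)]
stalled = epochs_since_best(val_acc)

print(f"Epochs since best val acc: {stalled}")
print(f"Train/val accuracy gap, last two epochs: {gaps[-2]}, {gaps[-1]}")
```

With only one epoch past the best result, there is not enough evidence to call convergence. A patience of several epochs is a common choice before stopping early.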
Next Steps
- Resume training from Epoch 11 checkpoint
- Complete remaining epochs
- Compare final performance with baseline model
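
Resuming after a runtime interruption typically means restoring the model weights, the optimizer state, and the epoch counter. The following is a minimal PyTorch sketch, not the project's actual code. The checkpoint keys, the file name, the `nn.Linear` stand-in model, and the AdamW optimizer are all assumptions made for illustration.

```python
import os
import tempfile

import torch
from torch import nn


def save_checkpoint(path, model, optimizer, epoch, best_val_acc):
    """Store everything needed to continue training after an interruption."""
    torch.save(
        {
            "epoch": epoch,  # last fully completed epoch
            "model_state": model.state_dict(),
            "optimizer_state": optimizer.state_dict(),
            "best_val_acc": best_val_acc,
        },
        path,
    )


def load_checkpoint(path, model, optimizer):
    """Restore model and optimizer state; return (next_epoch, best_val_acc)."""
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["model_state"])
    optimizer.load_state_dict(ckpt["optimizer_state"])
    return ckpt["epoch"] + 1, ckpt["best_val_acc"]


# Stand-in for the hybrid model: a linear layer with 51 outputs (HMDB51 classes).
model = nn.Linear(8, 51)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Hypothetical checkpoint path, written at the end of Epoch 11.
ckpt_path = os.path.join(tempfile.mkdtemp(), "hybrid_epoch11.pt")
save_checkpoint(ckpt_path, model, optimizer, epoch=11, best_val_acc=0.6614)

# In a fresh session: rebuild the model and optimizer, then restore their state.
resumed_model = nn.Linear(8, 51)
resumed_optimizer = torch.optim.AdamW(resumed_model.parameters(), lr=1e-4)
start_epoch, best_val_acc = load_checkpoint(ckpt_path, resumed_model, resumed_optimizer)

print(f"Resuming at epoch {start_epoch} (best val acc so far: {best_val_acc})")
```

If a learning-rate scheduler or mixed-precision scaler is in use, its state would need to be saved and restored in the same way so that the resumed run continues the original schedule.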