---
license: mit
base_model: google/vivit-b-16x2
tags:
- cctv-surveillance
- video-classification
metrics:
- accuracy
- f1
- recall
- precision
---

## Model Performance

The model achieved the following scores on the evaluation dataset:

- **Accuracy**: 94.6%
- **F1 Score**: 94.3%
- **Recall**: 94.6%
- **Precision**: 94.5%

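
The reported recall matches the reported accuracy, which is what support-weighted averaging always produces, although the card does not state which averaging mode was used. The following is a minimal pure-Python sketch, assuming weighted averaging, of how the four metrics relate. The function name `weighted_metrics`, the class names, and the labels are hypothetical and are not the model's evaluation data.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)      # number of true samples per class
    predicted = Counter(y_pred)    # number of predictions per class
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = true_pos[label]
        p = tp / predicted[label] if predicted[label] else 0.0
        r = tp / count
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = count / n         # weight each class by its support
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    accuracy = sum(true_pos.values()) / n
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels for illustration only (assumed class names).
y_true = ["normal", "normal", "normal", "anomaly", "anomaly", "normal"]
y_pred = ["normal", "normal", "anomaly", "anomaly", "normal", "normal"]
scores = weighted_metrics(y_true, y_pred)
```

With weighted averaging, `scores["recall"]` equals `scores["accuracy"]` for any input, so the equality in the table is consistent with that mode but does not prove it.
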
## Intended Use & Limitations

- **Best for:** CCTV footage analysis, anomaly detection
- **Not suitable for:** Non-surveillance video types, real-time processing on resource-constrained hardware

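
ViViT classifies fixed-length clips rather than whole videos, so CCTV footage has to be reduced to a set number of frames before inference. The sketch below shows one common way to do that, evenly spaced sampling. It assumes 32 frames per clip, the default in the Transformers ViViT configuration; this card does not state the clip length or the sampling strategy used in training, and `sample_frame_indices` is a hypothetical helper.

```python
def sample_frame_indices(total_frames, num_frames=32):
    """Pick num_frames evenly spaced frame indices from a clip.

    num_frames=32 is an assumption, not a value stated in this card.
    Clips shorter than num_frames yield repeated indices.
    """
    if total_frames <= 0 or num_frames <= 0:
        raise ValueError("total_frames and num_frames must be positive")
    if num_frames == 1:
        return [0]
    # Spread the samples from the first frame to the last frame.
    step = (total_frames - 1) / (num_frames - 1)
    return [round(i * step) for i in range(num_frames)]


# Example: a 300-frame clip (about 10 seconds at 30 fps).
indices = sample_frame_indices(total_frames=300)
```

The selected frames can then be decoded and passed to the image processor that matches the checkpoint.
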
## Training Details

- **Learning Rate:** 5e-6
- **Batch Size:** 2
- **Optimizer:** Adam
- **Training Steps:** 4176

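
As a rough guide to the scale of training, the listed hyperparameters can be collected in one place and used to estimate how many clips were processed. This sketch assumes a single device and no gradient accumulation, neither of which is stated in the card; `TrainingSetup` is a hypothetical container, not part of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSetup:
    # Values listed in the card.
    learning_rate: float = 5e-6
    batch_size: int = 2
    optimizer: str = "Adam"
    training_steps: int = 4176
    # Assumptions, not stated in the card.
    gradient_accumulation_steps: int = 1
    num_devices: int = 1

    def samples_seen(self) -> int:
        """Total clips processed over the whole run."""
        return (self.training_steps * self.batch_size
                * self.gradient_accumulation_steps * self.num_devices)


setup = TrainingSetup()
total = setup.samples_seen()  # 4176 steps * 2 clips per step
```

Under those assumptions the run processed 8,352 clips in total, counting repeats across epochs.
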
## Framework Versions

- Transformers: 4.39.3
- PyTorch: 2.1.2
- Datasets: 2.18.0
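
To check whether a local environment matches these versions, something like the following sketch could be used. It relies only on the standard-library `importlib.metadata` module; `EXPECTED` and `find_mismatches` are hypothetical names, and `torch` is the PyPI distribution name for PyTorch.

```python
from importlib import metadata

# Versions listed in this card, keyed by PyPI distribution name.
EXPECTED = {"transformers": "4.39.3", "torch": "2.1.2", "datasets": "2.18.0"}


def find_mismatches(expected, get_version=metadata.version):
    """Return {package: installed_version_or_None} for every mismatch."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed
        # Local build tags such as "+cu121" are ignored when comparing.
        if installed is None or installed.split("+")[0] != wanted:
            mismatches[package] = installed
    return mismatches


mismatches = find_mismatches(EXPECTED)
for package, installed in mismatches.items():
    print(f"{package}: expected {EXPECTED[package]}, found {installed}")
```

An empty result means the installed versions match the ones listed above; other versions may still work but were not the ones reported here.
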