Commit 4581f62 (verified) · 0 parent(s)
Super-squash branch 'main' using huggingface_hub
- .gitattributes +35 -0
- LICENSE +25 -0
- README.md +100 -0
- swiftvit-150.pth +3 -0
.gitattributes
ADDED
|
@@ -0,0 +1,35 @@
|
| 1 |
+
*.7z filter=lfs diff=lfs merge=lfs -text
|
| 2 |
+
*.arrow filter=lfs diff=lfs merge=lfs -text
|
| 3 |
+
*.bin filter=lfs diff=lfs merge=lfs -text
|
| 4 |
+
*.bz2 filter=lfs diff=lfs merge=lfs -text
|
| 5 |
+
*.ckpt filter=lfs diff=lfs merge=lfs -text
|
| 6 |
+
*.ftz filter=lfs diff=lfs merge=lfs -text
|
| 7 |
+
*.gz filter=lfs diff=lfs merge=lfs -text
|
| 8 |
+
*.h5 filter=lfs diff=lfs merge=lfs -text
|
| 9 |
+
*.joblib filter=lfs diff=lfs merge=lfs -text
|
| 10 |
+
*.lfs.* filter=lfs diff=lfs merge=lfs -text
|
| 11 |
+
*.mlmodel filter=lfs diff=lfs merge=lfs -text
|
| 12 |
+
*.model filter=lfs diff=lfs merge=lfs -text
|
| 13 |
+
*.msgpack filter=lfs diff=lfs merge=lfs -text
|
| 14 |
+
*.npy filter=lfs diff=lfs merge=lfs -text
|
| 15 |
+
*.npz filter=lfs diff=lfs merge=lfs -text
|
| 16 |
+
*.onnx filter=lfs diff=lfs merge=lfs -text
|
| 17 |
+
*.ot filter=lfs diff=lfs merge=lfs -text
|
| 18 |
+
*.parquet filter=lfs diff=lfs merge=lfs -text
|
| 19 |
+
*.pb filter=lfs diff=lfs merge=lfs -text
|
| 20 |
+
*.pickle filter=lfs diff=lfs merge=lfs -text
|
| 21 |
+
*.pkl filter=lfs diff=lfs merge=lfs -text
|
| 22 |
+
*.pt filter=lfs diff=lfs merge=lfs -text
|
| 23 |
+
*.pth filter=lfs diff=lfs merge=lfs -text
|
| 24 |
+
*.rar filter=lfs diff=lfs merge=lfs -text
|
| 25 |
+
*.safetensors filter=lfs diff=lfs merge=lfs -text
|
| 26 |
+
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
|
| 27 |
+
*.tar.* filter=lfs diff=lfs merge=lfs -text
|
| 28 |
+
*.tar filter=lfs diff=lfs merge=lfs -text
|
| 29 |
+
*.tflite filter=lfs diff=lfs merge=lfs -text
|
| 30 |
+
*.tgz filter=lfs diff=lfs merge=lfs -text
|
| 31 |
+
*.wasm filter=lfs diff=lfs merge=lfs -text
|
| 32 |
+
*.xz filter=lfs diff=lfs merge=lfs -text
|
| 33 |
+
*.zip filter=lfs diff=lfs merge=lfs -text
|
| 34 |
+
*.zst filter=lfs diff=lfs merge=lfs -text
|
| 35 |
+
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
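To see which files in a repository these rules would route through Git LFS, a rough check can be sketched with Python's `fnmatch`. This is only an approximation: `fnmatch` does not implement the full gitattributes pattern syntax (for example `saved_model/**/*`), the pattern list is a subset of the rules above, and the helper name `tracked_by_lfs` is hypothetical.

```python
from fnmatch import fnmatch

# Subset of the patterns listed in .gitattributes above (simple suffix rules only).
LFS_PATTERNS = ["*.bin", "*.ckpt", "*.pt", "*.pth", "*.safetensors"]


def tracked_by_lfs(filename: str, patterns=LFS_PATTERNS) -> bool:
    """Return True if the file name matches any of the given LFS patterns."""
    return any(fnmatch(filename, pattern) for pattern in patterns)


print(tracked_by_lfs("swiftvit-150.pth"))  # True
print(tracked_by_lfs("README.md"))         # False
```

This matches the commit summary, where only `swiftvit-150.pth` is stored as a three-line LFS pointer.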
LICENSE
ADDED
|
@@ -0,0 +1,25 @@
|
| 1 |
+
SwiftVit/CNN Restricted Research License - BCS
|
| 2 |
+
|
| 3 |
+
Copyright (c) 2025 Ballinrobe Community School
|
| 4 |
+
|
| 5 |
+
PERMISSIONS:
|
| 6 |
+
- Use restricted to Ballinrobe Community School students, faculty, and staff only
|
| 7 |
+
- No commercial use
|
| 8 |
+
- No derivative works
|
| 9 |
+
|
| 10 |
+
PERMITTED USE:
|
| 11 |
+
- Analysis of swift feeding behavior for climate research
|
| 12 |
+
- BT Young Scientist Exhibition 2025 entry only
|
| 13 |
+
|
| 14 |
+
Any other use requires explicit written authorization from Ballinrobe Community School.
|
| 15 |
+
Unauthorized use will be prosecuted to the fullest extent of applicable law.
|
| 16 |
+
|
| 17 |
+
---
|
| 18 |
+
|
| 19 |
+
AUTOMATIC LICENSE TRANSITION:
|
| 20 |
+
|
| 21 |
+
Effective January 20, 2026, this work is automatically relicensed under:
|
| 22 |
+
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
|
| 23 |
+
https://creativecommons.org/licenses/by-nc-sa/4.0/
|
| 24 |
+
|
| 25 |
+
After this date, anyone may use, modify, and distribute this work under CC BY-NC-SA 4.0 terms.
|
README.md
ADDED
|
@@ -0,0 +1,100 @@
|
| 1 |
+
---
|
| 2 |
+
language:
|
| 3 |
+
- en
|
| 4 |
+
pipeline_tag: video-classification
|
| 5 |
+
tags:
|
| 6 |
+
- birds
|
| 7 |
+
- swifts
|
| 8 |
+
- MViTv2
|
| 9 |
+
- Ballinrobe
|
| 10 |
+
license: other
|
| 11 |
+
license_name: bcs-lcs
|
| 12 |
+
license_link: LICENSE
|
| 13 |
+
base_model:
|
| 14 |
+
- timm/mvitv2_small.fb_in1k
|
| 15 |
+
library_name: transformers
|
| 16 |
+
datasets:
|
| 17 |
+
- odinglynn/swift-150
|
| 18 |
+
---
|
| 19 |
+
|
| 20 |
+
# SwiftViT-150
|
| 21 |
+
|
| 22 |
+
MViT-v2 fine-tuned on 150 videos for common swift feeding behavior classification.
|
| 23 |
+
|
| 24 |
+
## Model
|
| 25 |
+
|
| 26 |
+
Fine-tuned `mvit_v2_s` (Kinetics-400 pretrained) on single-camera nestbox footage. It reaches ~87% validation accuracy (in controlled settings) and shows unexpected cross-camera generalization despite being trained on a single viewpoint and a minuscule dataset (150 samples).
|
| 27 |
+
|
| 28 |
+
## Usage
|
| 29 |
+
```python
|
| 30 |
+
import torch
|
| 31 |
+
import torchvision
|
| 32 |
+
|
| 33 |
+
model = torchvision.models.video.mvit_v2_s(weights=None)
|
| 34 |
+
model.head = torch.nn.Sequential(
|
| 35 |
+
    torch.nn.Dropout(0.5),
|
| 36 |
+
    torch.nn.Linear(768, 512),
|
| 37 |
+
    torch.nn.GELU(),
|
| 38 |
+
    torch.nn.Dropout(0.3),
|
| 39 |
+
    torch.nn.Linear(512, 3),
|
| 40 |
+
)
|
| 41 |
+
|
| 42 |
+
checkpoint = torch.load("swiftvit-150.pth", map_location="cpu")  # load on CPU so no GPU is required
|
| 43 |
+
model.load_state_dict(checkpoint["model_state_dict"])
|
| 44 |
+
model.eval()
|
| 45 |
+
|
| 46 |
+
# Inference
|
| 47 |
+
with torch.no_grad():
|
| 48 |
+
    video = load_video()  # placeholder, not defined here; shape [C, T, H, W], 16 frames at 224x224
|
| 49 |
+
    output = model(video.unsqueeze(0))  # add batch dimension: [1, C, T, H, W]
|
| 50 |
+
    prediction = torch.argmax(output, dim=1)
|
| 51 |
+
# 0: feeding, 1: possible_feeding, 2: not_feeding
|
| 52 |
+
```
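The `load_video()` call above is a placeholder that this README does not define. One piece of it, choosing 16 evenly spaced frame indices from a longer clip, might be sketched as follows. The helper name `sample_frame_indices` and the uniform-sampling strategy are assumptions for illustration; the sampling actually used in training is not stated here.

```python
def sample_frame_indices(total_frames: int, num_samples: int = 16) -> list[int]:
    """Pick `num_samples` evenly spaced frame indices from a clip.

    Hypothetical helper: uniform sampling is an assumption, not the
    documented training pipeline.
    """
    if total_frames <= 0 or num_samples <= 0:
        raise ValueError("total_frames and num_samples must be positive")
    if num_samples == 1:
        return [0]
    # Spread indices from the first to the last frame (inclusive);
    # clips shorter than num_samples simply repeat frames.
    step = (total_frames - 1) / (num_samples - 1)
    return [round(i * step) for i in range(num_samples)]


indices = sample_frame_indices(total_frames=300)
```

The selected frames would still need decoding, resizing to 224x224, normalization, and stacking into a `[C, T, H, W]` tensor before being passed to the model.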
|
| 53 |
+
|
| 54 |
+
## Architecture
|
| 55 |
+
|
| 56 |
+
- Base: MViT-v2 Small (~34.5M params in torchvision's `mvit_v2_s`)
|
| 57 |
+
- Head: Custom 768→512→3 with dropout
|
| 58 |
+
- Input: 16 frames @ 224x224
|
| 59 |
+
- Classes: 3 (feeding, possible_feeding, not_feeding)
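As a quick check on the custom head listed above, its parameter count can be worked out by hand. The sketch below only does that arithmetic (weights plus biases for each linear layer; dropout and GELU add no parameters) and says nothing about the backbone. The function name `linear_params` is illustrative.

```python
def linear_params(in_features: int, out_features: int) -> int:
    """Parameters in a fully connected layer: weight matrix plus bias vector."""
    return in_features * out_features + out_features


# 768 -> 512 -> 3, as in the Usage snippet
head_params = linear_params(768, 512) + linear_params(512, 3)
print(head_params)  # 395267
```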
|
| 60 |
+
|
| 61 |
+
## Training
|
| 62 |
+
|
| 63 |
+
- 120 train / 30 val samples
|
| 64 |
+
- Batch size: 4
|
| 65 |
+
- Optimizer: AdamW (lr=1e-4, wd=0.05)
|
| 66 |
+
- Scheduler: CosineAnnealingWarmRestarts
|
| 67 |
+
- Mixed precision training on H100
|
| 68 |
+
- Early stopping: 40 epoch patience
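To make the schedule concrete, the sketch below evaluates the cosine-annealing-with-warm-restarts formula at integer epochs, starting from the listed learning rate of 1e-4. The cycle length `t_0 = 10`, a fixed cycle length (no cycle growth), and `eta_min = 0` are illustrative assumptions; the values used in training are not given in this README.

```python
import math


def cosine_warm_restart_lr(epoch: int, base_lr: float = 1e-4,
                           eta_min: float = 0.0, t_0: int = 10) -> float:
    """Learning rate at an integer epoch for cosine annealing with warm restarts.

    Assumes every cycle has the same length `t_0`; `t_0` and `eta_min`
    are illustrative values, not taken from the model card.
    """
    t_cur = epoch % t_0  # position inside the current cycle
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * t_cur / t_0))


steps_per_epoch = math.ceil(120 / 4)  # 120 training samples, batch size 4
schedule = [cosine_warm_restart_lr(epoch) for epoch in range(21)]
```

Under these assumptions the rate decays from 1e-4 toward zero over ten epochs, then jumps back to 1e-4 at epochs 10 and 20, with 30 optimizer steps per epoch.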
|
| 69 |
+
|
| 70 |
+
## Performance
|
| 71 |
+
|
| 72 |
+
- Train accuracy: 100%
|
| 73 |
+
- Val accuracy: 87%
|
| 74 |
+
- Unexpected cross-camera generalization observed
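With only 30 validation samples, the uncertainty around the 87% figure is worth quantifying. The sketch below computes a 95% Wilson score interval, assuming the reported accuracy corresponds to 26 of 30 correct (the closest count to 87%; the exact count is not stated here).

```python
import math


def wilson_interval(correct: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives ~95%)."""
    p = correct / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width


# 26/30 is an assumed count, chosen because 26/30 is approximately 86.7%.
low, high = wilson_interval(correct=26, total=30)
print(f"{low:.3f} to {high:.3f}")
```

The interval spans roughly 24 percentage points, so small differences in validation accuracy on this split should be read with caution.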
|
| 75 |
+
|
| 76 |
+
## Dataset
|
| 77 |
+
|
| 78 |
+
Trained on [swift-150](https://huggingface.co/datasets/odinglynn/swift-150) - 150 videos from GABLE nestbox camera (Ireland, 2020-2025).
|
| 79 |
+
|
| 80 |
+
## Context
|
| 81 |
+
|
| 82 |
+
Part of climate research correlating swift feeding patterns with weather data at terabyte scale. Ballinrobe Community School entry for REDACTED.
|
| 83 |
+
|
| 84 |
+
## Citation
|
| 85 |
+
|
| 86 |
+
If you reference this work, cite:
|
| 87 |
+
```bibtex
|
| 88 |
+
@misc{swift150bcs,
|
| 89 |
+
title={Swift-150: A Dataset for Common Swift Feeding Behavior Analysis},
|
| 90 |
+
author={Odin Glynn-Martin and Culan O'Meara and Anas Rashid and Shayden D'Souza and Pádraig Foley and Mark Lally},
|
| 91 |
+
year={2025},
|
| 92 |
+
institution={Ballinrobe Community School},
|
| 93 |
+
url={https://ballinrobecommunityschool.ie},
|
| 94 |
+
note={REDACTED - Entry 2025}
|
| 95 |
+
}
|
| 96 |
+
```
|
| 97 |
+
|
| 98 |
+
## License
|
| 99 |
+
|
| 100 |
+
Proprietary. See LICENSE for restrictions.
|
swiftvit-150.pth
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:a186210e7b6f067155e3a1642f4891386ebc7c6eeafface48775c7d056b2191a
|
| 3 |
+
size 415996750
|
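The three lines added for `swiftvit-150.pth` are a Git LFS pointer, not the weights themselves. A minimal sketch of reading such a pointer and converting the recorded size to MiB follows; the helper name `parse_lfs_pointer` is hypothetical, and the parsing assumes the simple `key value` layout shown above.

```python
def parse_lfs_pointer(text: str) -> dict[str, str]:
    """Split each 'key value' line of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:a186210e7b6f067155e3a1642f4891386ebc7c6eeafface48775c7d056b2191a
size 415996750
"""

info = parse_lfs_pointer(pointer)
size_mib = int(info["size"]) / 2**20  # bytes to MiB
print(f"{size_mib:.1f} MiB")
```

The `oid` field is the SHA-256 of the actual file, so a downloaded checkpoint can be verified against it.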