donDEN05 cermakvo committed on
Commit 00080a8 · 0 parent(s)

Duplicate from BVRA/MegaDescriptor-L-384

Co-authored-by: V Cermak <cermakvo@users.noreply.huggingface.co>

Files changed (4)
  1. .gitattributes +35 -0
  2. README.md +63 -0
  3. config.json +36 -0
  4. pytorch_model.bin +3 -0
.gitattributes ADDED
@@ -0,0 +1,35 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
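These rules route matching files through Git LFS, which is why `pytorch_model.bin` is recorded in this commit as a three-line pointer rather than as binary content. As a rough sketch of how the simple basename patterns apply, the following uses Python's `fnmatch` on a hand-copied subset of the rules. The helper name `is_lfs_tracked` and the chosen subset are illustrative assumptions, and `fnmatch` only approximates gitattributes matching (path patterns such as `saved_model/**/*` are not handled).

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Subset of the basename patterns from the .gitattributes rules (illustrative).
LFS_PATTERNS = ["*.bin", "*.pt", "*.pth", "*.safetensors", "*.onnx", "*.tar.*", "*tfevents*"]


def is_lfs_tracked(path: str) -> bool:
    """Approximate check: does the file's basename match any LFS pattern?"""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in LFS_PATTERNS)


# The four files touched by this commit.
for changed in [".gitattributes", "README.md", "config.json", "pytorch_model.bin"]:
    print(f"{changed}: {'LFS' if is_lfs_tracked(changed) else 'regular Git'}")
```

For an authoritative answer inside a real checkout, `git check-attr filter -- <path>` reports the filter Git itself would apply.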
README.md ADDED
@@ -0,0 +1,63 @@
+ ---
+ tags:
+ - image-classification
+ - ecology
+ - animals
+ - re-identification
+ library_name: wildlife-datasets
+ license: cc-by-nc-4.0
+ ---
+ # Model card for MegaDescriptor-L-384
+
+ A Swin-L image feature model, pre-trained with supervision on animal re-identification datasets.
+
+
+ ## Model Details
+ - **Model Type:** Animal re-identification / feature backbone
+ - **Model Stats:**
+   - Params (M): 228.8
+   - Image size: 384 x 384
+   - Architecture: swin_large_patch4_window12_384
+ - **Paper:** [WildlifeDatasets: An Open-Source Toolkit for Animal Re-Identification](https://openaccess.thecvf.com/content/WACV2024/html/Cermak_WildlifeDatasets_An_Open-Source_Toolkit_for_Animal_Re-Identification_WACV_2024_paper.html)
+ - **Related Papers:**
+   - [Swin Transformer: Hierarchical Vision Transformer using Shifted Windows](https://arxiv.org/abs/2103.14030)
+   - [DINOv2: Learning Robust Visual Features without Supervision](https://arxiv.org/pdf/2304.07193.pdf)
+ - **Pretrain Dataset:** All available re-identification datasets --> https://github.com/WildlifeDatasets/wildlife-datasets
+
+ ## Model Usage
+ ### Image Embeddings
+ ```python
+
+ import timm
+ import torch
+ import torchvision.transforms as T
+
+ from PIL import Image
+ from urllib.request import urlopen
+
+ model = timm.create_model("hf-hub:BVRA/MegaDescriptor-L-384", pretrained=True)
+ model = model.eval()
+
+ train_transforms = T.Compose([T.Resize(size=(384, 384)),
+                               T.ToTensor(),
+                               T.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])])
+
+ img = Image.open(urlopen(
+     'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
+ ))
+
+ output = model(train_transforms(img).unsqueeze(0)) # output is (batch_size, num_features) shaped tensor
+ # output is a (1, num_features) shaped tensor
+ ```
+
+ ## Citation
+
+ ```bibtex
+ @inproceedings{vcermak2024wildlifedatasets,
+   title={WildlifeDatasets: An open-source toolkit for animal re-identification},
+   author={{\v{C}}erm{\'a}k, Vojt{\v{e}}ch and Picek, Lukas and Adam, Luk{\'a}{\v{s}} and Papafitsoros, Kostas},
+   booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
+   pages={5953--5963},
+   year={2024}
+ }
+ ```
config.json ADDED
@@ -0,0 +1,36 @@
+ {
+   "architecture": "swin_large_patch4_window12_384",
+   "num_classes": 0,
+   "num_features": 1536,
+   "global_pool": "avg",
+   "pretrained_cfg": {
+     "custom_load": false,
+     "input_size": [
+       3,
+       384,
+       384
+     ],
+     "fixed_input_size": true,
+     "interpolation": "bicubic",
+     "crop_pct": 0.9,
+     "crop_mode": "center",
+     "mean": [
+       0.485,
+       0.456,
+       0.406
+     ],
+     "std": [
+       0.229,
+       0.224,
+       0.225
+     ],
+     "num_classes": 0,
+     "pool_size": [
+       7,
+       7
+     ],
+     "first_conv": "patch_embed.proj",
+     "license": "mit",
+     "classifier": "head"
+   }
+ }
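To make the preprocessing fields concrete, here is a small sketch that reads the `pretrained_cfg` values and applies the per-channel normalization `(x - mean) / std` by hand. The JSON literal is an abridged copy of the config in this commit, and `normalize_pixel` is a hypothetical helper, not part of timm. Note that the README snippet in this commit normalizes with 0.5 for both mean and std, while this config records the ImageNet statistics; the diff does not say which of the two was used in training.

```python
import json

# Abridged copy of config.json from this commit (only the fields used below).
CONFIG_JSON = """
{
  "architecture": "swin_large_patch4_window12_384",
  "num_features": 1536,
  "pretrained_cfg": {
    "input_size": [3, 384, 384],
    "mean": [0.485, 0.456, 0.406],
    "std": [0.229, 0.224, 0.225]
  }
}
"""

config = json.loads(CONFIG_JSON)
cfg = config["pretrained_cfg"]
channels, height, width = cfg["input_size"]


def normalize_pixel(rgb, mean, std):
    """Normalize one RGB pixel (values in [0, 1]) channel by channel."""
    return [(value - m) / s for value, m, s in zip(rgb, mean, std)]


# A pure white pixel, normalized with the statistics stored in the config.
white = normalize_pixel([1.0, 1.0, 1.0], cfg["mean"], cfg["std"])
print(f"input: {channels}x{height}x{width}, embedding size: {config['num_features']}")
print("normalized white pixel:", [round(v, 3) for v in white])
```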
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ccfe757f50f7984a115ffe00921cd4c09e260e645215463b23814586040227a3
+ size 1935177938
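The three added lines are a Git LFS pointer, not the weights themselves: they record the spec version, the SHA-256 of the real file, and its size in bytes. A minimal sketch of parsing such a pointer and checking a downloaded file against it follows. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, the sketch assumes the `sha256` algorithm shown in this pointer, and the local path passed to `matches_pointer` is up to the caller.

```python
import hashlib
import os

# Pointer content copied from the pytorch_model.bin diff in this commit.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:ccfe757f50f7984a115ffe00921cd4c09e260e645215463b23814586040227a3
size 1935177938
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a pointer file into a dict."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Check a local file's size and SHA-256 against the pointer."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    sha = hashlib.sha256()
    with open(path, "rb") as handle:
        # Hash in chunks so a multi-gigabyte file is never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            sha.update(chunk)
    return sha.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(f"{pointer['algorithm']} digest, {pointer['size'] / 1e9:.2f} GB")
```

Checking the size first is a cheap way to reject a truncated download before hashing roughly 1.9 GB of data.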