Makki2104 committed
Commit 0ba50d5 · verified · 1 Parent(s): d2f91cd

Add files using upload-large-folder tool
convnextv2_huge.dbv4-full/README.md ADDED
@@ -0,0 +1,148 @@
---
tags:
- image-classification
- timm
- transformers
- animetimm
- dghs-imgutils
library_name: timm
license: gpl-3.0
datasets:
- animetimm/danbooru-wdtagger-v4-w640-ws-full
base_model:
- timm/convnextv2_huge.fcmae_ft_in22k_in1k_512
---

# Anime Tagger convnextv2_huge.dbv4-full

## Model Details

- **Model Type:** Multilabel image classification / feature backbone
- **Model Stats:**
  - Params: 692.6M
  - FLOPs / MACs: 1.2T / 600.4G
  - Image size: train = 512 x 512, test = 512 x 512
- **Dataset:** [animetimm/danbooru-wdtagger-v4-w640-ws-full](https://huggingface.co/datasets/animetimm/danbooru-wdtagger-v4-w640-ws-full)
  - Tags Count: 12476
    - General (#0) Tags Count: 9225
    - Character (#4) Tags Count: 3247
    - Rating (#9) Tags Count: 4

## Results

| # | Macro@0.40 (F1/MCC/P/R) | Micro@0.40 (F1/MCC/P/R) | Macro@Best (F1/P/R) |
|:----------:|:-----------------------------:|:-----------------------------:|:---------------------:|
| Validation | 0.580 / 0.584 / 0.626 / 0.556 | 0.697 / 0.696 / 0.692 / 0.701 | --- |
| Test | 0.580 / 0.584 / 0.627 / 0.556 | 0.697 / 0.696 / 0.693 / 0.702 | 0.611 / 0.612 / 0.630 |

* `Macro/Micro@0.40` denotes metrics computed at a fixed threshold of 0.40.
* `Macro@Best` denotes the mean of metrics computed with per-tag thresholds, each chosen to maximize that tag's F1 score.

## Thresholds

| Category | Name | Alpha | Threshold | Micro@Thr (F1/P/R) | Macro@0.40 (F1/P/R) | Macro@Best (F1/P/R) |
|:----------:|:---------:|:-------:|:-----------:|:---------------------:|:---------------------:|:---------------------:|
| 0 | general | 1 | 0.38 | 0.685 / 0.673 / 0.697 | 0.457 / 0.514 / 0.430 | 0.494 / 0.490 / 0.524 |
| 4 | character | 1 | 0.51 | 0.946 / 0.962 / 0.930 | 0.930 / 0.948 / 0.915 | 0.943 / 0.959 / 0.930 |
| 9 | rating | 1 | 0.24 | 0.828 / 0.790 / 0.871 | 0.833 / 0.823 / 0.843 | 0.835 / 0.812 / 0.861 |

* `Micro@Thr` denotes metrics computed at the category-level suggested thresholds listed above.
* `Macro@0.40` denotes metrics computed at a fixed threshold of 0.40.
* `Macro@Best` denotes metrics computed with per-tag thresholds, each chosen to maximize that tag's F1 score.

The tag-level thresholds are available in [selected_tags.csv](https://huggingface.co/animetimm/convnextv2_huge.dbv4-full/resolve/main/selected_tags.csv).

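Per-tag thresholding as described above can be sketched in a few lines. The tag names and threshold values below are synthetic stand-ins for rows of `selected_tags.csv`, not its real contents:

```python
# Hypothetical scores and per-tag "best" thresholds (stand-ins for the
# 'name' and 'best_threshold' columns of selected_tags.csv).
predictions = {'1girl': 0.97, 'smile': 0.41, 'holding': 0.30}
best_threshold = {'1girl': 0.50, 'smile': 0.38, 'holding': 0.33}

# Keep every tag whose score meets its own threshold.
kept = {tag: score for tag, score in predictions.items()
        if score >= best_threshold[tag]}
print(kept)  # {'1girl': 0.97, 'smile': 0.41}
```

The point of per-tag thresholds is that a score of 0.41 can be a confident positive for one tag and a clear negative for another.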
## How to Use

We provide a sample image for the code samples below; you can find it [here](https://huggingface.co/animetimm/convnextv2_huge.dbv4-full/blob/main/sample.webp).

### Use TIMM And Torch

Install [dghs-imgutils](https://github.com/deepghs/imgutils), [timm](https://github.com/huggingface/pytorch-image-models) and the other necessary requirements with the following command:

```shell
pip install 'dghs-imgutils>=0.19.0' torch huggingface_hub timm pillow pandas
```

After that, you can load this model with the timm library and use it for training, validation and testing with the following code:

```python
import json

import pandas as pd
import torch
from huggingface_hub import hf_hub_download
from imgutils.data import load_image
from imgutils.preprocess import create_torchvision_transforms
from timm import create_model

repo_id = 'animetimm/convnextv2_huge.dbv4-full'
model = create_model(f'hf-hub:{repo_id}', pretrained=True)
model.eval()

with open(hf_hub_download(repo_id=repo_id, repo_type='model', filename='preprocess.json'), 'r') as f:
    preprocessor = create_torchvision_transforms(json.load(f)['test'])
# Compose(
#     PadToSize(size=(512, 512), interpolation=bilinear, background_color=white)
#     Resize(size=(512, 512), interpolation=bicubic, max_size=None, antialias=True)
#     CenterCrop(size=[512, 512])
#     MaybeToTensor()
#     Normalize(mean=tensor([0.4850, 0.4560, 0.4060]), std=tensor([0.2290, 0.2240, 0.2250]))
# )

image = load_image('https://huggingface.co/animetimm/convnextv2_huge.dbv4-full/resolve/main/sample.webp')
input_ = preprocessor(image).unsqueeze(0)
# input_, shape: torch.Size([1, 3, 512, 512]), dtype: torch.float32
with torch.no_grad():
    output = model(input_)
    prediction = torch.sigmoid(output)[0]
# output, shape: torch.Size([1, 12476]), dtype: torch.float32
# prediction, shape: torch.Size([12476]), dtype: torch.float32

df_tags = pd.read_csv(
    hf_hub_download(repo_id=repo_id, repo_type='model', filename='selected_tags.csv'),
    keep_default_na=False,
)
tags = df_tags['name']
mask = prediction.numpy() >= df_tags['best_threshold']
print(dict(zip(tags[mask].tolist(), prediction[mask].tolist())))
# {'sensitive': 0.9900546073913574,
#  '1girl': 0.9986221790313721,
#  'solo': 0.9894072413444519,
#  'looking_at_viewer': 0.8689708113670349,
#  'blush': 0.8729097843170166,
#  'smile': 0.9395995736122131,
#  'short_hair': 0.6831153631210327,
#  'long_sleeves': 0.6779903173446655,
#  'brown_hair': 0.802174985408783,
#  'holding': 0.3276722729206085,
#  'dress': 0.6280677318572998,
#  'sitting': 0.6450996994972229,
#  'purple_eyes': 0.8072393536567688,
#  'flower': 0.9524818062782288,
#  'braid': 0.8764650225639343,
#  'outdoors': 0.47000938653945923,
#  'tears': 0.9879008531570435,
#  'floral_print': 0.5994200706481934,
#  'crying': 0.34614139795303345,
#  'plant': 0.3870095908641815,
#  'crown_braid': 0.7048561573028564,
#  'happy_tears': 0.759681224822998,
#  'pavement': 0.2870482802391052,
#  'wiping_tears': 0.9898664951324463,
#  'brick_floor': 0.5737900137901306}
```

## Citation

```
@misc{convnextv2_huge_dbv4_full,
  title = {Anime Tagger convnextv2_huge.dbv4-full},
  author = {narugo1992 and Deep Generative anime Hobbyist Syndicate (DeepGHS)},
  year = {2025},
  howpublished = {\url{https://huggingface.co/animetimm/convnextv2_huge.dbv4-full}},
  note = {A large-scale anime-style image classification model based on the convnextv2_huge architecture for multi-label tagging with 12476 tags, trained on the anime dataset dbv4-full (\url{https://huggingface.co/datasets/animetimm/danbooru-wdtagger-v4-w640-ws-full}). Model parameters: 692.6M, FLOPs: 1.2T, input resolution: 512×512.},
  license = {gpl-3.0}
}
```
convnextv2_huge.dbv4-full/preprocess.json ADDED
@@ -0,0 +1,101 @@
{
  "pre": [
    {
      "background_color": "white",
      "interpolation": "bilinear",
      "size": [512, 512],
      "type": "pad_to_size"
    }
  ],
  "test": [
    {
      "background_color": "white",
      "interpolation": "bilinear",
      "size": [512, 512],
      "type": "pad_to_size"
    },
    {
      "antialias": true,
      "interpolation": "bicubic",
      "max_size": null,
      "size": [512, 512],
      "type": "resize"
    },
    {
      "size": [512, 512],
      "type": "center_crop"
    },
    {
      "type": "maybe_to_tensor"
    },
    {
      "mean": [0.48500001430511475, 0.4560000002384186, 0.4059999883174896],
      "std": [0.2290000021457672, 0.2240000069141388, 0.22499999403953552],
      "type": "normalize"
    }
  ],
  "val": [
    {
      "background_color": "white",
      "interpolation": "bilinear",
      "size": [512, 512],
      "type": "pad_to_size"
    },
    {
      "antialias": true,
      "interpolation": "bicubic",
      "max_size": null,
      "size": [512, 512],
      "type": "resize"
    },
    {
      "size": [512, 512],
      "type": "center_crop"
    },
    {
      "type": "maybe_to_tensor"
    },
    {
      "mean": [0.48500001430511475, 0.4560000002384186, 0.4059999883174896],
      "std": [0.2290000021457672, 0.2240000069141388, 0.22499999403953552],
      "type": "normalize"
    }
  ]
}
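The `pad_to_size` stage above letterboxes the input onto a white 512×512 canvas before the resize step. A minimal geometry sketch of one plausible reading of that step (the authoritative implementation is `create_torchvision_transforms` in dghs-imgutils):

```python
def fit_and_pad(width, height, size=512):
    """Scale so the longer edge equals `size`, then center the result on a
    size x size canvas; returns (left, top, new_width, new_height)."""
    scale = size / max(width, height)
    new_w, new_h = round(width * scale), round(height * scale)
    return (size - new_w) // 2, (size - new_h) // 2, new_w, new_h

print(fit_and_pad(640, 480))  # (0, 64, 512, 384): letterboxed vertically
```

A square input maps to the full canvas with zero padding, which is why the subsequent 512 resize and center-crop are no-ops for square images.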
convnextv2_huge.dbv4-full/sample.webp ADDED
convnextv2_huge.dbv4-full/selected_tags.csv ADDED
The diff for this file is too large to render. See raw diff
 
convnextv2_huge.dbv4-full/thresholds.csv ADDED
@@ -0,0 +1,4 @@
category,name,alpha,threshold,f1,precision,recall
0,general,1.0,0.38,0.685090329194269,0.6732974998130495,0.6973036267335329
4,character,1.0,0.51,0.9457540360090104,0.9618129946021976,0.930222529196658
9,rating,1.0,0.24000000000000002,0.828248843557246,0.7895431723985823,0.8709450692041523
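The category-level suggested thresholds can be read straight out of this CSV; the rows below are copied verbatim from the file above:

```python
import csv
import io

# thresholds.csv contents, as shown above.
csv_text = """\
category,name,alpha,threshold,f1,precision,recall
0,general,1.0,0.38,0.685090329194269,0.6732974998130495,0.6973036267335329
4,character,1.0,0.51,0.9457540360090104,0.9618129946021976,0.930222529196658
9,rating,1.0,0.24000000000000002,0.828248843557246,0.7895431723985823,0.8709450692041523
"""
thresholds = {row['name']: float(row['threshold'])
              for row in csv.DictReader(io.StringIO(csv_text))}
print(thresholds['general'], thresholds['character'])  # 0.38 0.51
```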
resnet101.dbv4-full/.gitattributes ADDED
@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
resnet101.dbv4-full/README.md ADDED
@@ -0,0 +1,187 @@
---
tags:
- image-classification
- timm
- transformers
- animetimm
- dghs-imgutils
library_name: timm
license: gpl-3.0
datasets:
- animetimm/danbooru-wdtagger-v4-w640-ws-full
base_model:
- timm/resnet101.tv_in1k
---

# Anime Tagger resnet101.dbv4-full

## Model Details

- **Model Type:** Multilabel image classification / feature backbone
- **Model Stats:**
  - Params: 68.1M
  - FLOPs / MACs: 46.0G / 22.9G
  - Image size: train = 384 x 384, test = 384 x 384
- **Dataset:** [animetimm/danbooru-wdtagger-v4-w640-ws-full](https://huggingface.co/datasets/animetimm/danbooru-wdtagger-v4-w640-ws-full)
  - Tags Count: 12476
    - General (#0) Tags Count: 9225
    - Character (#4) Tags Count: 3247
    - Rating (#9) Tags Count: 4

## Results

| # | Macro@0.40 (F1/MCC/P/R) | Micro@0.40 (F1/MCC/P/R) | Macro@Best (F1/P/R) |
|:----------:|:-----------------------------:|:-----------------------------:|:---------------------:|
| Validation | 0.436 / 0.448 / 0.535 / 0.395 | 0.622 / 0.622 / 0.672 / 0.578 | --- |
| Test | 0.437 / 0.448 / 0.535 / 0.396 | 0.622 / 0.623 / 0.672 / 0.579 | 0.481 / 0.509 / 0.482 |

* `Macro/Micro@0.40` denotes metrics computed at a fixed threshold of 0.40.
* `Macro@Best` denotes the mean of metrics computed with per-tag thresholds, each chosen to maximize that tag's F1 score.

## Thresholds

| Category | Name | Alpha | Threshold | Micro@Thr (F1/P/R) | Macro@0.40 (F1/P/R) | Macro@Best (F1/P/R) |
|:----------:|:---------:|:-------:|:-----------:|:---------------------:|:---------------------:|:---------------------:|
| 0 | general | 1 | 0.33 | 0.612 / 0.619 / 0.605 | 0.305 / 0.421 / 0.262 | 0.357 / 0.374 / 0.374 |
| 4 | character | 1 | 0.49 | 0.845 / 0.906 / 0.791 | 0.812 / 0.858 / 0.777 | 0.833 / 0.893 / 0.789 |
| 9 | rating | 1 | 0.4 | 0.800 / 0.755 / 0.851 | 0.805 / 0.778 / 0.837 | 0.806 / 0.771 / 0.848 |

* `Micro@Thr` denotes metrics computed at the category-level suggested thresholds listed above.
* `Macro@0.40` denotes metrics computed at a fixed threshold of 0.40.
* `Macro@Best` denotes metrics computed with per-tag thresholds, each chosen to maximize that tag's F1 score.

The tag-level thresholds are available in [selected_tags.csv](https://huggingface.co/animetimm/resnet101.dbv4-full/resolve/main/selected_tags.csv).

+ ## How to Use
56
+
57
+ We provided a sample image for our code samples, you can find it [here](https://huggingface.co/animetimm/resnet101.dbv4-full/blob/main/sample.webp).
58
+
59
+ ### Use TIMM And Torch
60
+
61
+ Install [dghs-imgutils](https://github.com/deepghs/imgutils), [timm](https://github.com/huggingface/pytorch-image-models) and other necessary requirements with the following command
62
+
63
+ ```shell
64
+ pip install 'dghs-imgutils>=0.17.0' torch huggingface_hub timm pillow pandas
65
+ ```
66
+
67
+ After that you can load this model with timm library, and use it for train, validation and test, with the following code
68
+
69
+ ```python
70
+ import json
71
+
72
+ import pandas as pd
73
+ import torch
74
+ from huggingface_hub import hf_hub_download
75
+ from imgutils.data import load_image
76
+ from imgutils.preprocess import create_torchvision_transforms
77
+ from timm import create_model
78
+
79
+ repo_id = 'animetimm/resnet101.dbv4-full'
80
+ model = create_model(f'hf-hub:{repo_id}', pretrained=True)
81
+ model.eval()
82
+
83
+ with open(hf_hub_download(repo_id=repo_id, repo_type='model', filename='preprocess.json'), 'r') as f:
84
+ preprocessor = create_torchvision_transforms(json.load(f)['test'])
85
+ # Compose(
86
+ # PadToSize(size=(512, 512), interpolation=bilinear, background_color=white)
87
+ # Resize(size=384, interpolation=bilinear, max_size=None, antialias=True)
88
+ # CenterCrop(size=[384, 384])
89
+ # MaybeToTensor()
90
+ # Normalize(mean=tensor([0.4850, 0.4560, 0.4060]), std=tensor([0.2290, 0.2240, 0.2250]))
91
+ # )
92
+
93
+ image = load_image('https://huggingface.co/animetimm/resnet101.dbv4-full/resolve/main/sample.webp')
94
+ input_ = preprocessor(image).unsqueeze(0)
95
+ # input_, shape: torch.Size([1, 3, 384, 384]), dtype: torch.float32
96
+ with torch.no_grad():
97
+ output = model(input_)
98
+ prediction = torch.sigmoid(output)[0]
99
+ # output, shape: torch.Size([1, 12476]), dtype: torch.float32
100
+ # prediction, shape: torch.Size([12476]), dtype: torch.float32
101
+
102
+ df_tags = pd.read_csv(
103
+ hf_hub_download(repo_id=repo_id, repo_type='model', filename='selected_tags.csv'),
104
+ keep_default_na=False
105
+ )
106
+ tags = df_tags['name']
107
+ mask = prediction.numpy() >= df_tags['best_threshold']
108
+ print(dict(zip(tags[mask].tolist(), prediction[mask].tolist())))
109
+ # {'general': 0.5100178718566895,
110
+ # 'sensitive': 0.5034157037734985,
111
+ # '1girl': 0.9962267875671387,
112
+ # 'solo': 0.9669082760810852,
113
+ # 'looking_at_viewer': 0.8127952814102173,
114
+ # 'blush': 0.7912614941596985,
115
+ # 'smile': 0.9032713770866394,
116
+ # 'short_hair': 0.7837649583816528,
117
+ # 'shirt': 0.5146411657333374,
118
+ # 'long_sleeves': 0.7224600315093994,
119
+ # 'brown_hair': 0.5260339379310608,
120
+ # 'holding': 0.5752436518669128,
121
+ # 'dress': 0.5642756223678589,
122
+ # 'closed_mouth': 0.4826013743877411,
123
+ # 'purple_eyes': 0.7590888142585754,
124
+ # 'flower': 0.9180877208709717,
125
+ # 'braid': 0.9453270435333252,
126
+ # 'red_hair': 0.8512048721313477,
127
+ # 'blunt_bangs': 0.5289319753646851,
128
+ # 'bob_cut': 0.22592417895793915,
129
+ # 'plant': 0.5463797450065613,
130
+ # 'blue_flower': 0.6992892026901245,
131
+ # 'crown_braid': 0.7925195097923279,
132
+ # 'potted_plant': 0.5136846899986267,
133
+ # 'flower_pot': 0.4357028007507324,
134
+ # 'wiping_tears': 0.3059103488922119}
135
+ ```
136
+ ### Use ONNX Model For Inference
137
+
138
+ Install [dghs-imgutils](https://github.com/deepghs/imgutils) with the following command
139
+
140
+ ```shell
141
+ pip install 'dghs-imgutils>=0.17.0'
142
+ ```
143
+
144
+ Use `multilabel_timm_predict` function with the following code
145
+
146
+ ```python
147
+ from imgutils.generic import multilabel_timm_predict
148
+
149
+ general, character, rating = multilabel_timm_predict(
150
+ 'https://huggingface.co/animetimm/resnet101.dbv4-full/resolve/main/sample.webp',
151
+ repo_id='animetimm/resnet101.dbv4-full',
152
+ fmt=('general', 'character', 'rating'),
153
+ )
154
+
155
+ print(general)
156
+ # {'1girl': 0.9962266683578491,
157
+ # 'solo': 0.96690833568573,
158
+ # 'braid': 0.9453268647193909,
159
+ # 'flower': 0.9180880784988403,
160
+ # 'smile': 0.9032710790634155,
161
+ # 'red_hair': 0.8512046337127686,
162
+ # 'looking_at_viewer': 0.8127949833869934,
163
+ # 'crown_braid': 0.792519211769104,
164
+ # 'blush': 0.7912609577178955,
165
+ # 'short_hair': 0.7837648391723633,
166
+ # 'purple_eyes': 0.7590886354446411,
167
+ # 'long_sleeves': 0.7224597930908203,
168
+ # 'blue_flower': 0.6992897391319275,
169
+ # 'holding': 0.5752434134483337,
170
+ # 'dress': 0.5642745494842529,
171
+ # 'plant': 0.5463811755180359,
172
+ # 'blunt_bangs': 0.5289315581321716,
173
+ # 'brown_hair': 0.5260326862335205,
174
+ # 'shirt': 0.5146413445472717,
175
+ # 'potted_plant': 0.5136858820915222,
176
+ # 'closed_mouth': 0.48260119557380676,
177
+ # 'flower_pot': 0.4357031583786011,
178
+ # 'wiping_tears': 0.30590835213661194,
179
+ # 'bob_cut': 0.22592449188232422}
180
+ print(character)
181
+ # {}
182
+ print(rating)
183
+ # {'general': 0.5100165009498596, 'sensitive': 0.5034170150756836}
184
+ ```
185
+
186
+ For further information, see [documentation of function multilabel_timm_predict](https://dghs-imgutils.deepghs.org/main/api_doc/generic/multilabel_timm.html#multilabel-timm-predict).
187
+
resnet101.dbv4-full/categories.json ADDED
@@ -0,0 +1,14 @@
[
  {"category": 0, "name": "general"},
  {"category": 4, "name": "character"},
  {"category": 9, "name": "rating"}
]
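These category ids are what `multilabel_timm_predict(..., fmt=('general', 'character', 'rating'))` groups the flat tag scores by. A sketch of that grouping, using the mapping above with hypothetical tags and scores:

```python
import json

# categories.json, as shown above.
categories = json.loads('[{"category": 0, "name": "general"}, '
                        '{"category": 4, "name": "character"}, '
                        '{"category": 9, "name": "rating"}]')
id_to_name = {c['category']: c['name'] for c in categories}

# Hypothetical (category_id, score) predictions for a few tags.
predictions = {'1girl': (0, 0.99), 'smile': (0, 0.90), 'sensitive': (9, 0.95)}
grouped = {name: {} for name in id_to_name.values()}
for tag, (cat, score) in predictions.items():
    grouped[id_to_name[cat]][tag] = score
print(grouped['rating'])  # {'sensitive': 0.95}
```

Note that a category with no surviving tags yields an empty dict, matching the `print(character)` output in the README above.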
resnet101.dbv4-full/config.json ADDED
The diff for this file is too large to render. See raw diff
 
resnet101.dbv4-full/meta.json ADDED
The diff for this file is too large to render. See raw diff
 
resnet101.dbv4-full/metrics.json ADDED
@@ -0,0 +1,25 @@
{
  "test": {
    "macro_f1": 0.4368686378002167,
    "macro_mcc": 0.4481600224971771,
    "macro_precision": 0.5345199108123779,
    "macro_recall": 0.3957759737968445,
    "micro_f1": 0.621998131275177,
    "micro_mcc": 0.6228444576263428,
    "micro_precision": 0.6722905039787292,
    "micro_recall": 0.5787064433097839
  },
  "val": {
    "learning_rate": 4.7306720809906175e-06,
    "loss": 0.40296348299079054,
    "macro_f1": 0.4364077150821686,
    "macro_mcc": 0.447799950838089,
    "macro_precision": 0.5347463488578796,
    "macro_recall": 0.3953515291213989,
    "micro_f1": 0.6215192675590515,
    "micro_mcc": 0.6223735809326172,
    "micro_precision": 0.6719217300415039,
    "micro_recall": 0.578150749206543,
    "step": 93
  }
}
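One internal consistency check on these numbers: micro precision and recall are computed from global TP/FP/FN counts, so the micro F1 must equal their harmonic mean. Values copied from the `test` block above:

```python
# Metrics from the "test" split of metrics.json.
micro_precision = 0.6722905039787292
micro_recall = 0.5787064433097839
micro_f1_reported = 0.621998131275177

# F1 is the harmonic mean of precision and recall.
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)
print(abs(micro_f1 - micro_f1_reported) < 1e-5)  # True
```

The same identity does not hold for the macro numbers, since macro F1 averages per-tag F1 scores rather than combining the averaged precision and recall.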
resnet101.dbv4-full/preprocess.json ADDED
@@ -0,0 +1,95 @@
{
  "pre": [
    {
      "background_color": "white",
      "interpolation": "bilinear",
      "size": [512, 512],
      "type": "pad_to_size"
    }
  ],
  "test": [
    {
      "background_color": "white",
      "interpolation": "bilinear",
      "size": [512, 512],
      "type": "pad_to_size"
    },
    {
      "antialias": true,
      "interpolation": "bilinear",
      "max_size": null,
      "size": 384,
      "type": "resize"
    },
    {
      "size": [384, 384],
      "type": "center_crop"
    },
    {
      "type": "maybe_to_tensor"
    },
    {
      "mean": [0.48500001430511475, 0.4560000002384186, 0.4059999883174896],
      "std": [0.2290000021457672, 0.2240000069141388, 0.22499999403953552],
      "type": "normalize"
    }
  ],
  "val": [
    {
      "background_color": "white",
      "interpolation": "bilinear",
      "size": [512, 512],
      "type": "pad_to_size"
    },
    {
      "antialias": true,
      "interpolation": "bilinear",
      "max_size": null,
      "size": 384,
      "type": "resize"
    },
    {
      "size": [384, 384],
      "type": "center_crop"
    },
    {
      "type": "maybe_to_tensor"
    },
    {
      "mean": [0.48500001430511475, 0.4560000002384186, 0.4059999883174896],
      "std": [0.2290000021457672, 0.2240000069141388, 0.22499999403953552],
      "type": "normalize"
    }
  ]
}
resnet101.dbv4-full/sample.webp ADDED
resnet101.dbv4-full/selected_tags.csv ADDED
The diff for this file is too large to render. See raw diff
 
resnet101.dbv4-full/thresholds.csv ADDED
@@ -0,0 +1,4 @@
category,name,alpha,threshold,f1,precision,recall
0,general,1.0,0.33,0.6117394521110506,0.6189713640648628,0.604674580327153
4,character,1.0,0.49,0.844884846805129,0.9064400543098392,0.7911582800403905
9,rating,1.0,0.4,0.8004352660836954,0.7552656981659241,0.8513513513513513
resnet152.dbv4-full/.gitattributes ADDED
@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
resnet152.dbv4-full/categories.json ADDED
@@ -0,0 +1,14 @@
[
  {"category": 0, "name": "general"},
  {"category": 4, "name": "character"},
  {"category": 9, "name": "rating"}
]
resnet152.dbv4-full/config.json ADDED
The diff for this file is too large to render. See raw diff
 
swinv2_base_window8_256.dbv4a-full/pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1dcb080ae4db05b3e3cce367cb81530cc5f3dbe8c1b8308bd2dbb2bc471c844e
size 350402567
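Checkpoints stored via Git LFS show up in the diff as three-line pointer files rather than binary content. Parsing one (the pointer above) takes only a few lines:

```python
# A Git LFS pointer file: space-separated key/value pairs, one per line.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:1dcb080ae4db05b3e3cce367cb81530cc5f3dbe8c1b8308bd2dbb2bc471c844e
size 350402567
"""
fields = dict(line.split(' ', 1) for line in pointer.strip().splitlines())
print(fields['size'])  # 350402567 bytes, i.e. roughly a 350 MB checkpoint
```

The `oid` is the SHA-256 of the actual file content, which Git LFS uses to fetch the blob from the LFS store on checkout.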
vit_base_patch16_224.dbv4-full/pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:69b0244f83fd18c74213cf2761aca60e7906a917843bc4997accf10a6489f0f9
size 383428479