karma689 committed on
Commit
7d64bbf
·
verified ·
1 Parent(s): b1a27df

Upload folder using huggingface_hub

README.md CHANGED
@@ -1,99 +1,73 @@
  ---
- language:
- - bo
  license: apache-2.0
- library_name: transformers
  tags:
- - image-classification
- - dinov3
- - tibetan
- - manuscript
- - binary-classification
- - vision
- datasets:
- - openpecha/uchen-ume-classification-benchmark
- metrics:
- - accuracy
- - f1
- - auc_roc
- base_model: facebook/dinov3-vits16-pretrain-lvd1689m
  ---

- # Tibetan Script Router (DINOv3-ViT-S)

- This model is a fine-tuned version of **Meta's DINOv3-ViT-S/16** specifically designed for high-precision binary classification of Tibetan scripts. It acts as the primary "Router" in a hierarchical classification pipeline, distinguishing between formal block scripts (**Uchen**) and cursive families (**Ume**).

- ## Model Details

- - **Project Name:** The BDRC Etext Corpus
- - **Developed by:** Dharmaduta
- - **Specifications provided by:** [Buddhist Digital Resource Center (BDRC)](https://www.bdrc.io)
- - **Funded by:** Khyentse Foundation
- - **Model type:** Vision Transformer (ViT)
- - **License:** Apache 2.0
- - **Fine-tuned from:** `facebook/dinov3-vits16-pretrain-lvd1689m`

- ## Dataset & Class Distribution

- The model was trained using the [openpecha/uchen-ume-classification](https://huggingface.co/datasets/openpecha/uchen-ume-classification) dataset. This training set consists of **4,572 images** balanced across two major categories.

- The binary classes were mapped from the following granular script types:

- ### 1. Uchen (Class 0) — 2,286 Total Samples
- | Granular Script Type | Sample Count |
- | :--- | :--- |
- | `uchen_sugdring` | 1,670 |
- | `uchen_sugthung` | 616 |

- ### 2. Ume (Class 1) 2,286 Total Samples
- | Granular Script Type | Sample Count |
- | :--- | :--- |
- | `petsuk` | 1,388 |
- | `tsegdrig` | 749 |
- | `peri` | 614 |
- | `druthung` | 207 |
- | `tsumachug` | 178 |
- | `yigchung` | 166 |
- | `drudring` | 132 |
- | `drathung` | 129 |
- | `druring` | 119 |
- | `khyuyig` | 113 |
- | `dhumri` | 98 |
- | `tsugchung` | 77 |
- | `trinyig` | 42 |

- *Note: Classes labeled "Difficult," "Multi-script," and "Non-Tibetan" were excluded to maintain a clean training signal for the Uchen/Ume boundary.*

- ## Performance Summary
- The model achieved its peak performance at **Stage B** (Partial backbone unfreezing of the last 2 blocks).

- - **Test Accuracy:** 98.95%
- - **Macro F1-Score:** 0.984
- - **AUC-ROC:** 0.9988

- ### Confusion Matrix
- | Predicted \ Actual | Uchen | Ume |
- |--------------------|-------|-----|
- | **Uchen** | 159 | 2 |
- | **Ume** | 6 | 595 |
-
- ## How to Get Started

  ```python
- from transformers import AutoImageProcessor, AutoModelForImageClassification
  import torch
  from PIL import Image

- # Note: Gated access approval for DINOv3 is required
- model_id = "openpecha/uchen-ume-classifier"
- processor = AutoImageProcessor.from_pretrained(model_id)
- model = AutoModelForImageClassification.from_pretrained(model_id)

- image = Image.open("manuscript_page.jpg").convert("RGB")
- inputs = processor(images=image, return_tensors="pt")

- with torch.no_grad():
-     outputs = model(**inputs)
-     prediction = outputs.logits.argmax(-1).item()

- print(f"Detected Script: {model.config.id2label[prediction]}")
  ---
  license: apache-2.0
  tags:
+ - image-classification
+ - tibetan
+ - uchen
+ - ume
+ library_name: transformers
+ pipeline_tag: image-classification
  ---

+ # Uchen vs Umê classifier (DINOv3 ViT-S)

+ Binary Tibetan script classifier: **uchen** (printed) vs **ume** (cursive).

+ ## Recommended weights

+ Use **`without_preprocess/final_model.pt`** for **full manuscript pages** (no center crop at inference).

+ | Variant | Preprocess at train | Test F1 | Benchmark F1 |
+ |---------|---------------------|---------|----------------|
+ | `with_preprocess/` | center_crop train/val, none on test | 0.506 | n/a |
+ | `without_preprocess/` | no runtime preprocess | 0.708 | 0.848 |

+ ## Benchmark evaluation (held-out 60 pages)

+ The benchmark set is **disjoint** from train/val/test. After downloading this repo:

+ ```bash
+ pip install torch transformers pillow huggingface_hub scikit-learn

+ # From the dataset repo (has benchmark/ images + inference_uchen_ume.py):
+ python inference_uchen_ume.py \
+   --benchmark-dir benchmark \
+   --model-repo openpecha/uchen-ume-classifier \
+   --weights without_preprocess/final_model.pt \
+   --preprocess none
+ ```

+ Or from this training codebase:

+ ```bash
+ python experiments/uchen_ume_binary/eval_benchmark.py \
+   --checkpoint hf_upload/model/without_preprocess/final_model.pt \
+   --benchmark-dir benchmark/benchmark
+ ```

+ Reference benchmark run (without_preprocess): **acc 85.0%**, **macro F1 0.848** (30 uchen + 30 ume, full pages, no crop).

+ ## Load in Python

  ```python
  import torch
+ from huggingface_hub import hf_hub_download
+ from transformers import AutoImageProcessor
  from PIL import Image

+ # See dataset repo inference_uchen_ume.py for DINOv3Classifier + predict
+ path = hf_hub_download(
+     "openpecha/uchen-ume-classifier",
+     "without_preprocess/final_model.pt",
+     repo_type="model",
+ )
+ ckpt = torch.load(path, map_location="cpu", weights_only=False)
+ ```
+
+ ## Do not use `with_preprocess/` on full pages

+ That variant was trained with **center-crop** on train/val; test F1 on full pages is ~0.51. Only use it with `--preprocess center_crop_whole_page`.

+ ## Training

+ Backbone: `facebook/dinov3-vits16-pretrain-lvd1689m`. Progressive unfreeze (stages A/B/C). Dataset: [openpecha/uchen-ume-classification-benchmark](https://huggingface.co/datasets/openpecha/uchen-ume-classification-benchmark).
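The "Load in Python" snippet in the new README stops at `torch.load`, and this commit does not document how the checkpoint is laid out. A minimal sketch for inspecting whatever object comes back is shown here. It assumes the checkpoint is either a raw state dict or a dict that wraps one under a conventional key; the key names in `CANDIDATE_KEYS` are common conventions and are not confirmed for this repo. The helpers are plain Python, so they can be tried without the weights. Since `weights_only=False` unpickles arbitrary objects, it is safest to load only checkpoints from a source you trust.

```python
from collections.abc import Mapping

# Hypothetical wrapper keys: common conventions, NOT confirmed for this repo.
CANDIDATE_KEYS = ("model_state_dict", "state_dict", "model")


def find_state_dict(ckpt):
    """Return (key, mapping) for the most likely state dict inside ckpt."""
    if not isinstance(ckpt, Mapping):
        raise TypeError(f"expected a mapping, got {type(ckpt).__name__}")
    for key in CANDIDATE_KEYS:
        inner = ckpt.get(key)
        if isinstance(inner, Mapping):
            return key, inner
    # No wrapper key matched: treat the checkpoint itself as the state dict.
    return None, ckpt


def summarize(ckpt, limit=5):
    """List the top-level keys and the first few entry names."""
    key, state = find_state_dict(ckpt)
    names = list(state)
    return {
        "top_level_keys": list(ckpt),
        "state_dict_key": key,
        "n_entries": len(names),
        "first_names": names[:limit],
    }


if __name__ == "__main__":
    # Stand-in for the object returned by torch.load(...); values are dummies.
    fake = {"epoch": 3, "model_state_dict": {"head.weight": 0, "head.bias": 0}}
    print(summarize(fake))
```

Calling `summarize(ckpt)` on the real checkpoint should show which parameter names the `DINOv3Classifier` in the dataset repo's `inference_uchen_ume.py` expects.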
with_preprocess/best_checkpoint.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7646471471367b77ed76fa52ab119075136d00ccdb8d1770c349e7b7e9998196
+ size 86674972
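The `.pt` entries in this commit are Git LFS pointer files: three `key value` lines giving the spec version, the SHA-256 of the real file, and its size in bytes. A downloaded checkpoint can be checked against its pointer with the standard library alone. The sketch below parses the pointer shown above; the function names are illustrative, and the local file path you pass in is your own.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Check a local file's size and hash against a parsed pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False  # cheap check first: size mismatch means a different file
    h = hashlib.new(pointer["algo"])
    with path.open("rb") as f:
        # Read in chunks so large checkpoints are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["digest"]


# Pointer contents copied from with_preprocess/best_checkpoint.pt in this commit.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:7646471471367b77ed76fa52ab119075136d00ccdb8d1770c349e7b7e9998196
size 86674972
"""

if __name__ == "__main__":
    print(parse_lfs_pointer(POINTER))
```

Comparing the size before hashing lets an incomplete download fail fast without reading the whole file.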
with_preprocess/best_checkpoint_name.txt ADDED
@@ -0,0 +1 @@
+ best_stage_c_last_blocks.pt
with_preprocess/classification_report.txt ADDED
@@ -0,0 +1,8 @@
+               precision    recall  f1-score   support
+
+        uchen       0.21      1.00      0.34        99
+          ume       1.00      0.50      0.67       768
+
+     accuracy                           0.56       867
+    macro avg       0.60      0.75      0.51       867
+ weighted avg       0.91      0.56      0.63       867
with_preprocess/confusion_matrix.json ADDED
@@ -0,0 +1,16 @@
+ {
+   "labels": [
+     "uchen",
+     "ume"
+   ],
+   "matrix": [
+     [
+       99,
+       0
+     ],
+     [
+       381,
+       387
+     ]
+   ]
+ }
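The reported test metrics can be recomputed from this confusion matrix alone. The sketch below assumes rows are true labels and columns are predictions, which is consistent with the supports in `classification_report.txt` (row sums of 99 and 768). The function name is illustrative and is not taken from the training code.

```python
def metrics_from_confusion(matrix, labels):
    """Derive per-class and aggregate metrics from a confusion matrix.

    Assumes matrix[i][j] counts samples of true class i predicted as class j.
    """
    n = len(labels)
    total = sum(sum(row) for row in matrix)
    per_class = {}
    for i, name in enumerate(labels):
        tp = matrix[i][i]
        support = sum(matrix[i])                         # row sum: true count
        predicted = sum(matrix[r][i] for r in range(n))  # column sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class[name] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": support,
        }
    return {
        "per_class": per_class,
        "accuracy": sum(matrix[i][i] for i in range(n)) / total,
        # Macro F1 weights every class equally, regardless of support.
        "macro_f1": sum(c["f1"] for c in per_class.values()) / n,
        # Weighted F1 weights each class by its number of true samples.
        "weighted_f1": sum(c["f1"] * c["support"] for c in per_class.values()) / total,
    }


if __name__ == "__main__":
    # Values copied from with_preprocess/confusion_matrix.json.
    result = metrics_from_confusion([[99, 0], [381, 387]], ["uchen", "ume"])
    print(round(result["accuracy"], 4), round(result["macro_f1"], 4))
```

With this matrix, 480 pages are predicted as uchen but only 99 of them are uchen. That low precision on the minority class is what pulls macro F1 down to about 0.51, while weighted F1 stays near 0.63.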
with_preprocess/confusion_matrix.png ADDED
with_preprocess/final_model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e11aee6ea8fd2bbe0090c384580d19fbcd74d66b667ac68e9ad1481c30c9fd70
+ size 86672201
with_preprocess/model_card.json ADDED
@@ -0,0 +1,15 @@
+ {
+   "variant": "with_preprocess",
+   "experiment": "uchen_ume_whole_page",
+   "best_checkpoint": "best_stage_c_last_blocks.pt",
+   "val_macro_f1": 0.9938033069400111,
+   "val_accuracy": 0.9970414201183432,
+   "epoch": 9,
+   "test_metrics": {
+     "loss": 1.5028612467717066,
+     "accuracy": 0.5605536332179931,
+     "macro_f1": 0.5060493910234842,
+     "weighted_f1": 0.6326582036211453,
+     "auc_roc": 0.9685921717171717
+   }
+ }
with_preprocess/results.json ADDED
@@ -0,0 +1,17 @@
+ {
+   "experiment": "uchen_ume_whole_page",
+   "stage_run": "test",
+   "test_metrics": {
+     "loss": 1.5028612467717066,
+     "accuracy": 0.5605536332179931,
+     "macro_f1": 0.5060493910234842,
+     "weighted_f1": 0.6326582036211453,
+     "auc_roc": 0.9685921717171717
+   },
+   "history": {},
+   "report": " precision recall f1-score support\n\n uchen 0.21 1.00 0.34 99\n ume 1.00 0.50 0.67 768\n\n accuracy 0.56 867\n macro avg 0.60 0.75 0.51 867\nweighted avg 0.91 0.56 0.63 867\n",
+   "splits_file": "/root/script-classification-model-train/experiments/uchen_ume_binary/checkpoints/uchen_ume_whole_page/splits.json",
+   "skip_stage_c": false,
+   "stage_c_skip_reason": null,
+   "best_checkpoint": "best_stage_c_last_blocks.pt"
+ }
without_preprocess/benchmark_classification_report.txt ADDED
@@ -0,0 +1,8 @@
+               precision    recall  f1-score   support
+
+        uchen       0.78      0.97      0.87        30
+          ume       0.96      0.73      0.83        30
+
+     accuracy                           0.85        60
+    macro avg       0.87      0.85      0.85        60
+ weighted avg       0.87      0.85      0.85        60
without_preprocess/benchmark_confusion_matrix.png ADDED
without_preprocess/benchmark_eval_results.json ADDED
@@ -0,0 +1,386 @@
+ {
+   "checkpoint": "/root/script-classification-model-train/hf_upload/model/without_preprocess/final_model.pt",
+   "benchmark_dir": "/root/script-classification-model-train/benchmark/benchmark",
+   "n_images": 60,
+   "preprocess": "none",
+   "metrics": {
+     "loss": 0.3956117908159892,
+     "accuracy": 0.85,
+     "macro_f1": 0.847930160518164,
+     "weighted_f1": 0.847930160518164,
+     "auc_roc": 0.97
+   },
+   "report": " precision recall f1-score support\n\n uchen 0.78 0.97 0.87 30\n ume 0.96 0.73 0.83 30\n\n accuracy 0.85 60\n macro avg 0.87 0.85 0.85 60\nweighted avg 0.87 0.85 0.85 60\n",
+   "confusion_matrix": [
+     [
+       29,
+       1
+     ],
+     [
+       8,
+       22
+     ]
+   ],
+   "predictions": [
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W00KG0555-I1KG1104880320.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 4.636118600132022e-09
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W1GS66367-I1GS663690005.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 3.902857861248776e-05
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W1PD153537-I1KG131310005.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 0.0006930792005732656
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W1PD89084-I1KG134060005.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 6.966766704863403e-06
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W23768-26400446.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 0.001230051158927381
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W2PD16917-I3PD5910005.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 0.0011563787702471018
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W2PD17514-I4PD22210005.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 4.242213981342502e-05
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W2PD17517-I4PD15200005.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 8.787653496256098e-05
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W2PD19474-I2PD198170005.jpg",
+       "label": "uchen",
+       "pred": "ume",
+       "prob_ume": 0.652681291103363
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W2PD20866-I4PD50820480.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 0.01092884037643671
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W3CN21390-I3CN222550962.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 0.0006489027291536331
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W3CN21414-I2KG2203010005.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 4.1007406252902e-06
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W3CN21482-I4CN121200005.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 3.8675672840327024e-05
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W3CN27530-I4CN129320548.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 0.007052543107420206
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W3CN4180-I3CN41900005.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 0.1726490557193756
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W3CN766-I3CN7680053.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 0.0010180854005739093
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W3CN8329-I3CN83400005.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 0.0019903830252587795
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W3MS261-I3MS3570044.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 0.0010262312134727836
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W3MS701-I3MS7080356.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 6.8632389229605906e-06
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W3PD885-I3PD9130186.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 1.0850219041458331e-05
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W3PD988-I3PD13200005.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 0.005023940000683069
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W4CZ58520-I4CZ751270005.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 0.0084664486348629
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W4CZ74080-I4CZ741090231.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 8.514942351212085e-08
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W4PD1207-I4PD12980005.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 1.6211478826022585e-09
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W4PD2050-I4PD20570005.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 0.0007123054238036275
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W4PD294-I4PD4110418.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 9.038657822202367e-07
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W4PD3075-I4PD31350005.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 7.926726539153606e-05
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W4PD3076-I4PD30840005.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 0.00010477022442501038
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W8LS19724-I8LS197260350.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 0.001046415651217103
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/uchen/W8LS32739-I8LS339710642.jpg",
+       "label": "uchen",
+       "pred": "uchen",
+       "prob_ume": 2.5808612917899154e-05
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W1CZ1276-I1CZ17730007.png",
+       "label": "ume",
+       "pred": "ume",
+       "prob_ume": 0.9111959338188171
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W1CZ2157-I1CZ22190024.jpg",
+       "label": "ume",
+       "pred": "ume",
+       "prob_ume": 0.9920614957809448
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W1KG22576-I1KG225850005.jpg",
+       "label": "ume",
+       "pred": "uchen",
+       "prob_ume": 0.2963232696056366
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W1KG4616-I1KG48070506.jpg",
+       "label": "ume",
+       "pred": "ume",
+       "prob_ume": 0.9999186992645264
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W21872-62960005.jpg",
+       "label": "ume",
+       "pred": "ume",
+       "prob_ume": 0.5267542600631714
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W23751-I01JW1650259.png",
+       "label": "ume",
+       "pred": "ume",
+       "prob_ume": 0.9908739328384399
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W24012-36670120.jpg",
+       "label": "ume",
+       "pred": "ume",
+       "prob_ume": 0.9999780654907227
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W2CZ7987-I1KG38730018.jpg",
+       "label": "ume",
+       "pred": "uchen",
+       "prob_ume": 0.334159791469574
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W2PD17458-I4PD7310005.jpg",
+       "label": "ume",
+       "pred": "ume",
+       "prob_ume": 0.9986664056777954
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W2PD17471-I4PD7150594.jpg",
+       "label": "ume",
+       "pred": "ume",
+       "prob_ume": 0.9989309906959534
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W2PD17514-I4PD23060774.jpg",
+       "label": "ume",
+       "pred": "uchen",
+       "prob_ume": 0.4968777000904083
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W3CN11633-I3CN116580005.jpg",
+       "label": "ume",
+       "pred": "uchen",
+       "prob_ume": 0.45208024978637695
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W3CN21413-I2KG2202860536.jpg",
+       "label": "ume",
+       "pred": "ume",
+       "prob_ume": 0.9999990463256836
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W3CN21798-I4CN123460005.jpg",
+       "label": "ume",
+       "pred": "uchen",
+       "prob_ume": 0.0006142983329482377
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W3CN4061-I3CN64760512.jpg",
+       "label": "ume",
+       "pred": "ume",
+       "prob_ume": 0.8909825682640076
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W3CN644-I3CN6460652.jpg",
+       "label": "ume",
+       "pred": "uchen",
+       "prob_ume": 0.032793644815683365
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W3CN786-I3CN7910005.jpg",
+       "label": "ume",
+       "pred": "ume",
+       "prob_ume": 0.9999990463256836
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W3CN8231-I3CN82600005.jpg",
+       "label": "ume",
+       "pred": "ume",
+       "prob_ume": 0.9999237060546875
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W3CN8329-I3CN83570846.jpg",
+       "label": "ume",
+       "pred": "ume",
+       "prob_ume": 0.7273164987564087
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W3JT13691-I3JT137050066.jpg",
+       "label": "ume",
+       "pred": "uchen",
+       "prob_ume": 0.19064322113990784
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W3MS155-I1KG227891302.jpg",
+       "label": "ume",
+       "pred": "ume",
+       "prob_ume": 0.9999719858169556
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W3PD988-I3PD13550824.jpg",
+       "label": "ume",
+       "pred": "ume",
+       "prob_ume": 0.9970397353172302
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W3PD989-I3PD10890005.jpg",
+       "label": "ume",
+       "pred": "ume",
+       "prob_ume": 0.705629289150238
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W4CZ58520-I4CZ750920618.jpg",
+       "label": "ume",
+       "pred": "uchen",
+       "prob_ume": 0.028447365388274193
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W4PD1703-I4PD17140005.jpg",
+       "label": "ume",
+       "pred": "ume",
+       "prob_ume": 0.5697239637374878
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W8LS16434-I8LS164450005.jpg",
+       "label": "ume",
+       "pred": "ume",
+       "prob_ume": 0.9901788234710693
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W8LS16555-I8LS165890026.jpg",
+       "label": "ume",
+       "pred": "ume",
+       "prob_ume": 0.9999986886978149
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W8LS17770-I8LS177950005.jpg",
+       "label": "ume",
+       "pred": "ume",
+       "prob_ume": 0.9999395608901978
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W8LS19804-I8LS198060061.jpg",
+       "label": "ume",
+       "pred": "ume",
+       "prob_ume": 0.9403765201568604
+     },
+     {
+       "path": "/root/script-classification-model-train/benchmark/benchmark/ume/W8LS20177-I8LS201790005.jpg",
+       "label": "ume",
+       "pred": "ume",
+       "prob_ume": 0.644355058670044
+     }
+   ]
+ }
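Each prediction record carries `prob_ume`, and the stored `pred` values are consistent with a 0.5 cutoff on that probability. Several of the missed ume pages sit just below it (for example 0.497 and 0.452). The sketch below rebuilds a confusion matrix at any cutoff from such records. The `>=` comparison and the function names are assumptions, since the evaluation script is not part of this commit, and the five sample records are a hand-picked subset of the file, so the printed matrices are illustrative only.

```python
import json


def load_predictions(path):
    """Read the "predictions" list from a benchmark_eval_results.json file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)["predictions"]


def confusion_at(predictions, threshold=0.5, labels=("uchen", "ume")):
    """Confusion matrix (rows = true, columns = predicted) at a cutoff.

    Assumes a page is called "ume" when prob_ume >= threshold.
    """
    index = {name: i for i, name in enumerate(labels)}
    matrix = [[0, 0], [0, 0]]
    for rec in predictions:
        pred = labels[1] if rec["prob_ume"] >= threshold else labels[0]
        matrix[index[rec["label"]]][index[pred]] += 1
    return matrix


# Five records copied from the file above (paths omitted for brevity).
SAMPLE = [
    {"label": "uchen", "prob_ume": 0.652681291103363},
    {"label": "uchen", "prob_ume": 0.1726490557193756},
    {"label": "ume", "prob_ume": 0.4968777000904083},
    {"label": "ume", "prob_ume": 0.45208024978637695},
    {"label": "ume", "prob_ume": 0.9111959338188171},
]

if __name__ == "__main__":
    for t in (0.5, 0.4):
        print(t, confusion_at(SAMPLE, threshold=t))
```

A cutoff chosen on these 60 pages would no longer be a held-out estimate, so any threshold tuning is better done on the validation split.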
without_preprocess/best_checkpoint.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8809ead826cfe08fbdfa3ca0659c858f2d3e98c21e62d63811905a6dd0c44abc
+ size 86674972
without_preprocess/best_checkpoint_name.txt ADDED
@@ -0,0 +1 @@
+ best_stage_c_last_blocks.pt
without_preprocess/classification_report.txt ADDED
@@ -0,0 +1,8 @@
+               precision    recall  f1-score   support
+
+        uchen       0.37      0.98      0.54        99
+          ume       1.00      0.79      0.88       768
+
+     accuracy                           0.81       867
+    macro avg       0.68      0.88      0.71       867
+ weighted avg       0.93      0.81      0.84       867
without_preprocess/confusion_matrix.json ADDED
@@ -0,0 +1,16 @@
+ {
+   "labels": [
+     "uchen",
+     "ume"
+   ],
+   "matrix": [
+     [
+       97,
+       2
+     ],
+     [
+       165,
+       603
+     ]
+   ]
+ }
without_preprocess/confusion_matrix.png ADDED
without_preprocess/final_model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:906a64dc6248dd54c08ad71e84cb223dcc22f2e7f613d3157d471908a5c6256f
+ size 86672201
without_preprocess/model_card.json ADDED
@@ -0,0 +1,15 @@
+ {
+   "variant": "without_preprocess",
+   "experiment": "uchen_ume_binary",
+   "best_checkpoint": "best_stage_c_last_blocks.pt",
+   "val_macro_f1": 0.7705722639933166,
+   "val_accuracy": 0.8461538461538461,
+   "epoch": 3,
+   "test_metrics": {
+     "loss": 0.48820294297059763,
+     "accuracy": 0.8073817762399077,
+     "macro_f1": 0.7078823289680483,
+     "weighted_f1": 0.8394339697286689,
+     "auc_roc": 0.9698679503367003
+   }
+ }
without_preprocess/results.json ADDED
@@ -0,0 +1,17 @@
+ {
+   "experiment": "uchen_ume_binary",
+   "stage_run": "test",
+   "test_metrics": {
+     "loss": 0.48820294297059763,
+     "accuracy": 0.8073817762399077,
+     "macro_f1": 0.7078823289680483,
+     "weighted_f1": 0.8394339697286689,
+     "auc_roc": 0.9698679503367003
+   },
+   "history": {},
+   "report": " precision recall f1-score support\n\n uchen 0.37 0.98 0.54 99\n ume 1.00 0.79 0.88 768\n\n accuracy 0.81 867\n macro avg 0.68 0.88 0.71 867\nweighted avg 0.93 0.81 0.84 867\n",
+   "splits_file": "/root/script-classification-model-train/experiments/uchen_ume_binary/checkpoints/uchen_ume_binary/splits.json",
+   "skip_stage_c": false,
+   "stage_c_skip_reason": null,
+   "best_checkpoint": "best_stage_c_last_blocks.pt"
+ }