metadata
language:
  - bo
license: apache-2.0
tags:
  - image-classification
  - tibetan
  - script-classification
  - dinov3
  - binary
library_name: transformers
pipeline_tag: image-classification
base_model: facebook/dinov3-vits16-pretrain-lvd1689m
datasets:
  - BDRC/gyuyig-tsugdri-binary-balanced-script-classification-dataset
metrics:
  - f1
  - accuracy
  - auc

Gyuyig vs Tsugdri Binary Script Classifier (DINOv3 ViT-S)

A DINOv3 ViT-S model fine-tuned to classify images of Tibetan script into one of two parent script classes: Gyuyig or Tsugdri.

Experiment: dinov3_gyuyig_tsugdri_binary (gyuyig_tsugdri_binary_classification)
Pooling: ViT CLS token (last_hidden_state[:, 0, :])
Weights: final_model.pt (best validation macro-F1 across stages A/B/C)

Warm-start: initialized from BDRC/4-class-balanced-script-classifier (final_model.pt; prior test accuracy 92.1%, macro-F1 0.921)
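
The CLS pooling named above can be illustrated with a minimal sketch. A random NumPy array stands in for the backbone output, and the shapes (201 tokens, hidden size 384), the single linear head, and the label order are assumptions made for illustration, not details confirmed by this card.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the backbone output; shapes are assumptions for illustration.
batch, num_tokens, hidden = 2, 201, 384
last_hidden_state = rng.standard_normal((batch, num_tokens, hidden))

# CLS pooling: keep only the first token of every sequence.
cls = last_hidden_state[:, 0, :]  # shape (batch, hidden)

# Hypothetical linear head mapping the pooled vector to two logits.
weight = rng.standard_normal((hidden, 2)) * 0.02
bias = np.zeros(2)
logits = cls @ weight + bias  # shape (batch, 2)

# Hypothetical label order; the real mapping is stored with the checkpoint.
labels = ["Gyuyig", "Tsugdri"]
predictions = [labels[i] for i in logits.argmax(axis=1)]
print(cls.shape, logits.shape, predictions)
```

With a real model, `last_hidden_state` would come from the fine-tuned backbone rather than a random generator; the indexing step is the same.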

Data

Test split: balanced benchmark (60 images per parent class, held out of training).

Preprocessing

| Split | Mode | Size |
|---|---|---|
| train | center_crop | 224 |
| val | center_crop | 224 |
| test | center_crop | 224 |
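
As a rough illustration of what a 224-pixel center crop selects, the sketch below computes the crop box for a given image size. The function name is hypothetical, and the card does not say whether the real pipeline resizes the image before cropping, so this shows only the cropping arithmetic.

```python
def center_crop_box(width: int, height: int, size: int = 224) -> tuple:
    """Return (left, upper, right, lower) for a centered size x size crop.

    Hypothetical helper: assumes the image is at least `size` pixels on
    each side and that no resize happens before the crop.
    """
    if width < size or height < size:
        raise ValueError("image is smaller than the crop size")
    left = (width - size) // 2
    upper = (height - size) // 2
    return (left, upper, left + size, upper + size)


print(center_crop_box(1000, 600))  # (388, 188, 612, 412)
```

The tuple follows the (left, upper, right, lower) ordering used by Pillow's `Image.crop`, so it could be passed there directly.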

Validation metrics (n=60)

| Metric | Value |
|---|---|
| Accuracy | 91.7% |
| Macro F1 | 0.916 |
| Weighted F1 | 0.916 |
| AUC-ROC | 0.931 |
| Loss | 0.3915 |

Best checkpoint: best_stage_c_last_blocks.pt (epoch 1, validation macro-F1 0.916)

Per-class (validation)

              precision    recall  f1-score   support

      Gyuyig       0.88      0.97      0.92        30
     Tsugdri       0.96      0.87      0.91        30

    accuracy                           0.92        60
   macro avg       0.92      0.92      0.92        60
weighted avg       0.92      0.92      0.92        60

Test / benchmark metrics (n=120)

| Metric | Value |
|---|---|
| Accuracy | 85.0% |
| Macro F1 | 0.848 |
| Weighted F1 | 0.848 |
| AUC-ROC | 0.930 |
| Loss | 0.4047 |

Per-class (test)

              precision    recall  f1-score   support

      Gyuyig       0.78      0.97      0.87        60
     Tsugdri       0.96      0.73      0.83        60

    accuracy                           0.85       120
   macro avg       0.87      0.85      0.85       120
weighted avg       0.87      0.85      0.85       120

Training

| Stage | Epochs | LR head | LR backbone | Unfrozen blocks |
|---|---|---|---|---|
| A | 7 | 0.0005 | - | 0 |
| B | 10 | 0.0001 | 1e-05 | 4 |
| C | 12 | 5e-05 | 1.5e-05 | 8 |

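
The three-stage schedule can be written down as data, which makes it easy to see which backbone blocks would train in each stage. This is a sketch under assumptions: a 12-block backbone, and "unfrozen blocks" counted from the last block backwards (the checkpoint name best_stage_c_last_blocks.pt hints at this, but the card does not state it). All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stage:
    name: str
    epochs: int
    lr_head: float
    lr_backbone: Optional[float]  # None while the backbone is fully frozen
    unfrozen_blocks: int


# Values copied from the training table.
STAGES = [
    Stage("A", 7, 5e-4, None, 0),
    Stage("B", 10, 1e-4, 1e-5, 4),
    Stage("C", 12, 5e-5, 1.5e-5, 8),
]

NUM_BLOCKS = 12  # assumption: transformer block count of the backbone


def trainable_blocks(stage: Stage, num_blocks: int = NUM_BLOCKS) -> list:
    """Indices of the blocks that would receive gradients (last N blocks)."""
    return list(range(num_blocks - stage.unfrozen_blocks, num_blocks))


def learning_rates(stage: Stage) -> dict:
    """Per-group learning rates; the backbone group exists only if unfrozen."""
    groups = {"head": stage.lr_head}
    if stage.unfrozen_blocks > 0:
        groups["backbone"] = stage.lr_backbone
    return groups


for stage in STAGES:
    print(stage.name, trainable_blocks(stage), learning_rates(stage))
```

In a real training loop these groups would typically become optimizer parameter groups, with the remaining backbone parameters excluded from gradient updates.
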

| Setting | Value |
|---|---|
| Scheduler | cosine_warmup |
| Class weights | custom |
| Label smoothing | 0.05 |
| Dropout | 0.1 |
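
To show what a label smoothing value of 0.05 does to a two-class target, here is a small sketch. It assumes the common convention of mixing the one-hot target with a uniform distribution (the one used by PyTorch's `CrossEntropyLoss(label_smoothing=...)`); the card does not confirm which convention the training code uses, and the custom class weights are not modeled here.

```python
import math


def smoothed_targets(true_index: int, num_classes: int, smoothing: float) -> list:
    """Mix a one-hot target with a uniform distribution over all classes."""
    off = smoothing / num_classes
    return [
        (1.0 - smoothing + off) if i == true_index else off
        for i in range(num_classes)
    ]


def log_softmax(logits: list) -> list:
    """Numerically stable log-softmax."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_sum for x in logits]


def smoothed_cross_entropy(logits: list, true_index: int, smoothing: float = 0.05) -> float:
    """Cross-entropy between the smoothed target and the predicted distribution."""
    targets = smoothed_targets(true_index, len(logits), smoothing)
    return -sum(t * lp for t, lp in zip(targets, log_softmax(logits)))


print(smoothed_targets(0, 2, 0.05))             # approximately [0.975, 0.025]
print(smoothed_cross_entropy([2.0, -1.0], 0))   # slightly above the unsmoothed loss
```

Under this convention the correct class keeps a target of about 0.975 and the other class receives about 0.025, which discourages very large logit margins.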

Confusion matrix (test)

| True \ Pred | Gyuyig | Tsugdri |
|---|---|---|
| Gyuyig | 58 | 2 |
| Tsugdri | 16 | 44 |
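
The headline test metrics can be recomputed from this confusion matrix. The sketch below uses only the four counts shown above; the function name is hypothetical.

```python
def report_from_confusion(cm: list, labels: list) -> dict:
    """Accuracy, per-class precision/recall/F1, and macro F1.

    cm[i][j] is the number of samples with true class i predicted as class j.
    """
    n = len(labels)
    total = sum(sum(row) for row in cm)
    report = {"accuracy": sum(cm[i][i] for i in range(n)) / total}
    f1_scores = []
    for i, label in enumerate(labels):
        tp = cm[i][i]
        predicted = sum(cm[r][i] for r in range(n))  # column sum
        actual = sum(cm[i])                          # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
        f1_scores.append(f1)
    report["macro_f1"] = sum(f1_scores) / n
    return report


test_cm = [[58, 2], [16, 44]]  # rows: true Gyuyig, true Tsugdri
report = report_from_confusion(test_cm, ["Gyuyig", "Tsugdri"])
print(round(report["accuracy"], 3), round(report["macro_f1"], 3))  # 0.85 0.848
```

The counts show that most errors are Tsugdri pages predicted as Gyuyig (16 of 60), which accounts for the lower Tsugdri recall of 0.73.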

Files

| File | Description |
|---|---|
| final_model.pt | Best val-F1 weights + label maps |
| results.json | Full metrics, history, warm-start info |
| config.yaml | Training config |
| model_card.json | Summary metadata |
| confusion_matrix.json / .png | Test confusion matrix |
| training_history.png | Stage loss / val F1 curves |
| split_stats.json / .md | Per-class split counts |
| inference.py | Classify image paths |
| requirements-inference.txt | Pip dependencies |

Inference

pip install -r requirements-inference.txt
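python inference.py --checkpoint final_model.pt --image path/to/page.jpg --preprocess resize_letterbox --preprocess-size 224

Note: this command uses resize_letterbox, whereas the metrics reported in this card were computed with center_crop at size 224.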

Reproduce training

python experiments/gyuyig-tsugdri/train.py --config experiments/gyuyig-tsugdri/config_warmstart.yaml

Model repo: BDRC/gyuyig-tsugdri-binary-script-classifier