---
language:
- bo
license: apache-2.0
tags:
- image-classification
- tibetan
- script-classification
- dinov3
- binary
library_name: transformers
pipeline_tag: image-classification
base_model: facebook/dinov3-vits16-pretrain-lvd1689m
datasets:
- BDRC/gyuyig-tsugdri-binary-balanced-script-classification-dataset
metrics:
- f1
- accuracy
- auc
---

# Gyuyig vs Tsugdri Binary Script Classifier (DINOv3 ViT-S)

Fine-tuned [DINOv3 ViT-S](https://huggingface.co/facebook/dinov3-vits16-pretrain-lvd1689m) for parent script classification: **Gyuyig**, **Tsugdri**

**Experiment:** `dinov3_gyuyig_tsugdri_binary` (`gyuyig_tsugdri_binary_classification`)

**Pooling:** ViT **CLS token** (`last_hidden_state[:, 0, :]`)

**Weights:** `final_model.pt` (best validation macro-F1 across stages A/B/C)

**Warm-start:** [BDRC/4-class-balanced-script-classifier](https://huggingface.co/BDRC/4-class-balanced-script-classifier) (`final_model.pt`; prior test acc 92.1%, macro-F1 0.921)

## Data

| Split | Source |
|-------|--------|
| Train / val / test | [BDRC/gyuyig-tsugdri-binary-balanced-script-classification-dataset](https://huggingface.co/datasets/BDRC/gyuyig-tsugdri-binary-balanced-script-classification-dataset) |

Test split: balanced benchmark (60 images per parent class, held out of training).

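
The CLS-token pooling noted above (`last_hidden_state[:, 0, :]`) can be illustrated with a minimal PyTorch sketch. This is not the repository's code: the `ClsHead` class, the use of dropout directly before a single linear layer, and the token count of 201 are assumptions made for illustration. The hidden size of 384 is the standard ViT-S width, and a random tensor stands in for the backbone output so the snippet runs without downloading weights.

```python
import torch
from torch import nn

# Assumed shapes: ViT-S hidden size is 384; the token count is illustrative only.
BATCH, TOKENS, HIDDEN, NUM_CLASSES = 2, 201, 384, 2


class ClsHead(nn.Module):
    """Hypothetical classification head: dropout + linear layer on the CLS embedding."""

    def __init__(self, hidden=HIDDEN, num_classes=NUM_CLASSES, dropout=0.1):
        super().__init__()
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(hidden, num_classes)

    def forward(self, last_hidden_state):
        # The CLS token sits at sequence index 0; keep it and drop all other tokens.
        cls = last_hidden_state[:, 0, :]
        return self.fc(self.dropout(cls))


torch.manual_seed(0)
# Stand-in for the backbone's `last_hidden_state` (batch, tokens, hidden).
last_hidden_state = torch.randn(BATCH, TOKENS, HIDDEN)

head = ClsHead().eval()  # eval() disables dropout for inference
with torch.no_grad():
    logits = head(last_hidden_state)

# One probability per class (Gyuyig, Tsugdri) for each image in the batch.
probs = logits.softmax(dim=-1)
print(probs.shape)
```

In a real pipeline the random tensor would be replaced by the output of the fine-tuned backbone, and the head weights would come from `final_model.pt`; the checkpoint's exact layout is not documented here, so inspect it before loading.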
## Preprocessing

| Split | Mode | Size |
|-------|------|-----:|
| train | `center_crop` | 224 |
| val | `center_crop` | 224 |
| test | `center_crop` | 224 |

## Validation metrics (n=60)

| Metric | Value |
|--------|------:|
| Accuracy | 91.7% |
| Macro F1 | 0.916 |
| Weighted F1 | 0.916 |
| AUC-ROC | 0.931 |
| Loss | 0.3915 |

**Best checkpoint:** `best_stage_c_last_blocks.pt`, epoch 1, val macro-F1 0.916

### Per-class (validation)

```
              precision    recall  f1-score   support

      Gyuyig       0.88      0.97      0.92        30
     Tsugdri       0.96      0.87      0.91        30

    accuracy                           0.92        60
   macro avg       0.92      0.92      0.92        60
weighted avg       0.92      0.92      0.92        60
```

## Test / benchmark metrics (n=120)

| Metric | Value |
|--------|------:|
| Accuracy | 85.0% |
| Macro F1 | 0.848 |
| Weighted F1 | 0.848 |
| AUC-ROC | 0.930 |
| Loss | 0.4047 |

### Per-class (test)

```
              precision    recall  f1-score   support

      Gyuyig       0.78      0.97      0.87        60
     Tsugdri       0.96      0.73      0.83        60

    accuracy                           0.85       120
   macro avg       0.87      0.85      0.85       120
weighted avg       0.87      0.85      0.85       120
```

## Training

| Stage | Epochs | LR head | LR backbone | Unfrozen blocks |
|-------|-------:|--------:|------------:|----------------:|
| A | 7 | 0.0005 | n/a | 0 |
| B | 10 | 0.0001 | 1e-05 | 4 |
| C | 12 | 5e-05 | 1.5e-05 | 8 |

| Setting | Value |
|---------|-------|
| Scheduler | `cosine_warmup` |
| Class weights | `custom` |
| Label smoothing | 0.05 |
| Dropout | 0.1 |

## Confusion matrix (test)

![Confusion matrix](confusion_matrix.png)

| True \ Pred | Gyuyig | Tsugdri |
|---|---:|---:|
| **Gyuyig** | 58 | 2 |
| **Tsugdri** | 16 | 44 |

## Files

| File | Description |
|------|-------------|
| `final_model.pt` | Best val-F1 weights + label maps |
| `results.json` | Full metrics, history, warm-start info |
| `config.yaml` | Training config |
| `model_card.json` | Summary metadata |
| `confusion_matrix.json` / `.png` | Test CM |
| `training_history.png` | Stage loss / val F1 curves |
| `split_stats.json` / `.md` | Per-class split counts |
| `inference.py` | Classify image paths |
| `requirements-inference.txt` | Pip deps |

## Inference

```bash
pip install -r requirements-inference.txt
python inference.py --checkpoint final_model.pt --image path/to/page.jpg --preprocess resize_letterbox --preprocess-size 224
```

## Reproduce training

```bash
python experiments/gyuyig-tsugdri/train.py --config experiments/gyuyig-tsugdri/config_warmstart.yaml
```

**Model repo:** [BDRC/gyuyig-tsugdri-binary-script-classifier](https://huggingface.co/BDRC/gyuyig-tsugdri-binary-script-classifier)
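
The per-class test metrics can be cross-checked against the test confusion matrix reported above (rows are true labels, columns are predictions). The following is a small standard-library sketch, not part of the repository; the helper name `per_class_metrics` is made up for illustration.

```python
# Test confusion matrix from this card: rows = true label, columns = predicted label.
labels = ["Gyuyig", "Tsugdri"]
cm = [
    [58, 2],   # true Gyuyig
    [16, 44],  # true Tsugdri
]


def per_class_metrics(cm, k):
    """Return (precision, recall, f1) for class index k of a square confusion matrix."""
    tp = cm[k][k]
    fp = sum(cm[i][k] for i in range(len(cm))) - tp  # predicted k, but true label differs
    fn = sum(cm[k]) - tp                             # true k, but predicted otherwise
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


total = sum(sum(row) for row in cm)
accuracy = sum(cm[i][i] for i in range(len(cm))) / total
metrics = {name: per_class_metrics(cm, k) for k, name in enumerate(labels)}
macro_f1 = sum(m[2] for m in metrics.values()) / len(metrics)

print(f"accuracy={accuracy:.3f} macro_f1={macro_f1:.3f}")
for name, (p, r, f1) in metrics.items():
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

The computed values agree with the reported test accuracy (85.0%) and macro F1 (0.848). They also make the error pattern explicit: 16 of the 18 test errors are Tsugdri pages predicted as Gyuyig, which is why Tsugdri recall (0.73) is well below Gyuyig recall (0.97).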