---
language:
- bo
license: apache-2.0
tags:
- image-classification
- tibetan
- script-classification
- dinov3
- binary
library_name: transformers
pipeline_tag: image-classification
base_model: facebook/dinov3-vits16-pretrain-lvd1689m
datasets:
- BDRC/gyuyig-tsugdri-binary-balanced-script-classification-dataset
metrics:
- f1
- accuracy
- auc
---
# Gyuyig vs Tsugdri Binary Script Classifier (DINOv3 ViT-S)
A fine-tuned [DINOv3 ViT-S](https://huggingface.co/facebook/dinov3-vits16-pretrain-lvd1689m) that classifies Tibetan page images into one of two parent script classes: **Gyuyig** or **Tsugdri**.
**Experiment:** `dinov3_gyuyig_tsugdri_binary` (`gyuyig_tsugdri_binary_classification`)
**Pooling:** ViT **CLS token** (`last_hidden_state[:, 0, :]`)
**Weights:** `final_model.pt` (best validation macro-F1 across stages A/B/C)
**Warm-start:** [BDRC/4-class-balanced-script-classifier](https://huggingface.co/BDRC/4-class-balanced-script-classifier) (`final_model.pt`; prior test accuracy 92.1%, macro-F1 0.921)
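As an illustration of the pooling expression above, the sketch below uses a NumPy array as a stand-in for the backbone's `last_hidden_state` and applies a hypothetical two-class linear head. The shapes, the random weights, and the class order are assumptions for illustration only; the trained weights and the real label maps live in `final_model.pt`.

```python
import numpy as np

# Stand-in for the backbone output: (batch, tokens, hidden_dim).
# These sizes are illustrative assumptions, not read from the model.
batch, tokens, hidden_dim = 2, 5, 8
rng = np.random.default_rng(0)
last_hidden_state = rng.normal(size=(batch, tokens, hidden_dim))

# CLS pooling: keep only the first token of every sequence.
cls_embedding = last_hidden_state[:, 0, :]          # (batch, hidden_dim)

# Hypothetical two-class linear head with random weights.
weight = rng.normal(size=(hidden_dim, 2))
bias = np.zeros(2)
logits = cls_embedding @ weight + bias              # (batch, 2)

# Numerically stable softmax over the class axis.
shifted = logits - logits.max(axis=1, keepdims=True)
probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)

labels = ["Gyuyig", "Tsugdri"]                      # assumed class order
predictions = [labels[i] for i in probs.argmax(axis=1)]
```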
## Data
| Split | Source |
|-------|--------|
| Train / val / test | [BDRC/gyuyig-tsugdri-binary-balanced-script-classification-dataset](https://huggingface.co/datasets/BDRC/gyuyig-tsugdri-binary-balanced-script-classification-dataset) |
Test split: a balanced benchmark of 60 images per parent class (120 in total), held out from training.
## Preprocessing
| Split | Mode | Size |
|-------|------|-----:|
| train | `center_crop` | 224 |
| val | `center_crop` | 224 |
| test | `center_crop` | 224 |
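A minimal sketch of which region a 224-pixel center crop selects, assuming the crop is taken directly from the image without a prior resize (the card does not say whether `center_crop` resizes first). The helper name `center_crop_box` is hypothetical; the returned tuple follows the `(left, upper, right, lower)` box convention used by Pillow's `Image.crop`.

```python
def center_crop_box(width: int, height: int, size: int = 224) -> tuple[int, int, int, int]:
    """Return the (left, top, right, bottom) box of a centered size x size crop."""
    if width < size or height < size:
        raise ValueError("image is smaller than the crop size")
    left = (width - size) // 2
    top = (height - size) // 2
    return left, top, left + size, top + size


# Example: a 1000 x 400 page image (hypothetical dimensions).
box = center_crop_box(1000, 400)
print(box)  # (388, 88, 612, 312)
```

For images much wider than they are tall, a direct center crop keeps only a small part of the page, so the mode used at inference should match the one used for training.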
## Validation metrics (n=60)
| Metric | Value |
|--------|------:|
| Accuracy | 91.7% |
| Macro F1 | 0.916 |
| Weighted F1 | 0.916 |
| AUC-ROC | 0.931 |
| Loss | 0.3915 |
**Best checkpoint:** `best_stage_c_last_blocks.pt` (stage C, epoch 1; validation macro-F1 0.916)
### Per-class (validation)
```
              precision    recall  f1-score   support

      Gyuyig       0.88      0.97      0.92        30
     Tsugdri       0.96      0.87      0.91        30

    accuracy                           0.92        60
   macro avg       0.92      0.92      0.92        60
weighted avg       0.92      0.92      0.92        60
```
## Test / benchmark metrics (n=120)
| Metric | Value |
|--------|------:|
| Accuracy | 85.0% |
| Macro F1 | 0.848 |
| Weighted F1 | 0.848 |
| AUC-ROC | 0.930 |
| Loss | 0.4047 |
### Per-class (test)
```
              precision    recall  f1-score   support

      Gyuyig       0.78      0.97      0.87        60
     Tsugdri       0.96      0.73      0.83        60

    accuracy                           0.85       120
   macro avg       0.87      0.85      0.85       120
weighted avg       0.87      0.85      0.85       120
```
## Training
| Stage | Epochs | LR head | LR backbone | Unfrozen blocks |
|-------|-------:|--------:|------------:|----------------:|
| A | 7 | 0.0005 | — | 0 |
| B | 10 | 0.0001 | 1e-05 | 4 |
| C | 12 | 5e-05 | 1.5e-05 | 8 |
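The schedule above can be read as progressive unfreezing. The sketch below works out which transformer blocks would be trainable in each stage, assuming that "unfrozen blocks" counts from the end of the backbone (suggested by the checkpoint name `best_stage_c_last_blocks.pt`, but not confirmed by this card) and that the ViT-S backbone has 12 blocks. `STAGES` and `trainable_blocks` are hypothetical names, not part of the training code.

```python
NUM_BLOCKS = 12  # depth of a ViT-S backbone

# Stage settings copied from the table above.
STAGES = {
    "A": {"epochs": 7, "lr_head": 5e-4, "lr_backbone": None, "unfrozen": 0},
    "B": {"epochs": 10, "lr_head": 1e-4, "lr_backbone": 1e-5, "unfrozen": 4},
    "C": {"epochs": 12, "lr_head": 5e-5, "lr_backbone": 1.5e-5, "unfrozen": 8},
}


def trainable_blocks(unfrozen: int, num_blocks: int = NUM_BLOCKS) -> list[int]:
    """Indices of the last `unfrozen` blocks; an empty list means head-only training."""
    if not 0 <= unfrozen <= num_blocks:
        raise ValueError("unfrozen must be between 0 and num_blocks")
    return list(range(num_blocks - unfrozen, num_blocks))


for name, cfg in STAGES.items():
    print(name, trainable_blocks(cfg["unfrozen"]))
# A []
# B [8, 9, 10, 11]
# C [4, 5, 6, 7, 8, 9, 10, 11]
```

In a PyTorch training loop, the same indices would typically decide which blocks get `requires_grad=True` and join the backbone parameter group with the lower learning rate.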
### Other settings
| Setting | Value |
|---------|-------|
| Scheduler | `cosine_warmup` |
| Class weights | `custom` |
| Label smoothing | 0.05 |
| Dropout | 0.1 |
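Two of these settings are formulas, the `cosine_warmup` scheduler and label smoothing of 0.05, and both can be sketched in plain Python. The warmup length, the total step count, and the decay-to-zero floor below are assumptions, since the card does not state them. The smoothing follows the common convention of mixing the one-hot target with a uniform distribution over all classes (as PyTorch's `CrossEntropyLoss(label_smoothing=...)` does), which may differ from the training script.

```python
import math


def cosine_warmup_lr(step: int, total_steps: int, warmup_steps: int, base_lr: float) -> float:
    """Linear warmup to base_lr, then cosine decay toward zero."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


def smoothed_targets(true_class: int, num_classes: int = 2, smoothing: float = 0.05) -> list[float]:
    """One-hot target mixed with a uniform distribution over all classes."""
    uniform = smoothing / num_classes
    return [(1.0 - smoothing) * (i == true_class) + uniform for i in range(num_classes)]


def cross_entropy(logits: list[float], targets: list[float]) -> float:
    """Cross-entropy between soft targets and softmax(logits)."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return -sum(t * (z - log_norm) for z, t in zip(logits, targets))


# Stage A head learning rate (0.0005), with assumed step counts.
schedule = [cosine_warmup_lr(s, total_steps=100, warmup_steps=10, base_lr=5e-4) for s in range(100)]
targets = smoothed_targets(true_class=0)   # approximately [0.975, 0.025]
loss = cross_entropy([2.0, -1.0], targets)
```

With two classes, a smoothing value of 0.05 moves 0.025 of the target mass to the other class, so the loss never rewards fully saturated predictions.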
## Confusion matrix (test)
![Confusion matrix](confusion_matrix.png)
| True \ Pred | Gyuyig | Tsugdri |
|---|---:|---:|
| **Gyuyig** | 58 | 2 |
| **Tsugdri** | 16 | 44 |
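The per-class and aggregate test figures reported in this card follow directly from the confusion matrix. The sketch below recomputes them in plain Python; the variable and function names are illustrative.

```python
LABELS = ["Gyuyig", "Tsugdri"]
# Rows are true classes, columns are predicted classes (test split).
CM = [[58, 2],
      [16, 44]]


def per_class_metrics(cm, k):
    """Precision, recall, and F1 for class index k."""
    tp = cm[k][k]
    predicted = sum(row[k] for row in cm)   # column sum: everything predicted as k
    actual = sum(cm[k])                     # row sum: everything truly k
    precision = tp / predicted
    recall = tp / actual
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


total = sum(sum(row) for row in CM)
accuracy = sum(CM[i][i] for i in range(len(CM))) / total
scores = [per_class_metrics(CM, k) for k in range(len(CM))]
macro_f1 = sum(f1 for _, _, f1 in scores) / len(scores)

for label, (p, r, f1) in zip(LABELS, scores):
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
print(f"accuracy={accuracy:.3f} macro_f1={macro_f1:.3f}")
# Gyuyig: precision=0.78 recall=0.97 f1=0.87
# Tsugdri: precision=0.96 recall=0.73 f1=0.83
# accuracy=0.850 macro_f1=0.848
```

Most of the errors are Tsugdri pages predicted as Gyuyig (16 of 60), which accounts for the gap between Tsugdri precision and recall.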
## Files
| File | Description |
|------|-------------|
| `final_model.pt` | Weights with the best validation macro-F1, plus label maps |
| `results.json` | Full metrics, training history, and warm-start info |
| `config.yaml` | Training configuration |
| `model_card.json` | Summary metadata |
| `confusion_matrix.json` / `.png` | Test confusion matrix |
| `training_history.png` | Per-stage loss and validation F1 curves |
| `split_stats.json` / `.md` | Per-class split counts |
| `inference.py` | Script that classifies images given their paths |
| `requirements-inference.txt` | Pip dependencies for inference |
## Inference
```bash
pip install -r requirements-inference.txt
python inference.py --checkpoint final_model.pt --image path/to/page.jpg --preprocess center_crop --preprocess-size 224
```
## Reproduce training
```bash
python experiments/gyuyig-tsugdri/train.py --config experiments/gyuyig-tsugdri/config_warmstart.yaml
```
**Model repo:** [BDRC/gyuyig-tsugdri-binary-script-classifier](https://huggingface.co/BDRC/gyuyig-tsugdri-binary-script-classifier)