# Tibetan Script Router (DINOv3-ViT-S)
This model is a fine-tuned version of Meta's DINOv3-ViT-S/16 specifically designed for high-precision binary classification of Tibetan scripts. It acts as the primary "Router" in a hierarchical classification pipeline, distinguishing between formal block scripts (Uchen) and cursive families (Ume).
## Model Details
- Project Name: The BDRC Etext Corpus
- Developed by: Dharmaduta
- Specifications provided by: Buddhist Digital Resource Center (BDRC)
- Funded by: Khyentse Foundation
- Model type: Vision Transformer (ViT)
- License: Apache 2.0
- Fine-tuned from: facebook/dinov3-vits16-pretrain-lvd1689m
## Dataset & Class Distribution
The model was trained on the openpecha/uchen-ume-classification dataset, a training set of 4,572 images balanced evenly across the two classes.
The binary classes were mapped from the following granular script types:
### 1. Uchen (Class 0): 2,286 Total Samples

| Granular Script Type | Sample Count |
|---|---|
| uchen_sugdring | 1,670 |
| uchen_sugthung | 616 |
### 2. Ume (Class 1): 2,286 Total Samples

| Granular Script Type | Sample Count |
|---|---|
| petsuk | 1,388 |
| tsegdrig | 749 |
| peri | 614 |
| druthung | 207 |
| tsumachug | 178 |
| yigchung | 166 |
| drudring | 132 |
| drathung | 129 |
| druring | 119 |
| khyuyig | 113 |
| dhumri | 98 |
| tsugchung | 77 |
| trinyig | 42 |
Note: Classes labeled "Difficult," "Multi-script," and "Non-Tibetan" were excluded to maintain a clean training signal for the Uchen/Ume boundary.
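The granular-to-binary label mapping described above can be sketched as a small lookup, with excluded categories rejected explicitly. This is an illustrative reconstruction, not the dataset's actual preprocessing code; the function name is hypothetical.

```python
# Sketch of the granular -> binary label mapping (types taken from
# the tables above; the helper itself is hypothetical).
UCHEN_TYPES = {"uchen_sugdring", "uchen_sugthung"}
UME_TYPES = {
    "petsuk", "tsegdrig", "peri", "druthung", "tsumachug", "yigchung",
    "drudring", "drathung", "druring", "khyuyig", "dhumri", "tsugchung",
    "trinyig",
}

def to_binary_label(granular: str) -> int:
    """Map a granular script type to Class 0 (Uchen) or Class 1 (Ume)."""
    if granular in UCHEN_TYPES:
        return 0
    if granular in UME_TYPES:
        return 1
    # "Difficult", "Multi-script", and "Non-Tibetan" were excluded
    raise ValueError(f"Excluded or unknown script type: {granular}")
```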
## Performance Summary
The model achieved its peak performance at Stage B (partial backbone unfreezing of the last two transformer blocks).
- Test Accuracy: 98.95%
- Macro F1-Score: 0.984
- AUC-ROC: 0.9988
### Confusion Matrix
| Predicted \ Actual | Uchen | Ume |
|---|---|---|
| Uchen | 159 | 2 |
| Ume | 6 | 595 |
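The headline metrics can be recomputed from the confusion matrix above (rows are predicted labels, columns are actual labels), as a sanity check:

```python
# Recompute accuracy and macro F1 from the confusion matrix above.
# Treating Uchen as the positive class:
tp, fp = 159, 2   # predicted-Uchen row: actual Uchen, actual Ume
fn, tn = 6, 595   # predicted-Ume row: actual Uchen, actual Ume

accuracy = (tp + tn) / (tp + fp + fn + tn)

def f1(tp: int, fp: int, fn: int) -> float:
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

f1_uchen = f1(159, 2, 6)   # Uchen as positive class
f1_ume = f1(595, 6, 2)     # Ume as positive class
macro_f1 = (f1_uchen + f1_ume) / 2

print(f"accuracy={accuracy:.4f}  macro_f1={macro_f1:.3f}")
# accuracy=0.9895  macro_f1=0.984  (matches the reported figures)
```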
## How to Get Started
```python
from transformers import AutoImageProcessor, AutoModelForImageClassification
import torch
from PIL import Image

# Note: gated-access approval for DINOv3 is required
model_id = "openpecha/uchen-ume-classifier"
processor = AutoImageProcessor.from_pretrained(model_id)
model = AutoModelForImageClassification.from_pretrained(model_id)

image = Image.open("manuscript_page.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

prediction = outputs.logits.argmax(-1).item()
print(f"Detected Script: {model.config.id2label[prediction]}")
```
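Because this model serves as the router in a larger pipeline, downstream code may want a confidence score rather than just the argmax, so that low-confidence pages can be deferred. The sketch below shows one way to do this with a softmax over the logits; the `logits` tensor stands in for `outputs.logits` from the snippet above, and both the label mapping and the 0.9 threshold are illustrative assumptions.

```python
# Sketch: turning router logits into a thresholded routing decision.
import torch

id2label = {0: "Uchen", 1: "Ume"}      # assumed label mapping
logits = torch.tensor([[4.2, -1.3]])   # stand-in for outputs.logits

probs = torch.softmax(logits, dim=-1)[0]
confidence, prediction = probs.max(dim=-1)

THRESHOLD = 0.9  # hypothetical; tune on a held-out set
if confidence.item() < THRESHOLD:
    decision = "defer to manual review"
else:
    decision = id2label[prediction.item()]
print(decision, round(confidence.item(), 3))
```

Thresholding here trades a small amount of coverage for precision at the Uchen/Ume boundary, which is usually the right trade-off for a router feeding specialized downstream models.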