karma689 committed on
Commit 9aa85b7 · verified · 1 Parent(s): 43d1e2e

update README.md

Files changed (1)
  1. README.md +71 -0
README.md ADDED
@@ -0,0 +1,71 @@
+ ---
+ language:
+ - bo
+ library_name: transformers
+ tags:
+ - image-classification
+ - dinov3
+ - tibetan
+ - manuscript
+ - binary-classification
+ - vision
+ datasets:
+ - OpenPecha/BDRC-Script-Data
+ metrics:
+ - accuracy
+ - f1
+ - auc_roc
+ base_model: facebook/dinov3-vits16-pretrain-lvd1689m
+ ---
+
+ # Uchen-Ume Binary Script Classifier
+
+ This model is a fine-tuned version of **Meta's DINOv3-ViT-S/16** for binary classification of Tibetan scripts (Uchen vs. Ume). It serves as the "Router" stage of a hierarchical classification pipeline.
+
+ ## Model Details
+
+ ### Model Description
+
+ The model was developed to provide a high-reliability baseline for separating formal block scripts (**Uchen**) from cursive script families (**Ume**). By focusing on global page geometry rather than local character patches, it achieves high accuracy on whole-page manuscript scans.
+
+ - **Developed by:** OpenPecha / [Your Name]
+ - **Model type:** Vision Transformer (ViT)
+ - **Language(s):** Tibetan (Classical/Manuscript)
+ - **Finetuned from model:** facebook/dinov3-vits16-pretrain-lvd1689m
+
+ ## Uses
+
+ ### Direct Use
+
+ This model is intended to be used as a **pre-processing filter** or **router**. It can automatically sort large digital archives into Uchen and Ume folders, so that each folder can be processed by a specialized downstream OCR engine.
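
The routing step described above can be sketched as follows. This is a minimal illustration rather than the pipeline's actual implementation: `route_pages` is a hypothetical helper, `classify` stands in for any callable that maps a page image path to a label string (for example a wrapper around this model), and the folder names simply mirror whatever labels that callable returns.

```python
import shutil
from pathlib import Path
from typing import Callable, Dict, Iterable


def route_pages(
    pages: Iterable[Path],
    classify: Callable[[Path], str],
    out_dir: Path,
) -> Dict[str, int]:
    """Copy each page into out_dir/<label>/ and return per-label counts.

    `classify` is assumed to return a label string such as "Uchen" or "Ume";
    the exact label names depend on the model's id2label mapping.
    """
    counts: Dict[str, int] = {}
    for page in pages:
        label = classify(page)
        target = out_dir / label
        # Create the per-script folder on first use
        target.mkdir(parents=True, exist_ok=True)
        # Copy rather than move, so the source archive stays untouched
        shutil.copy2(page, target / page.name)
        counts[label] = counts.get(label, 0) + 1
    return counts
```

Copying instead of moving keeps the original archive intact, which makes it safe to re-run the router after the classifier is updated.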
+
+ ### Out-of-Scope Use
+
+ - Classification of modern printed Tibetan fonts (untested).
+ - Recognition of non-Tibetan scripts (Sanskrit, Lantsa, etc.).
+ - Character-level recognition (OCR).
+
+ ## Bias, Risks, and Limitations
+
+ The model was trained primarily on BDRC (Buddhist Digital Resource Center) manuscript scans. It may struggle with:
+ - Extremely faint or damaged woodblock prints.
+ - Multi-script pages that contain a roughly equal mix of Uchen and Ume.
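
One possible way to handle such hard cases is to route only confident predictions and flag the rest for manual review. The sketch below is an assumption, not part of the released model: `route_or_flag` is a hypothetical helper, the 0.9 threshold is an arbitrary starting point, and the logits are assumed to be plain floats (for instance from `outputs.logits[0].tolist()`). Softmax scores are not calibrated probabilities, so the threshold would need tuning on held-out pages.

```python
import math
from typing import List, Optional, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of raw scores."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route_or_flag(
    logits: List[float],
    labels: List[str],
    threshold: float = 0.9,  # hypothetical cut-off, tune on validation data
) -> Tuple[Optional[str], float]:
    """Return (label, confidence), or (None, confidence) if below threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[best]
    if confidence < threshold:
        # Ambiguous page (e.g. mixed scripts or heavy damage): leave it unrouted
        return None, confidence
    return labels[best], confidence
```

A page whose two logits are close, as might happen on a mixed-script folio, returns `None` and can be set aside instead of being sent to the wrong OCR engine.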
+
+ ## How to Get Started with the Model
+
+ ```python
+ from transformers import AutoImageProcessor, AutoModelForImageClassification
+ import torch
+ from PIL import Image
+
+ # "your-username/uchen-ume-classifier" is a placeholder repository ID
+ processor = AutoImageProcessor.from_pretrained("your-username/uchen-ume-classifier")
+ model = AutoModelForImageClassification.from_pretrained("your-username/uchen-ume-classifier")
+
+ # Load a whole-page scan and convert it to RGB before preprocessing
+ image = Image.open("manuscript_page.jpg").convert("RGB")
+ inputs = processor(images=image, return_tensors="pt")
+
+ # Run inference without tracking gradients
+ with torch.no_grad():
+     outputs = model(**inputs)
+     # Index of the highest-scoring class
+     prediction = outputs.logits.argmax(-1).item()
+
+ # Map the class index to its label name
+ print(f"Detected Script: {model.config.id2label[prediction]}")