Update README with best full-page results and benchmark holdout metrics
README.md
CHANGED
|
@@ -1,40 +1,54 @@
|
|
| 1 |
---
|
| 2 |
language:
|
| 3 |
-
- bo
|
| 4 |
license: apache-2.0
|
| 5 |
tags:
|
| 6 |
-
- image-classification
|
| 7 |
-
- tibetan
|
| 8 |
-
- uchen
|
| 9 |
-
- ume
|
| 10 |
-
- script-classification
|
| 11 |
-
- dinov3
|
| 12 |
-
- fine-tuned
|
| 13 |
library_name: transformers
|
| 14 |
pipeline_tag: image-classification
|
| 15 |
base_model: facebook/dinov3-vits16-pretrain-lvd1689m
|
| 16 |
datasets:
|
| 17 |
-
- openpecha/uchen-ume-classification-benchmark
|
| 18 |
metrics:
|
| 19 |
-
- f1
|
| 20 |
-
- accuracy
|
| 21 |
model-index:
|
| 22 |
-
- name: Uchen-Ume Classifier (DINOv3 ViT-S)
|
| 23 |
-
|
| 24 |
-
|
| 25 |
-
|
| 26 |
-
|
| 27 |
-
|
| 28 |
-
|
| 29 |
-
|
| 30 |
-
|
| 31 |
-
|
| 32 |
-
|
| 33 |
-
|
| 34 |
-
|
| 35 |
-
|
| 36 |
-
|
| 37 |
-
|
| 38 |
---
|
| 39 |
|
| 40 |
# Uchen vs Umê Classifier (DINOv3 ViT-S)
|
|
@@ -47,16 +61,25 @@ Binary Tibetan script classifier: **Uchen** (དབུ་ཅན།, headed/print
|
|
| 47 |
|
| 48 |
**Use `without_preprocess/final_model.pt`** for production. This model was trained and evaluated on full manuscript pages with no preprocessing — what you get is what you deploy.
|
| 49 |
|
| 50 |
-
##
|
| 51 |
|
| 52 |
-
Test set = 867 images, work-stratified split, no overlap with training works.
|
| 53 |
|
| 54 |
-
|
|
| 55 |
-
|------
|
| 56 |
-
| **`without_preprocess/`** (recommended) |
|
| 57 |
-
| `
|
| 58 |
|
| 59 |
-
|
| 60 |
|
| 61 |
## Training data
|
| 62 |
|
|
@@ -64,7 +87,7 @@ The `without_preprocess` variant is trained and tested on full pages — no mism
|
|
| 64 |
|-------|------:|-----------:|-----:|------:|
|
| 65 |
| Uchen | ~3,124 | ~340 | ~290 | ~3,754 |
|
| 66 |
| Ume | ~5,986 | ~660 | ~561 | ~7,207 |
|
| 67 |
-
| **Total** | **9,110** | **1,000** | **851** | **10,961** |
|
| 68 |
|
| 69 |
**Uchen** includes: `uchen_sugthung`, `uchen_sugdring`, `uchen_sugring` (distinguished by descender length).
|
| 70 |
|
|
@@ -72,6 +95,8 @@ The `without_preprocess` variant is trained and tested on full pages — no mism
|
|
| 72 |
|
| 73 |
**Excluded:** `difficult`, `multi_scripts`, `non_tibetan`.
|
| 74 |
|
| 75 |
Splits are partitioned at the **work level** — all pages from the same manuscript (`W` prefix in the filename) stay in one split only.
|
| 76 |
|
| 77 |
## Architecture
|
|
@@ -136,6 +161,17 @@ label = "uchen" if probs[0] > probs[1] else "ume"
|
|
| 136 |
print(f"{label} ({probs.max():.1%})")
|
| 137 |
```
|
| 138 |
|
| 139 |
### Load the dataset
|
| 140 |
|
| 141 |
```python
|
|
@@ -145,6 +181,20 @@ ds = load_dataset("openpecha/uchen-ume-classification-benchmark")
|
|
| 145 |
train = ds["train"] # 9,110 images
|
| 146 |
val = ds["validation"] # 1,000 images
|
| 147 |
test = ds["test"] # 851 images
|
| 148 |
```
|
| 149 |
|
| 150 |
## Intended use
|
|
@@ -181,4 +231,4 @@ Manuscript image
|
|
| 181 |
|
| 182 |
## Acknowledgements
|
| 183 |
|
| 184 |
-
Developed by **Dharmaduta** for the **[Buddhist Digital Resource Center](https://www.bdrc.io)** (BDRC) Etext Corpus project, with funding from the **Khyentse Foundation**. Annotation guidelines by **Pentsok Rtsang**.
|
|
|
|
| 1 |
---
|
| 2 |
language:
|
| 3 |
+
- bo
|
| 4 |
license: apache-2.0
|
| 5 |
tags:
|
| 6 |
+
- image-classification
|
| 7 |
+
- tibetan
|
| 8 |
+
- uchen
|
| 9 |
+
- ume
|
| 10 |
+
- script-classification
|
| 11 |
+
- dinov3
|
| 12 |
+
- fine-tuned
|
| 13 |
library_name: transformers
|
| 14 |
pipeline_tag: image-classification
|
| 15 |
base_model: facebook/dinov3-vits16-pretrain-lvd1689m
|
| 16 |
datasets:
|
| 17 |
+
- openpecha/uchen-ume-classification-benchmark
|
| 18 |
metrics:
|
| 19 |
+
- f1
|
| 20 |
+
- accuracy
|
| 21 |
model-index:
|
| 22 |
+
- name: Uchen-Ume Classifier (DINOv3 ViT-S)
|
| 23 |
+
results:
|
| 24 |
+
- task:
|
| 25 |
+
type: image-classification
|
| 26 |
+
name: Tibetan Script Classification (Uchen vs Ume)
|
| 27 |
+
dataset:
|
| 28 |
+
name: openpecha/uchen-ume-classification-benchmark
|
| 29 |
+
type: openpecha/uchen-ume-classification-benchmark
|
| 30 |
+
split: test
|
| 31 |
+
metrics:
|
| 32 |
+
- name: Macro F1 (full page)
|
| 33 |
+
type: f1
|
| 34 |
+
value: 0.708
|
| 35 |
+
- name: Accuracy (full page)
|
| 36 |
+
type: accuracy
|
| 37 |
+
value: 0.807
|
| 38 |
+
- task:
|
| 39 |
+
type: image-classification
|
| 40 |
+
name: Held-out benchmark (60 pages, full page)
|
| 41 |
+
dataset:
|
| 42 |
+
name: openpecha/uchen-ume-classification-benchmark
|
| 43 |
+
type: openpecha/uchen-ume-classification-benchmark
|
| 44 |
+
split: benchmark
|
| 45 |
+
metrics:
|
| 46 |
+
- name: Macro F1 (full page)
|
| 47 |
+
type: f1
|
| 48 |
+
value: 0.848
|
| 49 |
+
- name: Accuracy (full page)
|
| 50 |
+
type: accuracy
|
| 51 |
+
value: 0.850
|
| 52 |
---
|
| 53 |
|
| 54 |
# Uchen vs Umê Classifier (DINOv3 ViT-S)
|
|
|
|
| 61 |
|
| 62 |
**Use `without_preprocess/final_model.pt`** for production. This model was trained and evaluated on full manuscript pages with no preprocessing — what you get is what you deploy.
|
| 63 |
|
| 64 |
+
## Best results (full pages)
|
| 65 |
|
| 66 |
+
Test set = 867 images, work-stratified split, no overlap with training works. Benchmark = 60 held-out pages (30 uchen / 30 ume), disjoint from train/val/test.
|
| 67 |
|
| 68 |
+
| Eval | Split | Images | Accuracy | Macro-F1 | AUC |
|
| 69 |
+
|------|-------|-------:|---------:|---------:|----:|
|
| 70 |
+
| **`without_preprocess/`** (recommended) | Test | 867 | **80.7%** | **0.708** | 0.970 |
|
| 71 |
+
| **`without_preprocess/`** (recommended) | Benchmark | 60 | **85.0%** | **0.848** | 0.970 |
|
| 72 |
+
| `with_preprocess/` | Test | 867 | 56.1% | 0.506 | 0.969 |
|
| 73 |
+
| `with_preprocess/` | Benchmark | 60 | 68.3% | 0.648 | 0.953 |
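Accuracy and macro-F1 can diverge when classes are imbalanced, as they are here (the training data table lists roughly twice as many Ume pages as Uchen). The sketch below shows how the two metrics are computed for a two-class problem. The function name and the toy labels are illustrative assumptions, not the evaluation code or predictions behind the table.

```python
def macro_f1_and_accuracy(y_true, y_pred, labels=("uchen", "ume")):
    """Return (accuracy, macro_f1) for two equal-length label sequences."""
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # Per-class F1; defined as 0.0 when the class never appears.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    # Macro-F1 is the unweighted mean of the per-class F1 scores.
    return accuracy, sum(f1_scores) / len(f1_scores)


# Hypothetical toy labels (NOT model output): 2 uchen pages, 8 ume pages.
y_true = ["uchen"] * 2 + ["ume"] * 8
y_pred = ["ume", "uchen"] + ["ume"] * 8  # one uchen page misclassified
acc, f1 = macro_f1_and_accuracy(y_true, y_pred)
print(f"accuracy={acc:.3f} macro_f1={f1:.3f}")
```

Because macro-F1 weights both classes equally, a single error on the minority class lowers it more than it lowers accuracy.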
|
| 74 |
|
| 75 |
+
### Variant comparison
|
| 76 |
+
|
| 77 |
+
| Variant | Train/val preprocess | Test & benchmark preprocess | Test acc | Test macro-F1 | Benchmark acc | Benchmark macro-F1 |
|
| 78 |
+
|---------|---------------------|-----------------------------|:--------:|:-------------:|:-------------:|:------------------:|
|
| 79 |
+
| **`without_preprocess/`** | none | none (full page) | **80.7%** | **0.708** | **85.0%** | **0.848** |
|
| 80 |
+
| `with_preprocess/` | center crop | none (full page) | 56.1% | 0.506 | 68.3% | 0.648 |
|
| 81 |
+
|
| 82 |
+
The `without_preprocess` variant is trained and tested on full pages, so there is no mismatch between training and inference. The `with_preprocess` variant achieves ~99% validation F1 on center-cropped images (matching its training distribution), but drops to 56.1% accuracy (0.506 macro-F1) on the full-page test set because the model has never seen uncropped input. Do **not** report ~99% test scores from runs that center-crop the test images at eval time.
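To make the train/inference mismatch concrete, the following sketch computes how much of a page a centered square crop keeps. The crop size and the page dimensions are assumptions chosen for illustration; the README does not state the actual crop parameters used for `with_preprocess`.

```python
def center_crop_box(width, height, crop_size):
    """Return (left, top, right, bottom) of a centered square crop, clamped to the image."""
    side = min(crop_size, width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


def kept_fraction(width, height, crop_size):
    """Fraction of the page area that survives the center crop."""
    left, top, right, bottom = center_crop_box(width, height, crop_size)
    return (right - left) * (bottom - top) / (width * height)


# Hypothetical wide page of 2000 x 400 pixels with a 400-pixel crop.
box = center_crop_box(2000, 400, 400)
frac = kept_fraction(2000, 400, 400)
print(box, f"{frac:.0%} of the page kept")
```

A model trained only on such crops sees a small, text-dense window, while a full page adds margins and a very different aspect ratio, which is consistent with the accuracy drop described above.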
|
| 83 |
|
| 84 |
## Training data
|
| 85 |
|
|
|
|
| 87 |
|-------|------:|-----------:|-----:|------:|
|
| 88 |
| Uchen | ~3,124 | ~340 | ~290 | ~3,754 |
|
| 89 |
| Ume | ~5,986 | ~660 | ~561 | ~7,207 |
|
| 90 |
+
| **Total pages** | **9,110** | **1,000** | **851** | **10,961** |
|
| 91 |
|
| 92 |
**Uchen** includes: `uchen_sugthung`, `uchen_sugdring`, `uchen_sugring` (distinguished by descender length).
|
| 93 |
|
|
|
|
| 95 |
|
| 96 |
**Excluded:** `difficult`, `multi_scripts`, `non_tibetan`.
|
| 97 |
|
| 98 |
+
Benchmark pages (60) are excluded from train/val/test via the published split manifest.
|
| 99 |
+
|
| 100 |
Splits are partitioned at the **work level** — all pages from the same manuscript (`W` prefix in the filename) stay in one split only.
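A work-level partition can be sketched as follows. The filename pattern (`W` followed by alphanumerics at the start of the basename), the split fractions, and the function names are assumptions for illustration; the published split manifest remains the authoritative assignment, and this sketch does not stratify by class.

```python
import os
import random
import re
from collections import defaultdict

# Assumed pattern: the work ID is a leading "W" plus alphanumerics.
WORK_ID = re.compile(r"^(W[0-9A-Za-z]+)")


def work_id(filename):
    """Extract the work ID from a filename such as 'W00001_0002.jpg'."""
    match = WORK_ID.match(os.path.basename(filename))
    if match is None:
        raise ValueError(f"no work ID found in {filename!r}")
    return match.group(1)


def split_by_work(filenames, val_frac=0.1, test_frac=0.1, seed=0):
    """Assign whole works to train/validation/test so no work spans two splits."""
    by_work = defaultdict(list)
    for name in filenames:
        by_work[work_id(name)].append(name)
    works = sorted(by_work)
    random.Random(seed).shuffle(works)
    n_test = max(1, round(len(works) * test_frac))
    n_val = max(1, round(len(works) * val_frac))
    groups = {
        "test": works[:n_test],
        "validation": works[n_test:n_test + n_val],
        "train": works[n_test + n_val:],
    }
    return {split: [f for w in ws for f in by_work[w]] for split, ws in groups.items()}


# Hypothetical filenames: 10 works with 3 pages each.
files = [f"W{w:05d}_{p:04d}.jpg" for w in range(10) for p in range(3)]
splits = split_by_work(files)
works_per_split = {s: {work_id(f) for f in fs} for s, fs in splits.items()}
```

Splitting on works rather than pages keeps near-duplicate pages from the same manuscript out of both training and evaluation.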
|
| 101 |
|
| 102 |
## Architecture
|
|
|
|
| 161 |
print(f"{label} ({probs.max():.1%})")
|
| 162 |
```
|
| 163 |
|
| 164 |
+
### Benchmark inference (full pages)
|
| 165 |
+
|
| 166 |
+
```bash
|
| 167 |
+
pip install -r https://huggingface.co/datasets/openpecha/uchen-ume-classification-benchmark/raw/main/requirements-inference.txt
|
| 168 |
+
curl -LO https://huggingface.co/datasets/openpecha/uchen-ume-classification-benchmark/raw/main/inference_uchen_ume.py
python inference_uchen_ume.py \
|
| 169 |
+
--benchmark-json benchmark/benchmark_holdout.json \
|
| 170 |
+
--fetch-urls \
|
| 171 |
+
--weights without_preprocess/final_model.pt \
|
| 172 |
+
--preprocess none
|
| 173 |
+
```
|
| 174 |
+
|
| 175 |
### Load the dataset
|
| 176 |
|
| 177 |
```python
|
|
|
|
| 181 |
train = ds["train"] # 9,110 images
|
| 182 |
val = ds["validation"] # 1,000 images
|
| 183 |
test = ds["test"] # 851 images
|
| 184 |
+
bench = ds["benchmark"] # 60 images
|
| 185 |
+
```
|
| 186 |
+
|
| 187 |
+
## Repo layout
|
| 188 |
+
|
| 189 |
+
```
|
| 190 |
+
without_preprocess/ ← recommended (full-page test & benchmark)
|
| 191 |
+
final_model.pt
|
| 192 |
+
results.json
|
| 193 |
+
benchmark_eval_results.json
|
| 194 |
+
with_preprocess/ ← center-crop train/val only; test on full pages
|
| 195 |
+
final_model.pt
|
| 196 |
+
results.json
|
| 197 |
+
benchmark_eval_results.json
|
| 198 |
```
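Given the layout above, the files for one variant could be fetched with `huggingface_hub`. This is a minimal sketch: the helper names are hypothetical, the repository ID is taken from this model's Hub page, and the download function needs network access and the `huggingface_hub` package.

```python
REPO_ID = "openpecha/uchen-ume-classifier"
VARIANTS = ("without_preprocess", "with_preprocess")
FILES = ("final_model.pt", "results.json", "benchmark_eval_results.json")


def repo_paths(variant="without_preprocess"):
    """Return the repo-relative paths for one variant folder."""
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant: {variant!r}")
    return [f"{variant}/{name}" for name in FILES]


def download_variant(variant="without_preprocess"):
    """Download one variant's files and return their local cache paths."""
    # Imported lazily so the path helper works without the package installed.
    from huggingface_hub import hf_hub_download

    return [hf_hub_download(repo_id=REPO_ID, filename=path) for path in repo_paths(variant)]


paths = repo_paths()
print(paths)
```

Calling `download_variant()` with no arguments fetches the recommended `without_preprocess` files.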
|
| 199 |
|
| 200 |
## Intended use
|
|
|
|
| 231 |
|
| 232 |
## Acknowledgements
|
| 233 |
|
| 234 |
+
Developed by **Dharmaduta** for the **[Buddhist Digital Resource Center](https://www.bdrc.io)** (BDRC) Etext Corpus project, with funding from the **Khyentse Foundation**. Annotation guidelines by **Pentsok Rtsang**.
|