---
library_name: ultralytics
task: object-detection
tags:
- yolo
- yolo26
- tibetan
- document-layout-analysis
- object-detection
- bounding-box
- BDRC
language:
- bo
license: cc0-1.0
datasets:
- BDRC/TDLA-Training-Dataset
metrics:
- mAP50
- mAP50-95
- precision
- recall
model-index:
- name: TDLA-YOLO26m
  results:
  - task:
      type: object-detection
      name: Object Detection
    dataset:
      type: BDRC/TDLA-Training-Dataset
      name: TDLA Training Dataset
      split: val
    metrics:
    - type: mAP50
      value: 0.982
      name: mAP@0.5
    - type: mAP50-95
      value: 0.799
      name: mAP@0.5:0.95
    - type: precision
      value: 0.966
      name: Precision
    - type: recall
      value: 0.970
      name: Recall
---
# TMBLD-YOLO26m: Tibetan Modern Book Layout Detection
A fine-tuned **YOLO26m** object-detection model for **Tibetan modern book layout detection**. The model detects four layout classes in page images of modern Tibetan books: **header**, **Text area**, **footnote**, and **footer**.
## Model Description
This model was fine-tuned from the Ultralytics YOLO26m pretrained checkpoint on the [BDRC/TDLA-Training-Dataset](https://huggingface.co/datasets/BDRC/TDLA-Training-Dataset), a YOLO-format bounding-box dataset of Tibetan document pages sourced from the Buddhist Digital Resource Center (BDRC) digital library.
| Property | Value |
| --- | --- |
| **Architecture** | YOLO26m |
| **Task** | Object Detection |
| **Image size** | 640 × 640 |
| **Number of classes** | 4 |
| **Training platform** | Ultralytics HUB |
| **Weights file** | `Tibetan_modern_book_Layout_detection.pt` |
## Classes
| ID | Class | Description |
| --- | --- | --- |
| 0 | header | Page header region |
| 1 | Text area | Main body text region |
| 2 | footnote | Footnote region |
| 3 | footer | Page footer region |
## Performance
Evaluated on the validation split of the TDLA Training Dataset.
| Metric | Value |
| --- | --- |
| **Precision** | 0.966 |
| **Recall** | 0.970 |
| **mAP@0.5** | 0.982 |
| **mAP@0.5:0.95** | 0.799 |
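Precision and recall can be combined into a single F1 score for a quick summary. The sketch below applies the standard harmonic-mean formula to the aggregate values in the table. Because those values are averaged over classes, the result only approximates a per-class F1, and it is not a figure reported with the model.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Aggregate validation metrics copied from the performance table.
f1 = f1_score(0.966, 0.970)
print(f"Approximate F1: {f1:.3f}")
```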
### Training Loss (final epoch)
| Loss Component | Train | Val |
| --- | --- | --- |
| Box loss | 0.515 | 0.643 |
| Classification loss | 0.218 | 0.276 |
| DFL loss | 0.003 | 0.004 |
## Training Details
### Dataset
- **Dataset:** [BDRC/TDLA-Training-Dataset](https://huggingface.co/datasets/BDRC/TDLA-Training-Dataset)
- **Train images:** 2,692
- **Val images:** 103
- **Test images:** 313
- **Total annotations:** 14,705
- **Train/Val split:** Iterative multi-label stratification (seed 42, 80/20 ratio)
### Hyperparameters
| Parameter | Value |
| --- | --- |
| Epochs | 150 |
| Patience | 100 |
| Batch size | Auto (-1) |
| Image size | 640 |
| Optimizer | Auto (SGD) |
| Initial learning rate (lr0) | 0.01 |
| Final learning rate factor (lrf) | 0.01 |
| Momentum | 0.937 |
| Weight decay | 0.0005 |
| Warmup epochs | 3.0 |
| Warmup momentum | 0.8 |
| Warmup bias lr | 0.1 |
| AMP (mixed precision) | True |
| Pretrained | True |
| Deterministic | True |
| Seed | 0 |
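To make the `lr0` and `lrf` rows concrete, here is a minimal sketch of how the learning rate would evolve under a linear decay schedule. The linear schedule is an assumption (the table does not say whether cosine scheduling was enabled), and the warmup phase is ignored. Under that assumption the rate ends at `lr0 * lrf`, that is 0.01 × 0.01 = 0.0001.

```python
def learning_rate(epoch: int, epochs: int = 150,
                  lr0: float = 0.01, lrf: float = 0.01) -> float:
    """Learning rate at `epoch`, decaying linearly from lr0 to lr0 * lrf.

    Assumes a linear schedule and ignores the warmup epochs.
    """
    factor = max(1 - epoch / epochs, 0) * (1.0 - lrf) + lrf
    return lr0 * factor


# Start, midpoint, and end of the 150-epoch run.
for epoch in (0, 75, 150):
    print(epoch, f"{learning_rate(epoch):.6f}")
```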
### Loss Weights
| Component | Weight |
| --- | --- |
| Box | 7.5 |
| Classification | 0.5 |
| DFL | 1.5 |
### Augmentation
| Augmentation | Value |
| --- | --- |
| HSV-Hue | 0.015 |
| HSV-Saturation | 0.7 |
| HSV-Value | 0.4 |
| Translation | 0.1 |
| Scale | 0.5 |
| Flip left-right | 0.5 |
| Mosaic | 1.0 |
| Erasing | 0.4 |
| Close mosaic (last N epochs) | 10 |
| Auto augment | RandAugment |
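The model was trained on Ultralytics HUB, so no training script is part of this card. As a rough sketch of how the settings in the tables above might be passed to a local Ultralytics run, the values can be collected into keyword arguments for `model.train`. The base checkpoint name `yolo26m.pt` and the `data_yaml` path are assumptions for illustration, and a local run is not guaranteed to reproduce the reported metrics.

```python
# Settings copied from the hyperparameter, loss-weight, and augmentation tables.
TRAIN_ARGS = {
    # Hyperparameters
    "epochs": 150, "patience": 100, "batch": -1, "imgsz": 640,
    "optimizer": "auto", "lr0": 0.01, "lrf": 0.01, "momentum": 0.937,
    "weight_decay": 0.0005, "warmup_epochs": 3.0, "warmup_momentum": 0.8,
    "warmup_bias_lr": 0.1, "amp": True, "pretrained": True,
    "deterministic": True, "seed": 0,
    # Loss weights
    "box": 7.5, "cls": 0.5, "dfl": 1.5,
    # Augmentation
    "hsv_h": 0.015, "hsv_s": 0.7, "hsv_v": 0.4, "translate": 0.1,
    "scale": 0.5, "fliplr": 0.5, "mosaic": 1.0, "erasing": 0.4,
    "close_mosaic": 10, "auto_augment": "randaugment",
}


def train(data_yaml: str, base_weights: str = "yolo26m.pt"):
    """Fine-tune from a pretrained checkpoint (checkpoint name is assumed)."""
    from ultralytics import YOLO  # imported lazily; requires `pip install ultralytics`

    model = YOLO(base_weights)
    return model.train(data=data_yaml, **TRAIN_ARGS)


# Hypothetical usage, with a dataset YAML describing the four classes:
# train("tdla.yaml")
```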
## Usage
### Inference with Ultralytics
```python
from ultralytics import YOLO

# Load the fine-tuned weights (path to the downloaded .pt file)
model = YOLO("Tibetan_modern_book_Layout_detection.pt")

# Run detection on a single page image at the training resolution
results = model.predict("page_image.jpg", imgsz=640)

for result in results:
    for box in result.boxes:
        cls_id = int(box.cls)
        conf = float(box.conf)
        xyxy = box.xyxy[0].tolist()  # [x1, y1, x2, y2] in pixels
        print(f"Class: {cls_id}, Confidence: {conf:.3f}, Box: {xyxy}")
```
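The snippet above expects the weights file to be available locally. One way to fetch it is with `huggingface_hub`, as sketched below. The repository id is an assumption based on this model's Hub page, and the filename is the weights file listed in the Model Description table; adjust both if the repository layout differs.

```python
REPO_ID = "BDRC/Tibetan_Modern_Book_Layout_Detection_Model"  # assumed repo id
WEIGHTS_FILE = "Tibetan_modern_book_Layout_detection.pt"


def load_model():
    """Download the weights from the Hub (cached after the first call) and load them."""
    from huggingface_hub import hf_hub_download  # requires `pip install huggingface_hub`
    from ultralytics import YOLO

    weights_path = hf_hub_download(repo_id=REPO_ID, filename=WEIGHTS_FILE)
    return YOLO(weights_path)


# Usage (needs network access on first run):
# model = load_model()
# results = model.predict("page_image.jpg", imgsz=640)
```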
### Batch Inference
```python
from ultralytics import YOLO
model = YOLO("Tibetan_modern_book_Layout_detection.pt")
results = model.predict("path/to/images/", imgsz=640, conf=0.25)
```
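For large folders, passing `stream=True` makes `predict` return a generator, so results are processed one image at a time instead of being held in memory together. The following sketch counts detections per class across a folder; the helper names `count_by_class` and `summarize_folder` are illustrative, not part of any library.

```python
from collections import Counter

# Class IDs and names from the Classes table.
CLASS_NAMES = {0: "header", 1: "Text area", 2: "footnote", 3: "footer"}


def count_by_class(class_ids):
    """Map a list of integer class IDs to {class name: count}."""
    return dict(Counter(CLASS_NAMES[i] for i in class_ids))


def summarize_folder(weights: str, folder: str) -> dict:
    """Run streaming inference over a folder and total detections per class."""
    from ultralytics import YOLO  # imported lazily; requires ultralytics

    model = YOLO(weights)
    totals = Counter()
    for result in model.predict(folder, imgsz=640, conf=0.25, stream=True):
        ids = [int(c) for c in result.boxes.cls.tolist()]
        totals.update(count_by_class(ids))
    return dict(totals)


# Usage:
# summarize_folder("Tibetan_modern_book_Layout_detection.pt", "path/to/images/")
```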
## Intended Use
This model is designed for automatic layout detection of modern Tibetan book pages. It can be used as a preprocessing step for:
- OCR pipelines on Tibetan documents
- Document digitization workflows
- Structured text extraction from scanned Tibetan texts
- Digital library cataloging and indexing
## Limitations
- Trained primarily on modern Tibetan book layouts; performance on historical manuscripts, woodblock prints, or non-standard layouts may vary.
- Optimized for 640×640 input resolution; very high-resolution pages may benefit from tiling or higher `imgsz` values.
- The footnote class has fewer training samples (456 annotations) compared to other classes, which may affect detection quality for that class.
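For the tiling option mentioned above, a minimal sketch of splitting a large page into overlapping 640-pixel windows follows. The 20% overlap and the helper names are assumptions for illustration. Each window would be cropped and run through the model separately; shifting the resulting boxes back to page coordinates and merging duplicates across tiles is not shown.

```python
def tile_origins(length: int, tile: int, stride: int) -> list:
    """Start offsets along one axis so that tiles cover [0, length)."""
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile sits flush with the edge
    return starts


def tile_windows(width: int, height: int, tile: int = 640, overlap: float = 0.2) -> list:
    """Return (x1, y1, x2, y2) crop windows covering a width x height page."""
    stride = max(1, int(tile * (1 - overlap)))
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in tile_origins(height, tile, stride)
        for x in tile_origins(width, tile, stride)
    ]


# Example: a 2000 x 3000 pixel scan.
windows = tile_windows(2000, 3000)
print(len(windows), windows[0], windows[-1])
```

Larger overlaps reduce the chance of a region being cut at a tile border, at the cost of more inference calls per page.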
## License
This model is released under **CC0 1.0 Universal (Public Domain Dedication)**. You are free to copy, modify, and distribute the model, including for commercial purposes, without asking permission.
## Acknowledgements
This dataset was developed by Dharmaduta from specifications provided by the Buddhist Digital Resource Center (BDRC) for the BDRC Etext Corpus, with funding from the Khyentse Foundation.
## Citation
If you use this model, please cite it as follows:
```bibtex
@software{bdrc_tmbld_yolo26m_2026,
title = {tmbld-YOLO26m: Tibetan Modern book layout detection Model},
author = {Buddhist Digital Resource Center (BDRC)},
year = {2026},
url = {https://huggingface.co/BDRC/TDLA-YOLO26m},
license = {CC0-1.0}
}
```