Update README.md
README.md
CHANGED
@@ -3,4 +3,21 @@ license: gpl-3.0
 base_model:
 - naver-clova-ix/donut-base
 pipeline_tag: visual-document-retrieval
----
+---
+
+# HeR-T: Herbarium specimen label Recognition Transformer
+
+## 📃 Paper
+Application of computer vision to the automated extraction of metadata from natural history specimen labels: A case study on herbarium specimens (Under Review)
+
+## 💁 Authors
+Zacchigna, Jacopo; Liu, Weiwei; Pellegrino, Felice Andrea; Peron, Adriano; Roma-Marzio, Francesco; Peruzzi, Lorenzo; Martellos, Stefano
+
+## 🚀 Overview
+HeR-T (Herbarium specimen label Recognition Transformer) is a fine-tuned vision-language model for automated metadata extraction from natural history specimen labels, in particular herbarium specimen labels. It builds on Donut-base (naver-clova-ix/donut-base) and was fine-tuned on 55,089 herbarium specimen images from the Herbarium of the University of Pisa (international acronym PI).
+
+## 🔥 Features
+- **Fine-tuned on** specimen images from the Herbarium of the University of Pisa for automated metadata extraction from natural history specimen labels
+- **Supports** image inputs whose labels contain printed, handwritten, or mixed-format text
+- **Evaluation**: Tree Edit Distance (TED) accuracy, computed as max(0, 1 − TED(pr, gt)/TED(φ, gt)), where gt, pr, and φ denote the ground-truth, predicted, and empty trees, respectively
+- **Pre-trained weights** are loaded from Donut-base (naver-clova-ix/donut-base)
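The TED accuracy formula in the Evaluation bullet can be illustrated with a small, self-contained sketch. It is not the authors' evaluation code: it assumes an ordered tree edit distance with unit costs for insert, delete, and relabel, and the README does not state the cost model actually used. The field names (`genus`, `species`) and the helper names (`forest_distance`, `ted`, `ted_accuracy`, `leaf`) are hypothetical and chosen only for illustration.

```python
from functools import lru_cache

# A tree is a (label, children) pair; children is a tuple of trees.
# A forest is a tuple of trees; the empty tree φ is modeled as the empty forest ().


def forest_size(forest):
    """Count the nodes in a forest."""
    return sum(1 + forest_size(children) for _, children in forest)


@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Unit-cost edit distance between two ordered forests (assumed cost model)."""
    if not f:
        return forest_size(g)  # insert every node of g
    if not g:
        return forest_size(f)  # delete every node of f
    (f_label, f_children), (g_label, g_children) = f[-1], g[-1]
    return min(
        # delete the root of the rightmost tree in f, promoting its children
        1 + forest_distance(f[:-1] + f_children, g),
        # insert the root of the rightmost tree in g
        1 + forest_distance(f, g[:-1] + g_children),
        # match the two rightmost roots, paying 1 if their labels differ
        forest_distance(f[:-1], g[:-1])
        + forest_distance(f_children, g_children)
        + (f_label != g_label),
    )


def ted(a, b):
    """TED between two trees; None stands for the empty tree φ."""
    return forest_distance(() if a is None else (a,), () if b is None else (b,))


def ted_accuracy(pr, gt):
    """Compute max(0, 1 - TED(pr, gt) / TED(φ, gt))."""
    baseline = ted(None, gt)  # cost of building gt from nothing
    if baseline == 0:
        raise ValueError("ground-truth tree must not be empty")
    return max(0.0, 1.0 - ted(pr, gt) / baseline)


def leaf(label):
    """Build a tree node without children."""
    return (label, ())


# Hypothetical label metadata: the prediction gets the species value wrong.
gt = ("label", (("genus", (leaf("Quercus"),)), ("species", (leaf("ilex"),))))
pr = ("label", (("genus", (leaf("Quercus"),)), ("species", (leaf("robur"),))))

print(ted_accuracy(pr, gt))
```

In this example the ground-truth tree has five nodes and the prediction differs by one relabeled leaf, so the score is 1 − 1/5. A prediction that needs more edits than building the ground truth from scratch is clamped to 0 by the outer max.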