MikkoLipsanen committed on
Commit
1c4f9bb
·
verified ·
1 Parent(s): 557cd90

Update README.md

Files changed (1)
  1. README.md +13 -11
README.md CHANGED
@@ -6,6 +6,8 @@ language:
  metrics:
  - cer
  pipeline_tag: image-to-text
+ base_model:
+ - microsoft/trocr-large-handwritten
  ---
  # Model description
 
@@ -21,7 +23,7 @@ pipeline_tag: image-to-text
 
  **License:** Apache 2.0
 
- This model is a fine-tuned version of the microsoft/trocr-large-handwritten model, specialized for recognizing handwritten text. It has been trained on various dataset from 17th to 20th centuries and can be used for applications such as document digitization, form recognition, or any task involving handwritten text extraction.
+ This model is a fine-tuned version of the microsoft/trocr-large-handwritten model, specialized for recognizing handwritten text. It has been trained on various dataset from 16th to 20th centuries and can be used for applications such as document digitization, form recognition, or any task involving handwritten text extraction.
 
  # Model Architecture
 
@@ -39,15 +41,15 @@ This model is designed for handwritten text recognition and is intended for use
 
  # Training data
 
- The training datasetincludes more than 760 000 samples of handwritten text rows, covering a wide variety of handwriting styles and text samples.
+ The training dataset includes more than 913 000 samples of handwritten and typewritten text rows, covering a wide variety of handwriting styles and text samples.
 
  # Evaluation
 
  The model was evaluated on test dataset. Below are key metrics:
 
- **Character Error Rate (CER):** 3.2
+ **Character Error Rate (CER):** 2.8
 
- **Test Dataset Description:** size ~94 900 text rows
+ **Test Dataset Description:** size ~111 800 text rows
 
  # Used Hyperparameters
 
@@ -55,11 +57,9 @@ The model was evaluated on test dataset. Below are key metrics:
 
  **Train batch size per device:** 16
 
- **Learning rate:** 1e-5
+ **Learning rate:** 12.2e-5
 
- **Scheduler:** linear
-
- **Warmup steps:** 500
+ **Scheduler:** polynomial
 
  **Optimizer:** AdamW
 
@@ -69,6 +69,8 @@ The model was evaluated on test dataset. Below are key metrics:
 
  **Half precision backend:** cuda_amp
 
+ **Input image size:** 192 x 1024
+
 
  # How to Use the Model
 
@@ -110,13 +112,13 @@ Potential improvements for this model include:
 
  If you use this model in your work, please cite it as:
 
- @misc{multicentury_htr_model_2024,
+ @misc{multicentury_htr_model_202509,
 
  author = {Kansallisarkisto},
 
  title = {Multicentury HTR Model: Handwritten Text Recognition},
 
- year = {2024},
+ year = {2025},
 
  publisher = {Hugging Face},
 
@@ -127,4 +129,4 @@ If you use this model in your work, please cite it as:
  ## Model Card Authors
 
  Author: Kansallisarkisto
- Contact Information: riikka.marttila@kansallisarkisto.fi, ilkka.jokipii@kansallisarkisto.fi
+ Contact Information: mikko.lipsanen@kansallisarkisto.fi, ilkka.jokipii@kansallisarkisto.fi
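
Note on the CER figure changed in this commit: the README reports a Character Error Rate of 2.8 (previously 3.2) but does not say how it was computed or in which unit. As a rough illustration only, CER is commonly defined as the character-level edit distance between the recognized text and the ground truth, divided by the number of ground-truth characters. The sketch below assumes that definition and a percentage scale. The function names `levenshtein` and `cer` are hypothetical and are not taken from the model repository.

```python
from typing import Sequence


def levenshtein(ref: str, hyp: str) -> int:
    """Character-level edit distance (insertions, deletions, substitutions)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r from ref
                            curr[j - 1] + 1,      # insert h into ref
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cer(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Corpus-level CER in percent (assumed scale, not confirmed by the README)."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(levenshtein(r, h) for r, h in zip(references, hypotheses))
    return 100.0 * total_edits / total_chars
```

For example, `cer(["abcd"], ["abxd"])` returns `25.0`: one substitution over four reference characters. Summing edits and characters over the whole test set, rather than averaging per-row rates, is one common choice; the README does not state which variant was used.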