---
license: apache-2.0
tags:
- object-detection
- document-layout-analysis
- historical-documents
- layoutparser
- detectron2
- mask-rcnn
language:
- sv
pipeline_tag: object-detection
base_model:
- layoutparser/detectron2
---
# Historical Document Layout Detection Model
A fine-tuned Mask R-CNN model (via LayoutParser/Detectron2) for detecting layout elements in historical Swedish medical journal pages. This is the lighter variant; the more complex model is available as [SweMPer-layout](https://huggingface.co/cdhu-uu/SweMPer-layout).
This model was developed as part of the research project:
**Communicating Medicine (SweMPer): Digitalisation of Swedish Medical Periodicals, 1781–2011**
(Project ID: **IN22-0017**), funded by **Riksbankens Jubileumsfond**.
## Project page
https://www.uu.se/en/department/history-of-science-and-ideas/research/research-projects-and-programmes/communicating-medicine-swemper
## Model Details
- **Model type:** Mask R-CNN (ResNet backbone)
- **Framework:** Detectron2 / LayoutParser
- **Fine-tuned for:** Historical document layout analysis
- **Language of source documents:** Swedish
## Label Map
| ID | Label |
|----|------------------|
| 0 | Advertisement |
| 1 | Author |
| 2 | Header or Footer |
| 3 | Image |
| 4 | List |
| 5 | Page Number |
| 6 | Table |
| 7 | Text |
| 8 | Title |
## Evaluation Metrics
COCO-style average precision (AP) scores for this model, overall, at IoU thresholds of 0.50 and 0.75, and for small, medium, and large regions:
| AP | AP50 | AP75 | APs | APm | APl |
|:------:|:------:|:------:|:------:|:------:|:------:|
| 64.325 | 88.948 | 69.214 | 40.350 | 55.117 | 67.543 |
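
In COCO-style evaluation, AP50 and AP75 count a detection as correct when its intersection-over-union (IoU) with a ground-truth region reaches 0.50 or 0.75, and AP averages over thresholds from 0.50 to 0.95. The sketch below illustrates the IoU computation for axis-aligned boxes only. The function name `box_iou` and the example coordinates are hypothetical, and this card does not state whether the reported numbers are box or mask AP.

```python
def box_iou(a, b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (if any)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical predicted and annotated boxes for one layout region
predicted = (100, 100, 300, 400)
ground_truth = (120, 110, 310, 400)

iou = box_iou(predicted, ground_truth)
print(f"IoU = {iou:.3f}")
print("matches at IoU 0.50:", iou >= 0.50)
print("matches at IoU 0.75:", iou >= 0.75)
```

A higher AP75 relative to AP50 generally indicates tighter localization, since fewer loosely placed boxes pass the stricter threshold.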
## Usage
### Installation
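Follow the Detectron2 installation instructions at:
https://detectron2.readthedocs.io/en/latest/tutorials/install.html

The inference example also imports `layoutparser`, `cv2` (OpenCV), and `matplotlib`, so these need to be installed as well.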
### Finetuning
Follow the Detectron2 training tutorial at:
https://detectron2.readthedocs.io/en/latest/tutorials/training.html
### Inference
```python
import cv2
import layoutparser as lp
import matplotlib.pyplot as plt

# Configuration: paths to the model config and weights files
model_config_path = "config_mask_rcnn_resized.yaml"
model_path = "SweMPer-layout-lite.pth"
label_map = {
    0: "advertisement",
    1: "author",
    2: "header_or_footer",
    3: "image",
    4: "list",
    5: "page_no",
    6: "table",
    7: "text",
    8: "title",
}

# Load model; detections scoring below 0.8 are discarded
model = lp.models.Detectron2LayoutModel(
    config_path=model_config_path,
    model_path=model_path,
    extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
    label_map=label_map,
)

# Load and process image
image = cv2.imread("<path_to_image>")
if image is None:  # cv2.imread returns None instead of raising on failure
    raise FileNotFoundError("Could not read the image at the given path")
image = image[..., ::-1]  # BGR to RGB

# Detect layout
layout = model.detect(image)

# Print detected elements
for block in layout:
    print(f"Type: {block.type}, Score: {block.score:.3f}, Box: {block.coordinates}")

# Visualize results
viz = lp.draw_box(image, layout, box_width=3, show_element_type=True)
plt.figure(figsize=(12, 16))
plt.imshow(viz)
plt.axis("off")
plt.show()
```
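
Once detections are available, a common next step is to group them by element type before further processing. The following is a minimal sketch in plain Python, assuming each detection has been reduced to a `(label, score, box)` tuple with `box` as `(x1, y1, x2, y2)`. The helper `group_by_type`, its `min_score` parameter, and the sample values are hypothetical and not part of LayoutParser.

```python
from collections import defaultdict


def group_by_type(detections, min_score=0.8):
    """Group (label, score, box) tuples by label; each group is sorted top-to-bottom."""
    groups = defaultdict(list)
    for label, score, box in detections:
        if score >= min_score:
            groups[label].append((score, box))
    for items in groups.values():
        # Sort by the top edge (y1), then the left edge (x1)
        items.sort(key=lambda item: (item[1][1], item[1][0]))
    return dict(groups)


# Hypothetical detections. With the inference example above, an equivalent list
# could be built as: [(b.type, b.score, b.coordinates) for b in layout]
detections = [
    ("text", 0.97, (80, 400, 900, 1200)),
    ("title", 0.93, (80, 150, 900, 260)),
    ("text", 0.91, (80, 280, 900, 390)),
    ("page_no", 0.55, (460, 1300, 520, 1340)),
]

grouped = group_by_type(detections)
for label, items in grouped.items():
    for score, box in items:
        print(f"{label}: score={score:.2f}, box={box}")
```

The top-to-bottom ordering is a simplification: on multi-column pages it does not reflect reading order, so a column-aware sort would be needed there.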
## Acknowledgements
This work was carried out within the project:
**Communicating Medicine (SweMPer): Digitalisation of Swedish Medical Periodicals, 1781–2011**
(Project ID: **IN22-0017**), funded by **Riksbankens Jubileumsfond**.
We gratefully acknowledge the support of the funder and project collaborators.
This model builds upon the excellent work of:
- [Detectron2](https://github.com/facebookresearch/detectron2/tree/main)
- [LayoutParser](https://github.com/Layout-Parser/layout-parser?tab=readme-ov-file)
We thank the contributors and maintainers of these projects for making their tools publicly available and supporting research.