---
license: apache-2.0
tags:
- object-detection
- document-layout-analysis
- historical-documents
- layoutparser
- detectron2
- mask-rcnn
language:
- sv
pipeline_tag: object-detection
base_model:
- layoutparser/detectron2
---

# Historical Document Layout Detection Model

A fine-tuned Mask R-CNN model (via LayoutParser/Detectron2) for detecting layout elements in historical Swedish medical journal pages. This is the lighter variant; the larger, more complex model is available at [SweMPer-layout](https://huggingface.co/cdhu-uu/SweMPer-layout).

This model was developed as part of the research project:
**Communicating Medicine (SweMPer): Digitalisation of Swedish Medical Periodicals, 1781–2011**
(Project ID: **IN22-0017**), funded by **Riksbankens Jubileumsfond**.

## Project page

https://www.uu.se/en/department/history-of-science-and-ideas/research/research-projects-and-programmes/communicating-medicine-swemper

## Model Details

- **Model type:** Mask R-CNN (ResNet backbone)
- **Framework:** Detectron2 / LayoutParser
- **Fine-tuned for:** Historical document layout analysis
- **Language of source documents:** Swedish

## Label Map

| ID | Label            |
|----|------------------|
| 0  | Advertisement    |
| 1  | Author           |
| 2  | Header or Footer |
| 3  | Image            |
| 4  | List             |
| 5  | Page Number      |
| 6  | Table            |
| 7  | Text             |
| 8  | Title            |

## Evaluation Metrics

Average precision (AP) scores for this model, in COCO-style notation (AP50 and AP75 at fixed IoU thresholds; APs, APm, and APl for small, medium, and large objects):

|   AP   |  AP50  |  AP75  |  APs   |  APm   |  APl   |
|:------:|:------:|:------:|:------:|:------:|:------:|
| 64.325 | 88.948 | 69.214 | 40.350 | 55.117 | 67.543 |

## Usage

### Installation

Follow the Detectron2 installation instructions at:
https://detectron2.readthedocs.io/en/latest/tutorials/install.html

The inference example in this card also imports LayoutParser, OpenCV, and Matplotlib.

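
After installing, a quick check that the required packages are importable can save a confusing failure later. The sketch below is not part of the official installation instructions: the helper name `missing_packages` is made up for illustration, and the module list is an assumption based on the imports used in this card plus `detectron2` itself.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Module names assumed from the imports used in this model card.
REQUIRED = ["detectron2", "layoutparser", "cv2", "matplotlib"]

if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

The check only confirms that the modules can be located; it does not verify that Detectron2 was built against a compatible PyTorch or CUDA version.
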

### Finetuning

Follow the Detectron2 training instructions at:
https://detectron2.readthedocs.io/en/latest/tutorials/training.html

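
When fine-tuning further on the nine classes of the label map, a handful of Detectron2 config keys typically need overriding. The sketch below collects such overrides in a dict and flattens them into the alternating key/value list that `cfg.merge_from_list` accepts (the same format used for `extra_config` in LayoutParser). The helper name `to_opts` is hypothetical, and the solver values are placeholders for illustration, not the settings used to train this model.

```python
def to_opts(overrides):
    """Flatten {"KEY": value} into ["KEY", value, ...] for cfg.merge_from_list."""
    opts = []
    for key, value in overrides.items():
        opts.extend([key, value])
    return opts


# Placeholder values for illustration only; not the settings used for this model.
overrides = {
    "MODEL.WEIGHTS": "SweMPer-layout-lite.pth",  # start from this checkpoint
    "MODEL.ROI_HEADS.NUM_CLASSES": 9,            # IDs 0-8 in the label map
    "SOLVER.BASE_LR": 0.00025,
    "SOLVER.MAX_ITER": 1000,
    "SOLVER.IMS_PER_BATCH": 2,
}
opts = to_opts(overrides)

# With Detectron2 installed, the list could be applied roughly like this
# (assuming the config file loads cleanly on top of the defaults):
#   from detectron2.config import get_cfg
#   cfg = get_cfg()
#   cfg.merge_from_file("config_mask_rcnn_resized.yaml")
#   cfg.merge_from_list(opts)
```

Dataset registration and the training loop itself are covered by the Detectron2 training tutorial and are left out here.
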

### Inference

```python
import cv2
import layoutparser as lp
import matplotlib.pyplot as plt

# Configuration: paths to the model config and weights
model_config_path = "config_mask_rcnn_resized.yaml"
model_path = "SweMPer-layout-lite.pth"

# Class IDs predicted by the model and their label names
label_map = {
    0: "advertisement",
    1: "author",
    2: "header_or_footer",
    3: "image",
    4: "list",
    5: "page_no",
    6: "table",
    7: "text",
    8: "title",
}

# Load model; detections scoring below 0.8 are discarded
model = lp.models.Detectron2LayoutModel(
    config_path=model_config_path,
    model_path=model_path,
    extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
    label_map=label_map,
)

# Load image (OpenCV reads BGR) and convert to RGB
image_path = "<path_to_image>"
image = cv2.imread(image_path)
if image is None:
    raise FileNotFoundError(f"Could not read image: {image_path}")
image = image[..., ::-1]  # BGR to RGB

# Detect layout
layout = model.detect(image)

# Print detected elements
for block in layout:
    print(f"Type: {block.type}, Score: {block.score:.3f}, Box: {block.coordinates}")

# Visualize results
viz = lp.draw_box(image, layout, box_width=3, show_element_type=True)
plt.figure(figsize=(12, 16))
plt.imshow(viz)
plt.axis("off")
plt.show()
```

## Acknowledgements

This work was carried out within the project:
**Communicating Medicine (SweMPer): Digitalisation of Swedish Medical Periodicals, 1781–2011**
(Project ID: **IN22-0017**), funded by **Riksbankens Jubileumsfond**.

We gratefully acknowledge the support of the funder and project collaborators.

This model builds upon the excellent work of:

- [Detectron2](https://github.com/facebookresearch/detectron2/tree/main)
- [LayoutParser](https://github.com/Layout-Parser/layout-parser?tab=readme-ov-file)

We thank the contributors and maintainers of these projects for making their tools publicly available and supporting research.