cdhu-uu/SweMPer-layout-lite

@@ -11,15 +11,21 @@ language:
   - sv
 pipeline_tag: object-detection
 ---
 # Historical Document Layout Detection Model
 A fine-tuned Mask R-CNN model (via LayoutParser/Detectron2) for detecting layout
 elements in historical Swedish medical journal pages.
 ## Model Details
 - **Model type:** Mask R-CNN (ResNet backbone)
 - **Framework:** Detectron2 / LayoutParser
 - **Fine-tuned for:** Historical document layout analysis
 - **Language of source documents:** Swedish
 ## Label Map
 | ID | Label            |
 |----|------------------|
 | 0  | Advertisement    |
@@ -31,21 +37,30 @@ elements in historical Swedish medical journal pages.
 | 6  | Table            |
 | 7  | Text             |
 | 8  | Title            |
 ## Usage
 ### Installation
 Follow instructions at:
 https://detectron2.readthedocs.io/en/latest/tutorials/install.html
 ### Finetuning
-Follow instructions at:
 https://detectron2.readthedocs.io/en/latest/tutorials/training.html
 ### Inference
 ```python
 import cv2
 import layoutparser as lp
 import matplotlib.pyplot as plt
 # Configuration
 model_config_path = "config_mask_rcnn_resized.yaml"
 model_path = "model_final_LP.pth"
 label_map = {
     0: "advertisement",
     1: "author",
@@ -57,6 +72,7 @@ label_map = {
     7: "text",
     8: "title",
 }
 # Load model
 model = lp.models.Detectron2LayoutModel(
     config_path=model_config_path,
@@ -64,18 +80,31 @@ model = lp.models.Detectron2LayoutModel(
     extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
     label_map=label_map,
 )
 # Load and process image
 image = cv2.imread("<path_to_image>")
 image = image[..., ::-1]  # BGR to RGB
 # Detect layout
 layout = model.detect(image)
 # Print detected elements
 for block in layout:
     print(f"Type: {block.type}, Score: {block.score:.3f}, Box: {block.coordinates}")
 # Visualize results
 viz = lp.draw_box(image, layout, box_width=3, show_element_type=True)
 plt.figure(figsize=(12, 16))
 plt.imshow(viz)
 plt.axis("off")
 plt.show()
-```

   - sv
 pipeline_tag: object-detection
 ---
 # Historical Document Layout Detection Model
 A fine-tuned Mask R-CNN model (via LayoutParser/Detectron2) for detecting layout
 elements in historical Swedish medical journal pages.
 ## Model Details
 - **Model type:** Mask R-CNN (ResNet backbone)
 - **Framework:** Detectron2 / LayoutParser
 - **Fine-tuned for:** Historical document layout analysis
 - **Language of source documents:** Swedish
 ## Label Map
 | ID | Label            |
 |----|------------------|
 | 0  | Advertisement    |
 | 6  | Table            |
 | 7  | Text             |
 | 8  | Title            |
 ## Usage
 ### Installation
 Follow instructions at:
 https://detectron2.readthedocs.io/en/latest/tutorials/install.html
 ### Finetuning
+Follow instructions at:
 https://detectron2.readthedocs.io/en/latest/tutorials/training.html
 ### Inference
 ```python
 import cv2
 import layoutparser as lp
 import matplotlib.pyplot as plt
 # Configuration
 model_config_path = "config_mask_rcnn_resized.yaml"
 model_path = "model_final_LP.pth"
 label_map = {
     0: "advertisement",
     1: "author",
     7: "text",
     8: "title",
 }
 # Load model
 model = lp.models.Detectron2LayoutModel(
     config_path=model_config_path,
     extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
     label_map=label_map,
 )
 # Load and process image
 image = cv2.imread("<path_to_image>")
 image = image[..., ::-1]  # BGR to RGB
 # Detect layout
 layout = model.detect(image)
 # Print detected elements
 for block in layout:
     print(f"Type: {block.type}, Score: {block.score:.3f}, Box: {block.coordinates}")
 # Visualize results
 viz = lp.draw_box(image, layout, box_width=3, show_element_type=True)
 plt.figure(figsize=(12, 16))
 plt.imshow(viz)
 plt.axis("off")
 plt.show()
+```
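The detection loop above prints every block the model returns. A common next step is to keep only some element types and put them in a rough reading order. The following is a minimal sketch of that idea, not part of the model's own tooling. It assumes each block exposes `type`, `score`, and `coordinates` as `(x1, y1, x2, y2)`, matching the fields printed above. `Detection` and `select_blocks` are hypothetical names introduced for illustration, and the sample boxes are made up so the snippet runs without the model.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass
class Detection:
    """Hypothetical stand-in for one detected block (not a LayoutParser class)."""
    type: str
    score: float
    coordinates: Tuple[float, float, float, float]  # assumed (x1, y1, x2, y2)


def select_blocks(
    blocks: Iterable[Detection],
    wanted: Set[str],
    min_score: float = 0.8,
) -> List[Detection]:
    """Keep blocks whose label is in `wanted` and whose score is at least
    `min_score`, ordered top-to-bottom, then left-to-right, by top-left corner."""
    kept = [b for b in blocks if b.type in wanted and b.score >= min_score]
    return sorted(kept, key=lambda b: (b.coordinates[1], b.coordinates[0]))


# Made-up detections for illustration only; real values come from model.detect().
detections = [
    Detection("text", 0.95, (50.0, 400.0, 500.0, 700.0)),
    Detection("title", 0.91, (50.0, 40.0, 500.0, 90.0)),
    Detection("advertisement", 0.88, (520.0, 40.0, 900.0, 700.0)),
    Detection("text", 0.62, (50.0, 100.0, 500.0, 380.0)),
]

for block in select_blocks(detections, wanted={"title", "text"}):
    print(f"Type: {block.type}, Score: {block.score:.3f}, Box: {block.coordinates}")
```

Sorting by the top-left corner is only a rough approximation of reading order. Multi-column journal pages would likely need column grouping first.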
+## Acknowledgements
+This model builds upon the excellent work of:
+- [Detectron2](https://github.com/facebookresearch/detectron2/tree/main)
+- [LayoutParser](https://github.com/Layout-Parser/layout-parser?tab=readme-ov-file)
+We thank the contributors and maintainers of these projects for making their tools publicly available and supporting research.