sushruthb committed
Commit 24e77e1 · verified · 1 Parent(s): 3d8e705

Update README.md

Files changed (1)
  1. README.md +31 -2
README.md CHANGED
@@ -11,15 +11,21 @@ language:
  - sv
  pipeline_tag: object-detection
  ---
+
  # Historical Document Layout Detection Model
+
  A fine-tuned Mask R-CNN model (via LayoutParser/Detectron2) for detecting layout
  elements in historical Swedish medical journal pages.
+
  ## Model Details
+
  - **Model type:** Mask R-CNN (ResNet backbone)
  - **Framework:** Detectron2 / LayoutParser
  - **Fine-tuned for:** Historical document layout analysis
  - **Language of source documents:** Swedish
+
  ## Label Map
+
  | ID | Label |
  |----|------------------|
  | 0 | Advertisement |
 
@@ -31,21 +37,30 @@ elements in historical Swedish medical journal pages.
  | 6 | Table |
  | 7 | Text |
  | 8 | Title |
+
  ## Usage
+
  ### Installation
+
  Follow instructions at:
  https://detectron2.readthedocs.io/en/latest/tutorials/install.html
+
  ### Finetuning
- Follow instructions at:
+
+ Follow instructions at:
  https://detectron2.readthedocs.io/en/latest/tutorials/training.html
+
  ### Inference
+
  ```python
  import cv2
  import layoutparser as lp
  import matplotlib.pyplot as plt
+
  # Configuration
  model_config_path = "config_mask_rcnn_resized.yaml"
  model_path = "model_final_LP.pth"
+
  label_map = {
      0: "advertisement",
      1: "author",
 
@@ -57,6 +72,7 @@ label_map = {
      7: "text",
      8: "title",
  }
+
  # Load model
  model = lp.models.Detectron2LayoutModel(
      config_path=model_config_path,
 
@@ -64,18 +80,31 @@ model = lp.models.Detectron2LayoutModel(
      extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
      label_map=label_map,
  )
+
  # Load and process image
  image = cv2.imread("<path_to_image>")
  image = image[..., ::-1]  # BGR to RGB
+
  # Detect layout
  layout = model.detect(image)
+
  # Print detected elements
  for block in layout:
      print(f"Type: {block.type}, Score: {block.score:.3f}, Box: {block.coordinates}")
+
  # Visualize results
  viz = lp.draw_box(image, layout, box_width=3, show_element_type=True)
  plt.figure(figsize=(12, 16))
  plt.imshow(viz)
  plt.axis("off")
  plt.show()
- ```
+ ```
+
+ ## Acknowledgements
+
+ This model builds upon the excellent work of:
+
+ - [Detectron2](https://github.com/facebookresearch/detectron2/tree/main)
+ - [LayoutParser](https://github.com/Layout-Parser/layout-parser?tab=readme-ov-file)
+
+ We thank the contributors and maintainers of these projects for making their tools publicly available and supporting research.
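
The inference snippet in this README prints each detected block's type, score, and box. As a possible follow-up step, the sketch below groups detections by label and orders each group top to bottom. It is not part of the commit: `Detection` and `group_by_type` are hypothetical names introduced here, the sample values are made up, and the box layout `(x1, y1, x2, y2)` is an assumption about what `block.coordinates` returns.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Hypothetical stand-in for one detected block (not a LayoutParser class)."""
    type: str
    score: float
    coordinates: tuple  # assumed (x1, y1, x2, y2)


def group_by_type(detections, min_score=0.8):
    """Keep detections scoring at least min_score and group them by label.

    Each group is sorted top-to-bottom, then left-to-right.
    """
    groups = defaultdict(list)
    for det in detections:
        if det.score >= min_score:
            groups[det.type].append(det)
    for dets in groups.values():
        dets.sort(key=lambda d: (d.coordinates[1], d.coordinates[0]))
    return dict(groups)


# Made-up sample values, for illustration only.
sample = [
    Detection("text", 0.95, (50, 400, 500, 700)),
    Detection("title", 0.91, (50, 40, 500, 90)),
    Detection("text", 0.88, (50, 120, 500, 380)),
    Detection("table", 0.42, (60, 720, 480, 900)),
]

grouped = group_by_type(sample)
for label, dets in grouped.items():
    print(label, [d.coordinates for d in dets])
```

With real model output, one could presumably build each `Detection` from the `block.type`, `block.score`, and `block.coordinates` values that the README loop already prints. The default `min_score` of 0.8 mirrors the `SCORE_THRESH_TEST` value in the README, so it only has an effect when raised above that threshold.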