	---
	license: apache-2.0
	tags:
	- object-detection
	- document-layout-analysis
	- historical-documents
	- layoutparser
	- detectron2
	- mask-rcnn
	language:
	- sv
	pipeline_tag: object-detection
	base_model:
	- layoutparser/detectron2
	---

	# Historical Document Layout Detection Model

	A fine-tuned Mask R-CNN model (via LayoutParser/Detectron2) for detecting layout elements in pages of historical Swedish medical journals. This is the lighter variant; the more complex model is available at [SweMPer-layout](https://huggingface.co/cdhu-uu/SweMPer-layout).

	This model was developed as part of the research project:
	Communicating Medicine (SweMPer): Digitalisation of Swedish Medical Periodicals, 1781–2011
	(Project ID: IN22-0017), funded by Riksbankens Jubileumsfond.

	## Project page
	https://www.uu.se/en/department/history-of-science-and-ideas/research/research-projects-and-programmes/communicating-medicine-swemper

	## Model Details

	- Model type: Mask R-CNN (ResNet backbone)
	- Framework: Detectron2 / LayoutParser
	- Fine-tuned for: Historical document layout analysis
	- Language of source documents: Swedish

	## Label Map

	| ID | Label            |
	|----|------------------|
	| 0  | Advertisement    |
	| 1  | Author           |
	| 2  | Header or Footer |
	| 3  | Image            |
	| 4  | List             |
	| 5  | Page Number      |
	| 6  | Table            |
	| 7  | Text             |
	| 8  | Title            |

	## Evaluation Metrics

	The evaluation metrics for this model are as follows:

	| AP     | AP50   | AP75   | APs    | APm    | APl    |
	|:------:|:------:|:------:|:------:|:------:|:------:|
	| 64.325 | 88.948 | 69.214 | 40.350 | 55.117 | 67.543 |
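
The column names match the COCO-style average precision metrics reported by Detectron2's evaluator. Assuming that convention (the card does not state it, nor whether the figures are for boxes or masks), AP50 and AP75 count a detection as correct when its intersection over union (IoU) with a ground-truth region is at least 0.50 or 0.75, AP averages over IoU thresholds from 0.50 to 0.95, and APs, APm, and APl restrict the evaluation to small, medium, and large regions. The sketch below only illustrates the IoU test for boxes; `box_iou` is a hypothetical helper, not part of Detectron2 or LayoutParser, and the example boxes are made up.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero for disjoint boxes
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Made-up example: a predicted text block shifted 50 px to the left
predicted = (100, 100, 300, 200)
ground_truth = (150, 100, 350, 200)

iou = box_iou(predicted, ground_truth)
print(f"IoU = {iou:.3f}")
print("correct at IoU >= 0.50:", iou >= 0.50)
print("correct at IoU >= 0.75:", iou >= 0.75)
```

In this example the prediction would count towards AP50 but not AP75, which is one reason AP75 is typically lower than AP50.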

	## Usage

	### Installation

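	Install Detectron2 by following the official instructions at:
	https://detectron2.readthedocs.io/en/latest/tutorials/install.html

	The inference example also imports LayoutParser, OpenCV, and Matplotlib, which are installed separately (for example with `pip install layoutparser opencv-python matplotlib`).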
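
One way to confirm that the environment is ready is to check that each package used by the inference example can be found before loading the model. This is a minimal sketch using only the standard library; `missing_modules` is a hypothetical helper name, and the list of import names is taken from the inference example in this card.

```python
import importlib.util

# Import names used (directly or indirectly) by the inference example
REQUIRED_MODULES = ["detectron2", "layoutparser", "cv2", "matplotlib"]


def missing_modules(names):
    """Return the import names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

The check only confirms that the packages can be located; it does not verify that the installed Detectron2 build matches the local PyTorch or CUDA version.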

	### Finetuning

	To fine-tune the model on your own annotated pages, follow the Detectron2 training tutorial at:
	https://detectron2.readthedocs.io/en/latest/tutorials/training.html
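
As a rough illustration of what that tutorial involves, the sketch below starts from this model's weights and continues training on a COCO-format dataset with Detectron2's `DefaultTrainer`. It is not the authors' training setup: the dataset name `swemper_train`, the annotation and image paths, and the learning rate and iteration count are all placeholder assumptions, and `build_finetune_overrides` and `finetune` are hypothetical helper names. The class count of 9 follows from the label map above.

```python
import os


def build_finetune_overrides(train_dataset, weights_path, num_classes=9,
                             base_lr=0.00025, max_iter=3000,
                             output_dir="./output"):
    """Return a flat [key, value, ...] list suitable for cfg.merge_from_list()."""
    return [
        "MODEL.WEIGHTS", weights_path,                # start from the released weights
        "MODEL.ROI_HEADS.NUM_CLASSES", num_classes,   # 9 classes in the label map
        "DATASETS.TRAIN", (train_dataset,),
        "DATASETS.TEST", (),
        "SOLVER.BASE_LR", base_lr,                    # placeholder value
        "SOLVER.MAX_ITER", max_iter,                  # placeholder value
        "OUTPUT_DIR", output_dir,
    ]


def finetune(config_path, coco_json, image_dir, weights_path):
    """Continue training on a COCO-format dataset (requires Detectron2)."""
    # Imported lazily so the helper above can be used without Detectron2 installed
    from detectron2.config import get_cfg
    from detectron2.data.datasets import register_coco_instances
    from detectron2.engine import DefaultTrainer

    register_coco_instances("swemper_train", {}, coco_json, image_dir)

    cfg = get_cfg()
    cfg.merge_from_file(config_path)
    cfg.merge_from_list(build_finetune_overrides("swemper_train", weights_path))
    os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)

    trainer = DefaultTrainer(cfg)
    trainer.resume_or_load(resume=False)
    trainer.train()


# Inspect the overrides without starting a training run
print(build_finetune_overrides("swemper_train", "SweMPer-layout-lite.pth"))
```

A call such as `finetune("config_mask_rcnn_resized.yaml", "<path_to_annotations.json>", "<path_to_images>", "SweMPer-layout-lite.pth")` would then start training; the annotation category IDs are assumed to follow the label map in this card.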

	### Inference

	```python
	import cv2
	import layoutparser as lp
	import matplotlib.pyplot as plt

	# Configuration: paths to the model config and weights files
	model_config_path = "config_mask_rcnn_resized.yaml"
	model_path = "SweMPer-layout-lite.pth"

	# Class IDs predicted by the model, mapped to readable names
	label_map = {
	    0: "advertisement",
	    1: "author",
	    2: "header_or_footer",
	    3: "image",
	    4: "list",
	    5: "page_no",
	    6: "table",
	    7: "text",
	    8: "title",
	}

	# Load model; detections with a score below 0.8 are discarded
	model = lp.models.Detectron2LayoutModel(
	    config_path=model_config_path,
	    model_path=model_path,
	    extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
	    label_map=label_map,
	)

	# Load and process image (cv2.imread returns None if the file cannot be read)
	image = cv2.imread("<path_to_image>")
	if image is None:
	    raise FileNotFoundError("Could not read the input image")
	image = image[..., ::-1]  # BGR to RGB

	# Detect layout
	layout = model.detect(image)

	# Print detected elements
	for block in layout:
	    print(f"Type: {block.type}, Score: {block.score:.3f}, Box: {block.coordinates}")

	# Visualize results
	viz = lp.draw_box(image, layout, box_width=3, show_element_type=True)
	plt.figure(figsize=(12, 16))
	plt.imshow(viz)
	plt.axis("off")
	plt.show()
	```
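
For downstream processing it can be convenient to turn the detected blocks into plain records. The sketch below relies only on the `type`, `score`, and `coordinates` attributes used in the loop above; `layout_to_records` is a hypothetical helper, and the stand-in blocks are made-up values so that the example runs without a model. Sorting by the top-left corner is a naive ordering that is unlikely to give the correct reading order on multi-column pages.

```python
import json
from types import SimpleNamespace


def layout_to_records(layout, min_score=0.0):
    """Convert detected blocks to dicts, sorted top-to-bottom, then left-to-right."""
    records = []
    for block in layout:
        if block.score < min_score:
            continue  # skip low-confidence detections
        x1, y1, x2, y2 = (float(v) for v in block.coordinates)
        records.append({
            "type": block.type,
            "score": round(float(block.score), 3),
            "box": [x1, y1, x2, y2],
        })
    records.sort(key=lambda r: (r["box"][1], r["box"][0]))
    return records


# Stand-in blocks; in practice pass the `layout` returned by model.detect(image)
demo_layout = [
    SimpleNamespace(type="text", score=0.97, coordinates=(80, 300, 900, 1200)),
    SimpleNamespace(type="title", score=0.93, coordinates=(80, 120, 900, 220)),
    SimpleNamespace(type="page_no", score=0.55, coordinates=(860, 40, 900, 70)),
]

print(json.dumps(layout_to_records(demo_layout, min_score=0.8), indent=2))
```

The resulting list can be written to a JSON file alongside each page image, or filtered by `type` to keep only, say, text and title regions before OCR.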

	## Acknowledgements

	This work was carried out within the project:
	Communicating Medicine (SweMPer): Digitalisation of Swedish Medical Periodicals, 1781–2011
	(Project ID: IN22-0017), funded by Riksbankens Jubileumsfond.

	We gratefully acknowledge the support of the funder and project collaborators.

	This model builds upon the excellent work of:

	- [Detectron2](https://github.com/facebookresearch/detectron2/tree/main)
	- [LayoutParser](https://github.com/Layout-Parser/layout-parser?tab=readme-ov-file)

	We thank the contributors and maintainers of these projects for making their tools publicly available and supporting research.