Update README.md
README.md (changed)
````diff
@@ -42,14 +42,19 @@ Run the service:
 
 - With GPU support:
 ```
-docker run --rm --name pdf-document-layout-analysis --gpus '"device=0"' -p 5060:5060 --entrypoint ./start.sh huridocs/pdf-document-layout-analysis:v0.0.
 ```
 
 - Without GPU support:
 ```
-docker run --rm --name pdf-document-layout-analysis -p 5060:5060 --entrypoint ./start.sh huridocs/pdf-document-layout-analysis:v0.0.
 ```
 
 Get the segments from a PDF:
 
 curl -X POST -F 'file=@/PATH/TO/PDF/pdf_name.pdf' localhost:5060
````
````diff
@@ -77,6 +82,12 @@ Start the service:
 
 make start
 
 Get the segments from a PDF:
 
 curl -X POST -F 'file=@/PATH/TO/PDF/pdf_name.pdf' localhost:5060
````
````diff
@@ -110,6 +121,9 @@ Even though the visual model using more resources than the others, generally it'
 "sees" the whole page and has an idea about all the context. On the other hand, LightGBM models are performing slightly worse
 but they are much faster and more resource-friendly. It will only require your CPU power.
 
 ## Data
 
 As we mentioned, we are using the visual model that trained on [DocLayNet](https://github.com/DS4SD/DocLayNet) dataset.
````
````diff
@@ -42,14 +42,19 @@ Run the service:
 
 - With GPU support:
 ```
+docker run --rm --name pdf-document-layout-analysis --gpus '"device=0"' -p 5060:5060 --entrypoint ./start.sh huridocs/pdf-document-layout-analysis:v0.0.21
 ```
 
 - Without GPU support:
 ```
+docker run --rm --name pdf-document-layout-analysis -p 5060:5060 --entrypoint ./start.sh huridocs/pdf-document-layout-analysis:v0.0.21
 ```
 
+[OPTIONAL] OCR the PDF. Check supported languages (curl localhost:5060/info):
+
+curl -X POST -F 'language=en' -F 'file=@/PATH/TO/PDF/pdf_name.pdf' localhost:5060/ocr --output ocr_document.pdf
+
+
 Get the segments from a PDF:
 
 curl -X POST -F 'file=@/PATH/TO/PDF/pdf_name.pdf' localhost:5060
````
````diff
@@ -77,6 +82,12 @@ Start the service:
 
 make start
 
+
+[OPTIONAL] OCR the PDF. Check supported languages (curl localhost:5060/info):
+
+curl -X POST -F 'language=en' -F 'file=@/PATH/TO/PDF/pdf_name.pdf' localhost:5060/ocr --output ocr_document.pdf
+
+
 Get the segments from a PDF:
 
 curl -X POST -F 'file=@/PATH/TO/PDF/pdf_name.pdf' localhost:5060
````
````diff
@@ -110,6 +121,9 @@ Even though the visual model using more resources than the others, generally it'
 "sees" the whole page and has an idea about all the context. On the other hand, LightGBM models are performing slightly worse
 but they are much faster and more resource-friendly. It will only require your CPU power.
 
+The service converts PDFs to text-searchable PDFs using [Tesseract OCR](https://github.com/tesseract-ocr/tesseract) and [ocrmypdf](https://ocrmypdf.readthedocs.io/en/latest/index.html).
+
+
 ## Data
 
 As we mentioned, we are using the visual model that trained on [DocLayNet](https://github.com/DS4SD/DocLayNet) dataset.
````