Update README.md
README.md (changed)
````diff
@@ -42,14 +42,19 @@ Run the service:
 
 - With GPU support:
 ```
-docker run --rm --name pdf-document-layout-analysis --gpus '"device=0"' -p 5060:5060 --entrypoint ./start.sh huridocs/pdf-document-layout-analysis:v0.0.
 ```
 
 - Without GPU support:
 ```
-docker run --rm --name pdf-document-layout-analysis -p 5060:5060 --entrypoint ./start.sh huridocs/pdf-document-layout-analysis:v0.0.
 ```
 
 Get the segments from a PDF:
 
 curl -X POST -F 'file=@/PATH/TO/PDF/pdf_name.pdf' localhost:5060
````
````diff
@@ -77,6 +82,12 @@ Start the service:
 
 make start
 
 Get the segments from a PDF:
 
 curl -X POST -F 'file=@/PATH/TO/PDF/pdf_name.pdf' localhost:5060
````
````diff
@@ -110,6 +121,9 @@ Even though the visual model using more resources than the others, generally it'
 "sees" the whole page and has an idea about all the context. On the other hand, LightGBM models are performing slightly worse
 but they are much faster and more resource-friendly. It will only require your CPU power.
 
 ## Data
 
 As we mentioned, we are using the visual model that trained on [DocLayNet](https://github.com/DS4SD/DocLayNet) dataset.
````
````diff
@@ -42,14 +42,19 @@ Run the service:
 
 - With GPU support:
 ```
+docker run --rm --name pdf-document-layout-analysis --gpus '"device=0"' -p 5060:5060 --entrypoint ./start.sh huridocs/pdf-document-layout-analysis:v0.0.21
 ```
 
 - Without GPU support:
 ```
+docker run --rm --name pdf-document-layout-analysis -p 5060:5060 --entrypoint ./start.sh huridocs/pdf-document-layout-analysis:v0.0.21
 ```
 
+[OPTIONAL] OCR the PDF. Check supported languages (curl localhost:5060/info):
+
+curl -X POST -F 'language=en' -F 'file=@/PATH/TO/PDF/pdf_name.pdf' localhost:5060/ocr --output ocr_document.pdf
+
+
 Get the segments from a PDF:
 
 curl -X POST -F 'file=@/PATH/TO/PDF/pdf_name.pdf' localhost:5060
````
````diff
@@ -77,6 +82,12 @@ Start the service:
 
 make start
 
+
+[OPTIONAL] OCR the PDF. Check supported languages (curl localhost:5060/info):
+
+curl -X POST -F 'language=en' -F 'file=@/PATH/TO/PDF/pdf_name.pdf' localhost:5060/ocr --output ocr_document.pdf
+
+
 Get the segments from a PDF:
 
 curl -X POST -F 'file=@/PATH/TO/PDF/pdf_name.pdf' localhost:5060
````
````diff
@@ -110,6 +121,9 @@ Even though the visual model using more resources than the others, generally it'
 "sees" the whole page and has an idea about all the context. On the other hand, LightGBM models are performing slightly worse
 but they are much faster and more resource-friendly. It will only require your CPU power.
 
+The service converts PDFs to text-searchable PDFs using [Tesseract OCR](https://github.com/tesseract-ocr/tesseract) and [ocrmypdf](https://ocrmypdf.readthedocs.io/en/latest/index.html).
+
+
 ## Data
 
 As we mentioned, we are using the visual model that trained on [DocLayNet](https://github.com/DS4SD/DocLayNet) dataset.
````