Geremia Taglialatela committed
Commit 28a9583 · 1 Parent(s): b375290
Fix link to Surya supported languages
Clicking on the current "supported languages" link results in a 404 error.
Loosely related to #431.
The `CODE_TO_LANGUAGE` constant was moved to a different location
in VikParuchuri/surya@1de6d91.
- README.md +2 -2
- marker/scripts/server.py +1 -1
README.md
@@ -117,7 +117,7 @@ Options:
 - `config --help`: List all available builders, processors, and converters, and their associated configuration. These values can be used to build a JSON configuration file for additional tweaking of marker defaults.
 - `--converter_cls`: One of `marker.converters.pdf.PdfConverter` (default) or `marker.converters.table.TableConverter`. The `PdfConverter` will convert the whole PDF, the `TableConverter` will only extract and convert tables.
 
-The list of supported languages for surya OCR is [here](https://github.com/VikParuchuri/surya/blob/master/surya/languages.py). If you don't need OCR, marker can work with any language.
+The list of supported languages for surya OCR is [here](https://github.com/VikParuchuri/surya/blob/master/surya/recognition/languages.py). If you don't need OCR, marker can work with any language.
 
 ## Convert multiple files
 
@@ -445,4 +445,4 @@ This work would not have been possible without amazing open source models and da
 - Pypdfium2/pdfium
 - DocLayNet from IBM
 
-Thank you to the authors of these models and datasets for making them available to the community!
+Thank you to the authors of these models and datasets for making them available to the community!
marker/scripts/server.py
@@ -62,7 +62,7 @@ class CommonParams(BaseModel):
     ] = None
     languages: Annotated[
         Optional[str],
-        Field(description="Comma separated list of languages to use for OCR. Must be either the names or codes from from https://github.com/VikParuchuri/surya/blob/master/surya/languages.py.", example=None)
+        Field(description="Comma separated list of languages to use for OCR. Must be either the names or codes from from https://github.com/VikParuchuri/surya/blob/master/surya/recognition/languages.py.", example=None)
     ] = None
     force_ocr: Annotated[
         bool,
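
The `languages` field above accepts a comma separated string of language names or codes. As a rough illustration of how such a value might be normalised, here is a minimal sketch. The `parse_languages` helper and the three-entry `CODE_TO_LANGUAGE` table are hypothetical stand-ins written for this example; the real mapping lives in surya and is much larger, and this is not marker's actual parsing code.

```python
from typing import Dict, List, Optional

# Hypothetical subset of a code-to-name mapping, for illustration only.
# The real table is defined in surya (see the link in the diff above).
CODE_TO_LANGUAGE: Dict[str, str] = {
    "en": "English",
    "it": "Italian",
    "de": "German",
}


def parse_languages(raw: Optional[str]) -> Optional[List[str]]:
    """Split a comma separated string and normalise each entry to a code.

    Entries may be language codes ("en") or names ("English"), matched
    case-insensitively. Unknown entries raise ValueError.
    """
    if raw is None or not raw.strip():
        return None

    # Reverse lookup so that names can be mapped back to codes.
    name_to_code = {name.lower(): code for code, name in CODE_TO_LANGUAGE.items()}

    codes: List[str] = []
    for item in raw.split(","):
        token = item.strip().lower()
        if not token:
            continue  # tolerate stray commas
        if token in CODE_TO_LANGUAGE:
            codes.append(token)
        elif token in name_to_code:
            codes.append(name_to_code[token])
        else:
            raise ValueError(f"Unsupported language: {item.strip()!r}")
    return codes
```

With this sketch, `parse_languages("en, Italian")` yields a list of codes, and a missing or empty value yields `None`, mirroring the field's `None` default.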