Vik Paruchuri committed on
Commit e006e5a · 1 Parent(s): 0e97894

Add documentation for LLM mode

  1. README.md +4 -2
README.md CHANGED
@@ -9,6 +9,7 @@ Marker converts PDFs to markdown, JSON, and HTML quickly and accurately.
  - Extracts and saves images along with the markdown
  - Converts equations to latex
  - Easily extensible with your own formatting and logic
+ - Optionally boost accuracy with an LLM
  - Works on GPU, CPU, or MPS
  
  ## How it works
@@ -99,10 +100,11 @@ marker_single /path/to/file.pdf
  
  Options:
  - `--output_dir PATH`: Directory where output files will be saved. Defaults to the value specified in settings.OUTPUT_DIR.
- - `--debug`: Enable debug mode for additional logging and diagnostic information.
  - `--output_format [markdown|json|html]`: Specify the format for the output results.
+ - `--use_llm`: Uses an LLM to improve accuracy. You must set your Gemini API key using the `GOOGLE_API_KEY` env var.
  - `--page_range TEXT`: Specify which pages to process. Accepts comma-separated page numbers and ranges. Example: `--page_range "0,5-10,20"` will process pages 0, 5 through 10, and page 20.
  - `--force_ocr`: Force OCR processing on the entire document, even for pages that might contain extractable text.
+ - `--debug`: Enable debug mode for additional logging and diagnostic information.
  - `--processors TEXT`: Override the default processors by providing their full module paths, separated by commas. Example: `--processors "module1.processor1,module2.processor2"`
  - `--config_json PATH`: Path to a JSON configuration file containing additional settings.
  - `--languages TEXT`: Optionally specify which languages to use for OCR processing. Accepts a comma-separated list. Example: `--languages "eng,fra,deu"` for English, French, and German.
@@ -127,7 +129,6 @@ NUM_DEVICES=4 NUM_WORKERS=15 marker_chunk_convert ../pdf_in ../md_out
  
  - `NUM_DEVICES` is the number of GPUs to use. Should be `2` or greater.
  - `NUM_WORKERS` is the number of parallel processes to run on each GPU.
- -
  
  ## Use from python
  
@@ -332,6 +333,7 @@ Note that this is not a very robust API, and is only intended for small-scale us
  
  There are some settings that you may find useful if things aren't working the way you expect:
  
+ - If you have issues with accuracy, try setting `--use_llm` to use an LLM to improve quality. You must set `GOOGLE_API_KEY` to a Gemini API key for this to work.
  - Make sure to set `force_ocr` if you see garbled text - this will re-OCR the document.
  - `TORCH_DEVICE` - set this to force marker to use a given torch device for inference.
  - If you're getting out of memory errors, decrease worker count. You can also try splitting up long PDFs into multiple files.
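
The new `--use_llm` option depends on the `GOOGLE_API_KEY` environment variable, so a wrapper script may want to check for the key before launching the conversion. The following is a minimal sketch of that check, not part of marker itself: `build_marker_command` and `run_marker` are hypothetical helper names, and only the `marker_single` command, the `--use_llm` and `--output_format` flags, and the `GOOGLE_API_KEY` variable come from the README text in this commit.

```python
import os
import subprocess


def build_marker_command(pdf_path, use_llm=False, output_format=None):
    """Assemble the argument list for marker_single (hypothetical helper)."""
    cmd = ["marker_single", pdf_path]
    if output_format is not None:
        cmd += ["--output_format", output_format]
    if use_llm:
        # Flag documented in this commit; takes no value.
        cmd.append("--use_llm")
    return cmd


def run_marker(pdf_path, use_llm=False, output_format=None, env=None):
    """Run marker_single, failing early if LLM mode lacks an API key."""
    env = dict(os.environ if env is None else env)
    if use_llm and not env.get("GOOGLE_API_KEY"):
        raise RuntimeError(
            "--use_llm requires GOOGLE_API_KEY to be set to a Gemini API key"
        )
    cmd = build_marker_command(pdf_path, use_llm, output_format)
    # check=True raises CalledProcessError on a non-zero exit status.
    return subprocess.run(cmd, env=env, check=True)
```

Calling `run_marker("/path/to/file.pdf", use_llm=True)` would then either raise immediately when the key is missing or invoke `marker_single /path/to/file.pdf --use_llm`, assuming marker is installed and on the `PATH`.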