adelevett committed
Commit dbe48bf · verified · 1 Parent(s): 258fdc9

Upload 3 files

Files changed (3)
  1. README.md +44 -7
  2. app.py +55 -0
  3. requirements.txt +2 -0
README.md CHANGED
@@ -1,15 +1,52 @@
  ---
- title: Docling Layout Demo
- emoji: 📚
- colorFrom: pink
- colorTo: gray
  sdk: gradio
  sdk_version: 6.9.0
- python_version: '3.12'
  app_file: app.py
  pinned: false
  license: mit
- short_description: docling-pp-doc-layout based document conversion demo
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
  ---
+ title: PP-DocLayoutV3 Empirical Parser
+ emoji: 📄
+ colorFrom: blue
+ colorTo: indigo
  sdk: gradio
  sdk_version: 6.9.0
  app_file: app.py
  pinned: false
  license: mit
  ---

+ # PP-DocLayoutV3 Pipeline: Empirical Iteration Guide
+
+ This application provides an extraction pipeline using `docling-pp-doc-layout`
+ running on Hugging Face's ZeroGPU infrastructure (large tier: half of an NVIDIA
+ H200, 70 GB VRAM). Because instance-segmentation-based layout parsing exhibits
+ high variance in memory utilisation depending on polygon density and image
+ resolution, this Space is engineered for iterative, data-driven optimisation.
+
+ ## Architecture
+
+ | Component | Value |
+ |---|---|
+ | Hardware | Hugging Face ZeroGPU (`@spaces.GPU`, large tier: half of an H200) |
+ | SDK | Gradio 6.9.0 |
+ | Python | 3.12 (ZeroGPU supports 3.12.12 and 3.10.13; 3.13 is **not** supported) |
+ | Layout model | `PaddlePaddle/PP-DocLayoutV3_safetensors` |
+ | GPU timeout | 120 s (`duration=120`) |
+
+ ## Iterative Deployment Protocol
+
+ ### 1. Memory Profiling and Batch Optimisation
+
+ `PPDocLayoutV3Options` is initialised with `batch_size=2` as a conservative
+ baseline. Monitor the Space logs for out-of-memory (OOM) failures. The large
+ tier provides 70 GB VRAM, so `batch_size` can be raised one step at a time
+ until peak utilisation approaches that ceiling.
+
+ ### 2. Confidence Threshold Calibration
+
+ `confidence_threshold=0.5` is the default decision boundary. Evaluate output
+ classifications against a validation set:
+
+ - **Higher threshold** → higher precision, fewer false positives
+ - **Lower threshold** → higher recall, fewer missed bounding boxes
+
+ ### 3. Queue Latency and Hardware Timeouts
+
+ ZeroGPU enforces a 60 s default GPU lease. The `@spaces.GPU(duration=120)`
+ decorator extends this to 120 s. If empirical data shows consistent sub-60 s
+ inference, reduce `duration` to improve queue priority for Space visitors.
app.py ADDED
@@ -0,0 +1,55 @@
+ import gradio as gr
+ import spaces
+ from docling.datamodel.base_models import InputFormat
+ from docling.document_converter import DocumentConverter, PdfFormatOption
+ from docling.datamodel.pipeline_options import PdfPipelineOptions
+ from docling_pp_doc_layout.options import PPDocLayoutV3Options
+
+ # Global initialisation: the pipeline is constructed lazily on the first
+ # convert() call, which happens inside @spaces.GPU, so decide_device()
+ # correctly resolves "cuda:0" when the H200 is allocated.
+ pipeline_options = PdfPipelineOptions(
+     layout_options=PPDocLayoutV3Options(
+         batch_size=2,
+         confidence_threshold=0.5,
+     )
+ )
+
+ converter = DocumentConverter(
+     format_options={
+         InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)
+     }
+ )
+
+
+ @spaces.GPU(duration=120)
+ def infer_layout(file_path: str | None):
+     if not file_path:
+         return {"error": "No file uploaded"}
+     try:
+         result = converter.convert(file_path)
+         structured_data = []
+         for item, _level in result.document.iterate_items():
+             structured_data.append({
+                 "type": type(item).__name__,
+                 "content": getattr(item, "text", "No text mapping"),
+             })
+         return structured_data
+     except Exception as e:
+         return {"runtime_exception": str(e)}
+
+
+ with gr.Blocks(title="PP-DocLayoutV3 Empirical Parser") as interface:
+     gr.Markdown(
+         "## Layout Detection Inference\n"
+         "Upload a PDF to parse structural components through the "
+         "PaddlePaddle PP-DocLayoutV3 model."
+     )
+     with gr.Row():
+         pdf_input = gr.File(label="Source Document", file_types=[".pdf"])
+         json_output = gr.JSON(label="Structured Extraction Matrix")
+     execute_btn = gr.Button("Initialize Inference")
+     execute_btn.click(fn=infer_layout, inputs=pdf_input, outputs=json_output)
+
+ if __name__ == "__main__":
+     interface.launch()
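
Note on `confidence_threshold=0.5`: the README asks for this value to be checked against a validation set. A minimal sketch of such a check is given here, assuming detections have already been matched to ground-truth regions by some separate step. The function `precision_recall`, the `(score, is_correct)` pair format, and the sample data are hypothetical and not taken from the commit or from the `docling-pp-doc-layout` API.

```python
def precision_recall(detections, num_ground_truth, threshold):
    """Precision and recall for the detections kept at `threshold`.

    `detections` is a list of (score, is_correct) pairs, where is_correct
    says whether the detection matched a ground-truth region.
    Precision is reported as 1.0 when nothing is kept (a convention).
    """
    kept = [ok for score, ok in detections if score >= threshold]
    true_pos = sum(kept)
    precision = true_pos / len(kept) if kept else 1.0
    recall = true_pos / num_ground_truth if num_ground_truth else 0.0
    return precision, recall


# Hypothetical, hand-made validation data with 4 ground-truth regions.
detections = [(0.95, True), (0.80, True), (0.60, False), (0.55, True), (0.30, False)]
for t in (0.3, 0.5, 0.7):
    p, r = precision_recall(detections, num_ground_truth=4, threshold=t)
    print(f"threshold={t:.1f} precision={p:.2f} recall={r:.2f}")
```

On this toy data, raising the threshold from 0.3 to 0.7 lifts precision from 0.60 to 1.00 while recall drops from 0.75 to 0.50, which is the trade-off the README describes.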
requirements.txt ADDED
@@ -0,0 +1,2 @@
+ docling-pp-doc-layout
+ spaces
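
Note on README section 1: the step-by-step `batch_size` increase can be expressed as a small search loop. This is a minimal sketch under the assumption that each trial is wrapped in a caller-supplied function. `largest_safe_batch_size` and `try_batch` are hypothetical names, and the fake trial at the bottom stands in for a real conversion run on the GPU.

```python
def largest_safe_batch_size(try_batch, start=2, step=1, limit=64):
    """Raise the batch size step by step until `try_batch` reports failure.

    `try_batch(n)` is expected to run one representative conversion with
    batch size n and return True on success, False on an out-of-memory
    failure. Returns the largest size that succeeded, or None if even
    `start` fails. The search stops at `limit`.
    """
    best = None
    size = start
    while size <= limit:
        if not try_batch(size):
            break
        best = size
        size += step
    return best


# Stand-in for a real trial: pretend anything above 9 runs out of memory.
def fake_trial(n):
    return n <= 9


print(largest_safe_batch_size(fake_trial))  # 9
```

In practice it seems prudent to deploy a value somewhat below the largest passing size, since memory use varies with page content.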