Alfonso Velasco
metadata
title: LayoutLMv3 Document Extraction
emoji: 📄
colorFrom: blue
colorTo: green
sdk: docker
app_file: app.py
pinned: false

LayoutLMv3 Document Extraction with OCR

This Space extracts text from PDFs and images and returns a bounding box for each text element, using LayoutLMv3 and Tesseract OCR.

Features

  • PDF processing with OCR
  • Image text extraction
  • Bounding box coordinates for each text element
  • Multi-page PDF support
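
This README does not say which coordinate system the returned bounding boxes use. LayoutLMv3 itself takes boxes scaled to a 0-1000 range, so a client that receives pixel coordinates may need a conversion along these lines. This is a minimal sketch under that assumption; `normalize_box` and its parameters are hypothetical names, not part of this Space.

```python
def normalize_box(box, page_width, page_height):
    """Scale a pixel-space (x0, y0, x1, y1) box to the 0-1000 range.

    Hypothetical helper: assumes the box is given in pixels relative to
    the top-left corner of a page of the given width and height.
    """
    x0, y0, x1, y1 = box
    return [
        int(1000 * x0 / page_width),
        int(1000 * y0 / page_height),
        int(1000 * x1 / page_width),
        int(1000 * y1 / page_height),
    ]


# Example: a word box on a 1000 x 2000 pixel page.
print(normalize_box((50, 100, 200, 150), page_width=1000, page_height=2000))
# prints [50, 50, 200, 75]
```

If the Space already returns normalized coordinates, no conversion is needed.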

API Usage

Send a POST request to /extract with a base64-encoded PDF or image:
```python
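# Sketch, not part of the Space's own code: a hypothetical helper showing
# only the base64 step named above. It turns raw PDF or image bytes into
# the ASCII string that a request body can carry. The request field names
# expected by /extract are not assumed here.
import base64


def encode_document(data: bytes) -> str:
    """Return the base64 text form of raw file bytes."""
    return base64.b64encode(data).decode("ascii")
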