How to use seungwon12/layoutlm-document-extract with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("token-classification", model="seungwon12/layoutlm-document-extract")

# Load model directly
from transformers import AutoProcessor, AutoModelForTokenClassification

processor = AutoProcessor.from_pretrained("seungwon12/layoutlm-document-extract")
model = AutoModelForTokenClassification.from_pretrained("seungwon12/layoutlm-document-extract")
```
Document extract
This model is based on the LayoutLMv2 base model.
To use it, you must first preprocess your data with LayoutLMv2Processor.
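As a minimal sketch of that preprocessing step, the helper below passes one page to a `LayoutLMv2Processor`. It assumes the words and boxes come from an external OCR service, so the processor's built-in OCR is turned off (`apply_ocr=False` on its image processor). The name `encode_page` and the `max_length` of 512 are illustrative choices, not taken from this model's training code.

```python
def encode_page(processor, image, words, boxes, word_labels=None, max_length=512):
    """Encode one document page for a LayoutLMv2 token-classification model.

    processor   : a transformers LayoutLMv2Processor with OCR disabled
    image       : PIL.Image of the page (RGB)
    words       : list[str], one entry per OCR text box
    boxes       : list[list[int]], [x0, y0, x1, y1] scaled to the 0-1000 range
    word_labels : optional list[int] of label ids (only needed for training)
    """
    if len(words) != len(boxes):
        raise ValueError("words and boxes must have the same length")
    if word_labels is not None and len(word_labels) != len(words):
        raise ValueError("word_labels must have the same length as words")

    kwargs = dict(
        boxes=boxes,
        padding="max_length",   # pad every page to the same token length
        truncation=True,        # cut pages longer than max_length
        max_length=max_length,
        return_tensors="pt",    # PyTorch tensors
    )
    if word_labels is not None:
        kwargs["word_labels"] = word_labels

    # The processor resizes the image, tokenizes the words, and aligns
    # boxes (and labels, if given) with the resulting tokens.
    return processor(image, words, **kwargs)
```

The returned encoding can typically be passed to the model as `model(**encoding)`; the predicted label id for each token is then the argmax over the last dimension of the output logits.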
Process
- I trained this model on images of Korean-language invoice documents
- Use the Naver Clova service to extract the text from the images
- Assign a label (target) to each text box
- Combine the extracted text, the bounding box positions, and the labels
- Encode the combined data with LayoutLMv2Processor
- Run this model on the encoded data to get predictions
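The combining step above can be sketched as follows. LayoutLMv2 expects bounding boxes scaled to a 0-1000 range relative to the page size, so pixel coordinates have to be normalized first. The OCR record layout (`text`, `box`, `label`), the label names, and the helper names are hypothetical; the actual Naver Clova output format and the label set used for this model are not described here.

```python
def normalize_box(box, width, height):
    """Scale a pixel box [x0, y0, x1, y1] to the 0-1000 range used by LayoutLMv2."""
    x0, y0, x1, y1 = box
    scaled = [
        int(1000 * x0 / width),
        int(1000 * y0 / height),
        int(1000 * x1 / width),
        int(1000 * y1 / height),
    ]
    # Clamp in case the OCR box slightly overshoots the image border.
    return [max(0, min(1000, v)) for v in scaled]


def build_sample(ocr_boxes, width, height, label2id):
    """Combine OCR text, box positions, and labels into parallel lists."""
    words, boxes, labels = [], [], []
    for item in ocr_boxes:
        text = item["text"].strip()
        if not text:
            continue  # skip empty OCR boxes
        words.append(text)
        boxes.append(normalize_box(item["box"], width, height))
        labels.append(label2id[item["label"]])
    return words, boxes, labels


# Hypothetical OCR output for an 800x1000 pixel invoice image.
ocr_boxes = [
    {"text": "합계", "box": [80, 900, 160, 930], "label": "TOTAL_KEY"},
    {"text": "15,000", "box": [600, 900, 720, 930], "label": "TOTAL_VALUE"},
]
label2id = {"O": 0, "TOTAL_KEY": 1, "TOTAL_VALUE": 2}  # hypothetical label set

words, boxes, labels = build_sample(ocr_boxes, 800, 1000, label2id)
```

The resulting `words`, `boxes`, and `labels` lists are in the shape that LayoutLMv2Processor accepts as its text, `boxes`, and `word_labels` arguments when its own OCR is disabled.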