seungwon12/layoutlm-document-extract

Token Classification


Document extract

This model is a LayoutLMv2 base model.

To use this model, you first have to preprocess your data with LayoutLMv2Processor.
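A minimal sketch of that preprocessing step, in case it helps. It assumes you already have the words and pixel-coordinate boxes from your own OCR, and that the processor is loaded from `microsoft/layoutlmv2-base-uncased` at the `no_ocr` revision; that checkpoint is my assumption, since this card does not say which processor files were used. The helper names `normalize_box` and `encode_page` are hypothetical. LayoutLMv2 expects boxes scaled to the 0-1000 range, which is what `normalize_box` does.

```python
PROCESSOR_ID = "microsoft/layoutlmv2-base-uncased"  # assumption, not stated in this card


def normalize_box(box, width, height):
    """Scale a pixel box (x0, y0, x1, y1) to the 0-1000 range LayoutLMv2 expects."""
    x0, y0, x1, y1 = box
    scaled = [
        int(1000 * x0 / width),
        int(1000 * y0 / height),
        int(1000 * x1 / width),
        int(1000 * y1 / height),
    ]
    # Clamp so slightly out-of-page OCR boxes stay inside the valid range.
    return [min(max(v, 0), 1000) for v in scaled]


def encode_page(image, words, boxes, processor_id=PROCESSOR_ID):
    """Encode one page. `image` is an RGB PIL image, `boxes` are pixel boxes."""
    # Imported here so the helper above can be used without transformers installed.
    from transformers import LayoutLMv2Processor

    # The "no_ocr" revision disables the built-in OCR, so we pass our own words and boxes.
    processor = LayoutLMv2Processor.from_pretrained(processor_id, revision="no_ocr")
    width, height = image.size
    norm_boxes = [normalize_box(b, width, height) for b in boxes]
    return processor(
        image,
        words,
        boxes=norm_boxes,
        truncation=True,
        padding="max_length",
        return_tensors="pt",
    )
```

For example, `normalize_box((50, 100, 150, 200), 500, 1000)` gives `[100, 100, 300, 200]`.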

Process

I trained this model on Korean-language invoice document images
Use the Naver Clova service to extract text from the images
Determine a label (target) for each text box
Combine the image text, bounding box positions, and labels
Encode the data with LayoutLMv2Processor
Run prediction on the encoded data with this model
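The last step could look roughly like the sketch below. It assumes `encoding` is the output of LayoutLMv2Processor for a single page (created with `return_tensors="pt"` and a fast tokenizer, so `word_ids` is available), and that this checkpoint loads as `LayoutLMv2ForTokenClassification`, which the Token Classification tag suggests but the card does not confirm. LayoutLMv2 in transformers also needs detectron2 installed. The helper names `first_token_labels` and `predict_word_labels` are hypothetical.

```python
MODEL_ID = "seungwon12/layoutlm-document-extract"


def first_token_labels(word_ids, pred_ids, id2label):
    """Keep the prediction of the first sub-token of each word.

    `word_ids` maps every token to its word index (None for special/padding tokens).
    Returns one label per word, in word order.
    """
    labels = {}
    for word_idx, pred in zip(word_ids, pred_ids):
        if word_idx is None or word_idx in labels:
            continue  # skip special tokens and later sub-tokens of a word
        labels[word_idx] = id2label[pred]
    return [labels[i] for i in sorted(labels)]


def predict_word_labels(encoding, model_id=MODEL_ID):
    """Run the model on one encoded page and return one label per word."""
    # Imported here so first_token_labels can be used without torch installed.
    import torch
    from transformers import LayoutLMv2ForTokenClassification

    model = LayoutLMv2ForTokenClassification.from_pretrained(model_id)
    model.eval()
    with torch.no_grad():
        logits = model(**encoding).logits  # shape: (batch, seq_len, num_labels)
    pred_ids = logits.argmax(-1)[0].tolist()
    return first_token_labels(encoding.word_ids(0), pred_ids, model.config.id2label)
```

Taking only the first sub-token per word is one common convention, not necessarily how this model was trained or evaluated. Words cut off by truncation get no label, so the returned list can be shorter than the input word list.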
