Are image features used in this LayoutLM-based model?

by KshitizM - opened Jan 14, 2023

Hello,

I don't think the document image features are used anywhere in this model, but image is a required (non-Optional) argument in the DocumentQuestionAnsweringPipeline here:
https://github.com/huggingface/transformers/blob/b2c863a3196150850d17548f25ee0575bccb8224/src/transformers/pipelines/document_question_answering.py#L188
I get that it may be needed for OCR (Tesseract), but if I provide word_boxes and use a LayoutLM (v1) based model, the image features should have no use.

So I just want to confirm: are image features actually being used in this LayoutLM (v1) based model?

Thanks :)

Impira org Jan 17, 2023

You can pass None as the image for LayoutLMv1, and the pipeline will succeed (as long as you provide word_boxes).
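For anyone landing here later, this is a rough sketch of what that call might look like. The model name `impira/layoutlm-document-qa` is an assumption (the thread does not name the checkpoint), and `normalize_box`, `build_word_boxes`, and `ask` are hypothetical helper names made up for illustration. The pipeline documentation describes word_boxes as (word, box) pairs with boxes normalized to the 0-1000 range, so pixel coordinates from your own OCR need rescaling first.

```python
def normalize_box(box, page_width, page_height):
    """Scale a pixel-space (x0, y0, x1, y1) box to the 0-1000 range."""
    x0, y0, x1, y1 = box
    return [
        int(1000 * x0 / page_width),
        int(1000 * y0 / page_height),
        int(1000 * x1 / page_width),
        int(1000 * y1 / page_height),
    ]


def build_word_boxes(words, pixel_boxes, page_width, page_height):
    """Pair each word with its normalized box, as (word, [x0, y0, x1, y1])."""
    return [
        (word, normalize_box(box, page_width, page_height))
        for word, box in zip(words, pixel_boxes)
    ]


def ask(question, word_boxes, model_name="impira/layoutlm-document-qa"):
    """Run document QA from precomputed word boxes, with no image.

    model_name is an assumed checkpoint; swap in the LayoutLMv1-based
    model you are actually using.
    """
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import pipeline

    qa = pipeline("document-question-answering", model=model_name)
    # image=None skips image loading and OCR; only word_boxes are used.
    return qa(image=None, question=question, word_boxes=word_boxes)


# Example input from a hypothetical 1000x500 pixel page.
words = ["Invoice", "Number:", "12345"]
pixel_boxes = [(100, 50, 300, 80), (320, 50, 520, 80), (540, 50, 700, 80)]
word_boxes = build_word_boxes(words, pixel_boxes, 1000, 500)
```

Calling `ask("What is the invoice number?", word_boxes)` would then download the model and run the pipeline. Since no image is passed, OCR is skipped and the answer depends only on the words and boxes you supply.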

KshitizM changed discussion status to closed Jan 17, 2023

KshitizM changed discussion status to open Jan 20, 2023

KshitizM changed discussion status to closed Jan 20, 2023
