---
title: KDDA Global Model - Invoices
emoji: 🐨
---

# Configuration

`title`: _string_
Display title for the Space.

`emoji`: _string_
Space emoji (only emoji characters are allowed).

`colorFrom`: _string_
Start color for the thumbnail gradient (red, yellow, green, blue, indigo, purple, pink, gray).

`colorTo`: _string_
End color for the thumbnail gradient (red, yellow, green, blue, indigo, purple, pink, gray).

`sdk`: _string_
Can be either `gradio` or `streamlit`.

`app_file`: _string_
Path to your main application file (which contains either `gradio` or `streamlit` Python code).
The path is relative to the root of the repository.

`pinned`: _boolean_
Whether the Space stays on top of your list.

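As a rough illustration of the fields above, the following Python sketch checks a front-matter mapping against the documented types and allowed values. The function name `validate_space_config` and the constants are hypothetical. This is not part of any Hugging Face tooling, and it encodes only the rules listed in this section.

```python
# Hypothetical helper: checks a Space config against the fields documented above.
ALLOWED_COLORS = {"red", "yellow", "green", "blue", "indigo", "purple", "pink", "gray"}
ALLOWED_SDKS = {"gradio", "streamlit"}

# Expected Python type for each documented key.
FIELD_TYPES = {
    "title": str,
    "emoji": str,
    "colorFrom": str,
    "colorTo": str,
    "sdk": str,
    "app_file": str,
    "pinned": bool,
}


def validate_space_config(config):
    """Return a list of readable problems; an empty list means none were found."""
    problems = []
    for key, value in config.items():
        expected = FIELD_TYPES.get(key)
        if expected is None:
            problems.append(f"unknown key: {key}")
        elif not isinstance(value, expected):
            problems.append(f"{key} should be {expected.__name__}")
    # Value checks only apply to string values; wrong types are reported above.
    for key in ("colorFrom", "colorTo"):
        value = config.get(key)
        if isinstance(value, str) and value not in ALLOWED_COLORS:
            problems.append(f"{key} must be one of {sorted(ALLOWED_COLORS)}")
    sdk = config.get("sdk")
    if isinstance(sdk, str) and sdk not in ALLOWED_SDKS:
        problems.append(f"sdk must be one of {sorted(ALLOWED_SDKS)}")
    return problems


# The front matter of this README, written as a Python dict.
config = {"title": "KDDA Global Model - Invoices", "emoji": "🐨"}
print(validate_space_config(config))  # []
```
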
# Custom LayoutLM Model for Invoice Processing

This repository hosts a custom [LayoutLM](https://huggingface.co/microsoft/layoutlm-base-uncased) model fine-tuned to extract key fields, such as amounts, dates, and names, from invoice documents.

## Model Overview

The model is based on the LayoutLMv2 architecture and was fine-tuned on a custom dataset of invoices. It performs token classification to extract the following entities:

- **Amount Including Tax**
- **Due Date**
- **Reference Number**
- **Customer Name**
- **Vendor Name**
- **Issue Date**
- **Amount**

The model uses a custom label set for these entities: each entity has a `B-` (beginning) and an `I-` (inside) label, and tokens outside any entity are labeled `O`.

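To make the tagging scheme concrete, here is a minimal sketch of how per-token `B-`/`I-`/`O` predictions could be grouped into entity strings. The helper `group_entities` and the sample tokens are hypothetical and purely illustrative; the repository's actual post-processing is not shown in this README.

```python
def group_entities(tokens, labels):
    """Group BIO-tagged tokens into (entity_type, text) pairs."""
    entities = []
    current_type, current_tokens = None, []

    def flush():
        # Close the entity being built, if any, and reset the buffer.
        nonlocal current_type, current_tokens
        if current_type is not None:
            entities.append((current_type, " ".join(current_tokens)))
        current_type, current_tokens = None, []

    for token, label in zip(tokens, labels):
        if label == "O":
            flush()
            continue
        # Split only on the first hyphen: entity names may contain spaces.
        prefix, entity_type = label.split("-", 1)
        if prefix == "B" or entity_type != current_type:
            # A B- tag, or an I- tag that does not continue the open entity,
            # starts a new entity.
            flush()
            current_type = entity_type
        current_tokens.append(token)
    flush()
    return entities


# Made-up example tokens and labels, not real model output.
tokens = ["Invoice", "from", "Acme", "Corp", "due", "2024-01-31"]
labels = ["O", "O", "B-Vendor Name", "I-Vendor Name", "O", "B-Due Date"]
print(group_entities(tokens, labels))
# [('Vendor Name', 'Acme Corp'), ('Due Date', '2024-01-31')]
```

Treating a stray `I-` tag as the start of a new entity is one lenient design choice; a stricter decoder could discard such tokens instead.
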
## Label Mapping

The model was trained with the following `label2id` and `id2label` mappings. Label names are case-sensitive; note the lowercase "tax" in `Amount Including tax`.

### `label2id` and `id2label` Mappings

```python
label2id = {
    'I-Customer Name': 0,
    'B-Issue Date': 1,
    'I-Issue Date': 2,
    'I-Due Date': 3,
    'I-Amount': 4,
    'B-Due Date': 5,
    'O': 6,
    'B-Amount Including tax': 7,
    'B-Customer Name': 8,
    'B-Amount': 9,
    'I-Amount Including tax': 10,
    'B-Vendor Name': 11,
    'I-Vendor Name': 12,
    'I-Reference Number': 13,
    'B-Reference Number': 14
}

id2label = {
    0: 'I-Customer Name',
    1: 'B-Issue Date',
    2: 'I-Issue Date',
    3: 'I-Due Date',
    4: 'I-Amount',
    5: 'B-Due Date',
    6: 'O',
    7: 'B-Amount Including tax',
    8: 'B-Customer Name',
    9: 'B-Amount',
    10: 'I-Amount Including tax',
    11: 'B-Vendor Name',
    12: 'I-Vendor Name',
    13: 'I-Reference Number',
    14: 'B-Reference Number'
}
```

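Because `id2label` is simply the inverse of `label2id`, one way to keep the two in sync is to derive one from the other. The sketch below does that and then decodes a list of predicted class ids into label strings. The `predicted_ids` values are made up for illustration and are not real model output.

```python
label2id = {
    'I-Customer Name': 0,
    'B-Issue Date': 1,
    'I-Issue Date': 2,
    'I-Due Date': 3,
    'I-Amount': 4,
    'B-Due Date': 5,
    'O': 6,
    'B-Amount Including tax': 7,
    'B-Customer Name': 8,
    'B-Amount': 9,
    'I-Amount Including tax': 10,
    'B-Vendor Name': 11,
    'I-Vendor Name': 12,
    'I-Reference Number': 13,
    'B-Reference Number': 14,
}

# Derive the inverse mapping instead of maintaining it by hand.
id2label = {idx: label for label, idx in label2id.items()}

# Sanity checks: ids are unique and contiguous, starting at 0.
assert len(id2label) == len(label2id)
assert sorted(id2label) == list(range(len(label2id)))

# Hypothetical per-token predictions (class ids), e.g. an argmax over logits.
predicted_ids = [6, 11, 12, 6, 5, 3]
predicted_labels = [id2label[i] for i in predicted_ids]
print(predicted_labels)
# ['O', 'B-Vendor Name', 'I-Vendor Name', 'O', 'B-Due Date', 'I-Due Date']
```

With the `transformers` library, mappings like these are typically passed as `id2label` and `label2id` when loading a token-classification model, so that they are stored in the model config.
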
## Citation

    @article{Xu2020LayoutLMv2,
      title={LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding},
      author={Yang Xu and Yiheng Xu and Tengchao Lv and Lei Cui and Furu Wei and Guoxin Wang and Yijuan Lu and Dinei Florencio and Cha Zhang and Wanxiang Che and Min Zhang and Lidong Zhou},
      journal={ArXiv},
      year={2020},
      volume={abs/2012.14740}
    }