TomasFAV committed · verified
Commit 7c50501 · Parent(s): 1186bda

Update README.md

Files changed (1): README.md (+93 −20)
@@ -4,40 +4,110 @@ license: cc-by-nc-sa-4.0
  base_model: microsoft/layoutlmv3-base
  tags:
  - generated_from_trainer
  metrics:
  - precision
  - recall
  - f1
  - accuracy
  model-index:
- - name: Layoutlmv3InvoiceCzech
  results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

- # Layoutlmv3InvoiceCzech

- This model is a fine-tuned version of [microsoft/layoutlmv3-base](https://huggingface.co/microsoft/layoutlmv3-base) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.2146
- - Precision: 0.5354
- - Recall: 0.7428
- - F1: 0.6223
- - Accuracy: 0.9583

  ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

  ## Training procedure

@@ -54,6 +124,8 @@ The following hyperparameters were used during training:
  - num_epochs: 10
  - mixed_precision_training: Native AMP

  ### Training results

  | Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
@@ -69,10 +141,11 @@ The following hyperparameters were used during training:
  | 0.0360 | 9.0 | 1350 | 0.2141 | 0.5268 | 0.7327 | 0.6129 | 0.9578 |
  | 0.0147 | 10.0 | 1500 | 0.2131 | 0.5393 | 0.7310 | 0.6207 | 0.9597 |

- ### Framework versions

- - Transformers 5.0.0
- - Pytorch 2.10.0+cu128
- - Datasets 4.0.0
- - Tokenizers 0.22.2
 
  base_model: microsoft/layoutlmv3-base
  tags:
  - generated_from_trainer
+ - invoice-processing
+ - information-extraction
+ - czech-language
+ - document-ai
+ - layout-aware-model
+ - multimodal-model
+ - synthetic-data
  metrics:
  - precision
  - recall
  - f1
  - accuracy
  model-index:
+ - name: LayoutLMv3InvoiceCzech-V0
  results: []
  ---

+ # LayoutLMv3InvoiceCzech (V0: Synthetic Templates Only)
 

+ This model is a fine-tuned version of [microsoft/layoutlmv3-base](https://huggingface.co/microsoft/layoutlmv3-base) for structured information extraction from Czech invoices.

  It achieves the following results on the evaluation set:
+ - Loss: 0.2146
+ - Precision: 0.5354
+ - Recall: 0.7428
+ - F1: 0.6223
+ - Accuracy: 0.9583
+
+ ---

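The gap between high accuracy (0.958) and lower precision (0.535) is typical for token classification on documents: accuracy counts every token, and most invoice tokens carry the "O" (outside) label, while precision, recall, and F1 are usually computed over whole predicted entity spans. A minimal toy sketch of the two kinds of metric, using hypothetical BIO tags rather than this model's actual label set:

```python
# Toy illustration: token accuracy vs. span-level precision/recall.
# The BIO tag names here are illustrative, not the model's real tag set.

def token_accuracy(true, pred):
    """Fraction of tokens whose predicted tag matches the gold tag."""
    return sum(t == p for t, p in zip(true, pred)) / len(true)

def entities(tags):
    """Collect (label, start, end) spans from a BIO tag sequence."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel flushes the last span
        if tag.startswith("B-") or tag == "O":
            if label is not None:
                spans.append((label, start, i))
                label = None
            if tag.startswith("B-"):
                start, label = i, tag[2:]
    return set(spans)

true = ["O"] * 16 + ["B-TOTAL", "I-TOTAL", "O", "B-DATE"]
pred = ["O"] * 16 + ["B-TOTAL", "O", "O", "B-DATE"]

true_e, pred_e = entities(true), entities(pred)
tp = len(true_e & pred_e)          # spans predicted exactly right
precision = tp / len(pred_e)
recall = tp / len(true_e)
```

Here a single boundary error on a two-token span leaves token accuracy at 0.95 but halves span-level precision and recall, mirroring the pattern in the table above.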
  ## Model description

+ LayoutLMv3InvoiceCzech (V0) is a multimodal document understanding model that leverages:
+
+ - textual information
+ - spatial layout (bounding boxes)
+ - visual features (image embeddings)
+
+ The model performs token-level classification to extract structured invoice fields:
+ - supplier
+ - customer
+ - invoice number
+ - bank details
+ - totals
+ - dates
+
+ This version is trained exclusively on synthetically generated invoice templates.
+
+ ---
+
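Token-level classification only assigns a tag per token; turning those tags into usable field values needs a small decoding step. A minimal sketch, assuming BIO-style tags (the field names below are illustrative, not the model's real label set):

```python
# Group per-token BIO predictions into field strings.
# Tag names (INV_NUM, SUPPLIER, TOTAL, ...) are hypothetical examples.

def group_fields(words, tags):
    """Merge consecutive B-X / I-X tokens into one value per field occurrence."""
    fields = []
    for word, tag in zip(words, tags):
        if tag.startswith("B-"):
            fields.append((tag[2:], [word]))        # start a new field span
        elif tag.startswith("I-") and fields and fields[-1][0] == tag[2:]:
            fields[-1][1].append(word)              # continue the current span
    return [(label, " ".join(tokens)) for label, tokens in fields]

words = ["Faktura", "č.", "2024001", "ABC", "s.r.o.", "celkem", "12", "500", "Kč"]
tags  = ["O", "O", "B-INV_NUM", "B-SUPPLIER", "I-SUPPLIER", "O", "B-TOTAL", "I-TOTAL", "O"]

print(group_fields(words, tags))
# → [('INV_NUM', '2024001'), ('SUPPLIER', 'ABC s.r.o.'), ('TOTAL', '12 500')]
```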
+ ## Training data
+
+ The dataset consists of:
+
+ - synthetically generated invoices
+ - fixed template layouts
+ - corresponding bounding boxes
+ - rendered document images
+
+ Key properties:
+ - consistent structure across samples
+ - clean and noise-free data
+ - perfect alignment between text, layout, and image
+ - no real-world documents
+
+ This represents the **baseline dataset** for multimodal document models.
+
+ ---
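Because the synthetic renderings come with exact bounding boxes, preparing them for the model is mainly a scaling step: the LayoutLM family expects word boxes normalized to a 0–1000 coordinate space. A small sketch of that step (the page size and box values are illustrative):

```python
# LayoutLMv3 expects word bounding boxes in a 0-1000 normalized coordinate space.
# Sketch of the normalization step for renderings with a known page size.

def normalize_bbox(bbox, page_width, page_height):
    """Scale an (x0, y0, x1, y1) pixel box to the 0-1000 range used by LayoutLMv3."""
    x0, y0, x1, y1 = bbox
    return [
        int(1000 * x0 / page_width),
        int(1000 * y0 / page_height),
        int(1000 * x1 / page_width),
        int(1000 * y1 / page_height),
    ]

# A 595x842 pt page (A4 at 72 dpi); the box coordinates are made up.
box = normalize_bbox((59, 84, 119, 101), 595, 842)
print(box)  # → [99, 99, 200, 119]
```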
 
+ ## Role in the pipeline
+
+ This model corresponds to:
+
+ **V0: synthetic, template-based dataset only**
+
+ It is used to:
+ - establish a baseline for multimodal models
+ - compare against:
+   - text-only models (BERT)
+   - layout-aware models without vision (LiLT)
+ - evaluate the contribution of visual features in a controlled setting
+
+ ---
+
+ ## Intended uses
+
+ - Research in multimodal document understanding
+ - Benchmarking LayoutLMv3 on structured documents
+ - Comparison with other architectures (BERT, LiLT, etc.)
+ - Czech invoice information extraction
+
+ ---
+
+ ## Limitations
+
+ - Trained only on synthetic data with fixed layouts
+ - Limited generalization to real-world invoices
+ - Visual features are learned from clean synthetic renderings
+ - No exposure to:
+   - OCR errors
+   - scanning artifacts
+   - real-world noise
+
+ ---
 
  ## Training procedure

  …

  - num_epochs: 10
  - mixed_precision_training: Native AMP

+ ---
+
  ### Training results

  | Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
 
  …
  | 0.0360 | 9.0 | 1350 | 0.2141 | 0.5268 | 0.7327 | 0.6129 | 0.9578 |
  | 0.0147 | 10.0 | 1500 | 0.2131 | 0.5393 | 0.7310 | 0.6207 | 0.9597 |

+ ---
+
+ ## Framework versions
+
+ - Transformers 5.0.0
+ - PyTorch 2.10.0+cu128
+ - Datasets 4.0.0
+ - Tokenizers 0.22.2