---
library_name: transformers
license: cc-by-nc-sa-4.0
base_model: microsoft/layoutlmv3-base
tags:
- generated_from_trainer
- invoice-processing
- information-extraction
- czech-language
- document-ai
- layout-aware-model
- multimodal-model
- synthetic-data
- hybrid-data
metrics:
- precision
- recall
- f1
- accuracy
model-index:
- name: LayoutLMv3InvoiceCzech-V2
  results: []
---

# LayoutLMv3InvoiceCzech (V2 – Synthetic + Random Layout + Real Layout Injection)

This model is a fine-tuned version of [microsoft/layoutlmv3-base](https://huggingface.co/microsoft/layoutlmv3-base) for structured information extraction from Czech invoices.

It achieves the following results on the evaluation set (these match the epoch 7 row of the training results table, which has the highest F1):
- Loss: 0.0763  
- Precision: 0.8009  
- Recall: 0.8849  
- F1: 0.8408  
- Accuracy: 0.9844  
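
As a quick consistency check, the reported F1 can be recomputed as the harmonic mean of the precision and recall listed above. This is only a minimal sketch: the helper name `f1_from_pr` is illustrative and not part of the training code.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for the evaluation set.
f1 = f1_from_pr(0.8009, 0.8849)
print(round(f1, 4))  # 0.8408
```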

---

## Model description

LayoutLMv3InvoiceCzech (V2) is a multimodal document understanding model that combines:

- textual features  
- spatial layout (bounding boxes)  
- visual features (image embeddings)  
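
For the spatial input, LayoutLM-family models typically expect word boxes as `(x0, y0, x1, y1)` integers scaled to a 0-1000 range relative to the page size. The sketch below shows that scaling under this assumption; the helper name `normalize_box` and the page dimensions are illustrative and not taken from this model's preprocessing code.

```python
def normalize_box(box, page_width, page_height):
    """Scale a pixel-space (x0, y0, x1, y1) box to the 0-1000 integer range."""
    x0, y0, x1, y1 = box
    return [
        int(1000 * x0 / page_width),
        int(1000 * y0 / page_height),
        int(1000 * x1 / page_width),
        int(1000 * y1 / page_height),
    ]


# Hypothetical word box on a page rendered at 1240 x 1754 pixels.
normalized = normalize_box((124, 175, 620, 210), 1240, 1754)
print(normalized)  # [100, 99, 500, 119]
```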

The model performs token-level classification to extract structured invoice fields:
- supplier  
- customer  
- invoice number  
- bank details  
- totals  
- dates  
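
Token-level predictions still have to be merged into field values. The sketch below assumes a BIO tagging scheme and uses hypothetical label names (`INVOICE_NUMBER`, `SUPPLIER`); the actual label set of this model is not listed in this card, so check the `id2label` mapping in the model config before relying on specific names.

```python
def group_bio(tokens, tags):
    """Merge token-level BIO tags into a list of (field, text) pairs."""
    fields, current_label, current_tokens = [], None, []
    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "B" or (prefix == "I" and label != current_label):
            # A new span starts here; flush the previous one if any.
            if current_tokens:
                fields.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = label, [token]
        elif prefix == "I":
            # Continuation of the current span.
            current_tokens.append(token)
        else:
            # "O" tag: close any open span.
            if current_tokens:
                fields.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = None, []
    if current_tokens:
        fields.append((current_label, " ".join(current_tokens)))
    return fields


# Hypothetical tokens and predicted tags for one invoice line.
tokens = ["Faktura", "č.", "2024001", "Dodavatel:", "ACME", "s.r.o."]
tags = ["O", "O", "B-INVOICE_NUMBER", "O", "B-SUPPLIER", "I-SUPPLIER"]
extracted = group_bio(tokens, tags)
print(extracted)
```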

This version introduces **real layout injection**, which is intended to make the training data more realistic and to improve generalization to real invoices.

---

## Training data

The dataset consists of three components:

1. **Synthetic template-based invoices**  
2. **Synthetic invoices with randomized layouts**  
3. **Hybrid invoices with real layouts and synthetic content**  

### Real layout injection

In the hybrid dataset:
- real invoice layouts are used as templates  
- original text content is replaced with synthetic data  
- new content is rendered into authentic document structures  

This preserves:
- real-world spatial distributions  
- visual patterns and formatting  
- document complexity  

while maintaining:
- full annotation control  
- consistent labels  
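
The idea can be sketched as follows: field slots and their boxes come from a real invoice layout, while the text in each slot is generated synthetically, so the label of every slot is known by construction. Everything in this example (the `layout` structure, the `GENERATORS` table, and the `inject` function) is a hypothetical illustration, not the actual data generation code, and rendering the text into a page image is omitted.

```python
import random

# Hypothetical layout template: field slots with pixel boxes from a real invoice.
layout = [
    {"field": "supplier", "box": (60, 80, 420, 110)},
    {"field": "invoice_number", "box": (700, 80, 900, 110)},
    {"field": "total", "box": (700, 1500, 900, 1530)},
]

# Hypothetical synthetic value generators, one per field.
GENERATORS = {
    "supplier": lambda rng: rng.choice(["Alfa s.r.o.", "Beta a.s."]),
    "invoice_number": lambda rng: f"{rng.randint(2020, 2025)}{rng.randint(1, 9999):04d}",
    "total": lambda rng: f"{rng.randint(100, 99999)},00 Kč",
}


def inject(layout, rng):
    """Fill each real layout slot with synthetic text; the label comes from the slot."""
    return [
        {
            "text": GENERATORS[slot["field"]](rng),
            "box": slot["box"],
            "label": slot["field"],
        }
        for slot in layout
    ]


# A fixed seed keeps the generated sample reproducible.
sample = inject(layout, random.Random(42))
for item in sample:
    print(item["label"], item["box"], item["text"])
```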

---

## Role in the pipeline

This model corresponds to:

**V2 – Synthetic + layout augmentation + real layout injection**

It is used to:
- bridge the gap between synthetic and real-world data  
- evaluate the impact of realistic layouts on multimodal models  
- compare with:
  - V0–V1 (fully synthetic)  
  - V3 (real data fine-tuning)  

---

## Intended uses

- Advanced multimodal document AI  
- Invoice information extraction with visual + spatial features  
- Evaluation of hybrid data strategies  
- Benchmarking LayoutLMv3  

---

## Limitations

- Text content remains synthetic  
- Limited exposure to real linguistic variability  
- OCR noise and scanning artifacts are not fully represented  
- May struggle with rare real-world edge cases  

---

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 1
- seed: 42
- optimizer: AdamW (fused PyTorch implementation, `OptimizerNames.ADAMW_TORCH_FUSED`) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 0.1 (a fractional value; likely interpreted as a warmup ratio, i.e. about 10% of the training steps)
- num_epochs: 10
- mixed_precision_training: Native AMP
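
To illustrate the learning-rate settings, the sketch below computes a linear schedule with warmup in plain Python. It assumes that the warmup value of 0.1 is treated as a fraction of the 1150 total steps shown in the training results (115 warmup steps); that interpretation is an assumption, and `linear_lr` is an illustrative helper rather than the scheduler used in training.

```python
def linear_lr(step, base_lr=1e-5, total_steps=1150, warmup_steps=115):
    """Learning rate at a given step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, remaining)


# Learning rate at the start, at the end of warmup, mid-training, and at the end.
for step in (0, 115, 575, 1150):
    print(step, linear_lr(step))
```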

---

### Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1     | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
| No log        | 1.0   | 115  | 0.0725          | 0.7496    | 0.8257 | 0.7858 | 0.9807   |
| No log        | 2.0   | 230  | 0.0701          | 0.7569    | 0.8376 | 0.7952 | 0.9822   |
| No log        | 3.0   | 345  | 0.0735          | 0.7587    | 0.8883 | 0.8184 | 0.9810   |
| No log        | 4.0   | 460  | 0.0743          | 0.7827    | 0.8714 | 0.8247 | 0.9826   |
| 0.0606        | 5.0   | 575  | 0.0783          | 0.7756    | 0.8714 | 0.8207 | 0.9821   |
| 0.0606        | 6.0   | 690  | 0.0811          | 0.7561    | 0.8968 | 0.8204 | 0.9814   |
| 0.0606        | 7.0   | 805  | 0.0763          | 0.8009    | 0.8849 | 0.8408 | 0.9844   |
| 0.0606        | 8.0   | 920  | 0.0826          | 0.7784    | 0.9036 | 0.8363 | 0.9835   |
| 0.0201        | 9.0   | 1035 | 0.0824          | 0.7837    | 0.8951 | 0.8357 | 0.9836   |
| 0.0201        | 10.0  | 1150 | 0.0852          | 0.7818    | 0.9036 | 0.8383 | 0.9834   |
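
The table can be scanned programmatically to see which epoch the headline metrics come from. The short example below copies the epoch, validation loss, and F1 columns from the table; it shows that F1 peaks at epoch 7 while validation loss is lowest at epoch 2. The variable names are illustrative.

```python
# (epoch, validation_loss, f1) copied from the training results table.
results = [
    (1.0, 0.0725, 0.7858),
    (2.0, 0.0701, 0.7952),
    (3.0, 0.0735, 0.8184),
    (4.0, 0.0743, 0.8247),
    (5.0, 0.0783, 0.8207),
    (6.0, 0.0811, 0.8204),
    (7.0, 0.0763, 0.8408),
    (8.0, 0.0826, 0.8363),
    (9.0, 0.0824, 0.8357),
    (10.0, 0.0852, 0.8383),
]

best_by_f1 = max(results, key=lambda row: row[2])
best_by_loss = min(results, key=lambda row: row[1])
print("best F1:", best_by_f1)
print("lowest validation loss:", best_by_loss)
```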

---

## Framework versions

- Transformers 5.0.0  
- PyTorch 2.10.0+cu128  
- Datasets 4.0.0  
- Tokenizers 0.22.2