---
language:
- en
license: cc-by-nc-sa-4.0
library_name: transformers
tags:
- finance
metrics:
- accuracy
base_model: microsoft/layoutlmv3-base
---
## Model
This model is a fine-tuned version of [microsoft/layoutlmv3-base](https://huggingface.co/microsoft/layoutlmv3-base), trained on the [Financial Documents Clustering Kaggle dataset](https://www.kaggle.com/datasets/drcrabkg/financial-statements-clustering).
It classifies each document image into one of the following five classes:
- Income Statements
- Balance Sheets
- Cash Flows
- Notes
- Others
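
As a rough illustration of the classification step, the sketch below turns a vector of five logits into a class name and a probability. The index order in `ID2LABEL` is an assumption that simply mirrors the list above; this card does not state the checkpoint's actual mapping, and the logits are made up for the example.

```python
import math

# ASSUMPTION: index order mirrors the class list in this card; the real
# id-to-label mapping of the checkpoint is not stated here.
ID2LABEL = {
    0: "Income Statements",
    1: "Balance Sheets",
    2: "Cash Flows",
    3: "Notes",
    4: "Others",
}


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits, id2label=ID2LABEL):
    """Return (label, probability) for the highest-scoring class."""
    if len(logits) != len(id2label):
        raise ValueError("expected one logit per class")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Made-up logits for illustration only, not real model output.
label, prob = predict_label([0.2, 3.1, -1.0, 0.5, 0.0])
print(label, round(prob, 3))
```

In practice it is safer to read the mapping from the loaded checkpoint's `config.id2label` than to hard-code it as done here.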
## Training
The words and bounding boxes used to train this model come from [EasyOCR](https://github.com/JaidedAI/EasyOCR) rather than Tesseract, the OCR engine that the LayoutLMv3 processor uses by default.
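
Because the OCR comes from EasyOCR, its output has to be converted into the word and box lists that LayoutLMv3 expects, with boxes given as `[x0, y0, x1, y1]` on a 0-1000 scale. The following is a minimal sketch of one way to do that, not necessarily the preprocessing used for this checkpoint. The helper names `to_layoutlm_box` and `easyocr_to_layoutlm` are hypothetical, the sample data is hand-written, and each EasyOCR text region is treated as a single "word" for simplicity.

```python
def to_layoutlm_box(corners, width, height):
    """Turn four (x, y) corner points into an [x0, y0, x1, y1] box scaled to 0-1000."""
    xs = [point[0] for point in corners]
    ys = [point[1] for point in corners]

    def scale(value, size):
        # Clamp so boxes touching the image border stay inside the 0-1000 range.
        return max(0, min(1000, int(1000 * value / size)))

    return [scale(min(xs), width), scale(min(ys), height),
            scale(max(xs), width), scale(max(ys), height)]


def easyocr_to_layoutlm(results, width, height):
    """Split EasyOCR results into parallel word and box lists, skipping empty text."""
    words, boxes = [], []
    for corners, text, _confidence in results:
        text = text.strip()
        if not text:
            continue
        words.append(text)
        boxes.append(to_layoutlm_box(corners, width, height))
    return words, boxes


# Hand-written sample in the shape returned by easyocr.Reader(...).readtext(...):
# (four corner points, text, confidence). Not real OCR output.
sample = [
    ([[100, 50], [300, 50], [300, 90], [100, 90]], "Total assets", 0.98),
    ([[700, 50], [800, 50], [800, 90], [700, 90]], "1,250", 0.95),
]
words, boxes = easyocr_to_layoutlm(sample, width=1000, height=500)
print(words)
print(boxes)
```

The resulting lists can then be passed to a `LayoutLMv3Processor` created with `apply_ocr=False`, for example `processor(image, words, boxes=boxes, return_tensors="pt")`, so that the built-in Tesseract step is skipped.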
## Libraries
- transformers 4.25.1
- pytorch-lightning 1.8.6
- torchmetrics 0.11.0
- easyocr 1.6.2