Awarebeyond committed
Commit d92d333 · verified · 1 Parent(s): 292a9ba

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +15 -2
README.md CHANGED
@@ -8,19 +8,32 @@ tags:
  - receipt-extraction
  pipeline_tag: image-to-text
  widget:
- - src: https://huggingface.co/datasets/mishig/sample_images/resolve/main/receipt.jpg
- example_title: Sample Receipt
+ - src: https://huggingface.co/datasets/mishig/sample_images/resolve/main/receipt.jpg
+ example_title: Sample Receipt
  ---

  # Receipt Donut (Fine-tuned Document UI)

  This model extracts structured JSON data directly from receipt images without needing a separate OCR engine. Fine-tuned on the `naver-clova-ix/donut-base-finetuned-cord-v2` base model.

+ ## Training Performance
+ The model was trained for 11 epochs on an NVIDIA L4 GPU. Optimal convergence was reached at Epoch 9.
+
+ ![Learning Curve](learning_curve.png)
+
+ ## Sample Extraction Results
+ Below are some examples of the model performing extraction on the validation set (Original Image vs. Model Output).
+
+ ![Sample 1](hub_assets/sample_result_0.png)
+ ![Sample 2](hub_assets/sample_result_1.png)
+ ![Sample 3](hub_assets/sample_result_2.png)
+
  ## Model Details
  - **Architecture:** Donut (Document Understanding Transformer)
  - **Task:** Image-to-JSON extraction
  - **Extracted Fields:** `merchant`, `date`, `subtotal`, `tax`, `total`, `address`
  - **Training Data:** 8,615 heavily augmented receipt images sourced from 8 diverse public datasets (CORD, WildReceipts, SROIE variants, etc.)
+ - **License:** MIT

  ## Try it out!
  Use the **Hosted Inference API** widget on the right.
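
The README lists six extracted fields but does not show what a caller does with them. The sketch below is one hypothetical way to post-process a decoded output string. It assumes, as is common for Donut fine-tunes but not confirmed by this model card, that each field is wrapped in `<s_NAME>...</s_NAME>` tags. The helper names (`parse_tagged_output`, `to_amount`, `totals_consistent`) are illustrative and not part of the model or of any library. It uses only the standard library.

```python
import re
from decimal import Decimal, InvalidOperation

# Field names taken from the "Extracted Fields" line of the README.
FIELDS = ("merchant", "date", "subtotal", "tax", "total", "address")
MONEY_FIELDS = ("subtotal", "tax", "total")


def parse_tagged_output(sequence):
    """Turn '<s_total>12.50</s_total>'-style text into a flat dict.

    Assumption: every field is wrapped in <s_NAME>...</s_NAME> tags.
    Fields that do not appear in the sequence map to None.
    """
    result = {}
    for name in FIELDS:
        match = re.search(rf"<s_{name}>(.*?)</s_{name}>", sequence, re.DOTALL)
        result[name] = match.group(1).strip() if match else None
    return result


def to_amount(text):
    """Parse a money string such as '$1,234.50' into a Decimal, or None."""
    if text is None:
        return None
    # Drop currency symbols, thousands separators, and whitespace.
    cleaned = re.sub(r"[^0-9.\-]", "", text)
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None


def totals_consistent(record, tolerance=Decimal("0.01")):
    """Check subtotal + tax against total.

    Returns True or False when all three amounts parse, and None when
    any of them is missing or unreadable.
    """
    amounts = [to_amount(record.get(key)) for key in MONEY_FIELDS]
    if any(amount is None for amount in amounts):
        return None
    subtotal, tax, total = amounts
    return abs(subtotal + tax - total) <= tolerance
```

A check like `totals_consistent` is a cheap way to flag extractions that deserve manual review. For nested Donut outputs, the `token2json` helper on `DonutProcessor` in `transformers` is the more general route.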