updated model to extract bank_name and cheque_date
app.py
CHANGED
@@ -12,18 +12,21 @@ demo = gr.Blocks()
 with demo:
 
 gr.Markdown("# **<p align='center'>ChequeEasy: Banking with Transformers </p>**")
-gr.Markdown("ChequeEasy is a project that aims to simplify the process of approval of cheques
-This project leverages Donut model proposed in
-"Donut is based on a very simple transformer encoder and decoder architecture. It's main USP is that it is an OCR-free approach to information extraction
-OCR based techniques come with several limitations such as use of additional downstream models, lack of understanding about document structure, use of hand crafted rules,etc. \
-Donut helps you get rid of all of these OCR specific limitations. The model for the project has been trained using
+gr.Markdown("ChequeEasy is a project that aims to simplify the process of approval of cheques and making it easier for both bank officials and customers. \
+This project leverages Donut model proposed in the paper <a href=\"https://arxiv.org/abs/2111.15664/\"> OCR-free Document Understanding Transformer </a> for the parsing of the required data from cheques." \
+"Donut is based on a very simple transformer encoder and decoder architecture. It's main USP is that it is an OCR-free approach to Visual Document Understanding (VDU) and can perform tasks like document classification, information extraction as well as VQA. \
+OCR based techniques come with several limitations such as requiring use of additional downstream models, lack of understanding about document structure, requiring use of hand crafted rules for information extraction,etc. \
+Donut helps you get rid of all of these OCR specific limitations. The model for the project has been trained using a subset of this <a href=\"https://www.kaggle.com/datasets/medali1992/cheque-images/\"> kaggle dataset </a>. The original dataset contains images of cheques of 10 different banks. \
+A filtered version of this dataset containing images of cheques from 4 banks that are more commonly found in the Indian Banking Sector was created with ground truth prepared in the format required for fine-tuning Donut. This <a href=\"https://huggingface.co/datasets/shivi/cheques_sample_data/\"> dataset </a> is available on the Hugging Face Hub for download.")
 
 
 with gr.Tabs():
 
 with gr.TabItem("Cheque Parser"):
-gr.Markdown("This module is used to extract details filled by a bank customer from cheques. At present the model is trained to extract details like -
-This model can be further trained to parse additional details like
+gr.Markdown("This module is used to extract details filled by a bank customer from cheques. At present the model is trained to extract details like - Payee Name, Amount in words, Amount in Figures, Bank Name and Cheque Date. \
+This model can be further trained to parse additional details like MICR Code, Cheque Number, Account Number, etc. \
+Additionally, the app compares if the extracted legal & courtesy amount are matching which is an important check done during approval process of cheques. \
+It also checks if the cheque is stale. A cheque is considered stale if it is presented to the bank 3 months after the date mentioned on the cheque.")
 with gr.Box():
 gr.Markdown("**Upload Cheque**")
 input_image_parse = gr.Image(type='filepath', label="Input Cheque")
@@ -34,6 +37,7 @@ with demo:
 amt_in_words = gr.Textbox(label="Courtesy Amount")
 amt_in_figures = gr.Textbox(label="Legal Amount")
 cheque_date = gr.Textbox(label="Cheque Date")
+bank_name = gr.Textbox(label="Bank Name")
 
 amts_matching = gr.Checkbox(label="Legal & Courtesy Amount Matching")
 stale_check = gr.Checkbox(label="Stale Cheque")
@@ -48,8 +52,8 @@ with demo:
 [payee_name,amt_in_words,amt_in_figures,cheque_date],parse_cheque_with_donut,cache_examples=False)
 
 
-parse_cheque.click(parse_cheque_with_donut, inputs=input_image_parse, outputs=[payee_name,amt_in_words,amt_in_figures,cheque_date,amts_matching,stale_check])
+parse_cheque.click(parse_cheque_with_donut, inputs=input_image_parse, outputs=[payee_name,amt_in_words,amt_in_figures,bank_name,cheque_date,amts_matching,stale_check])
 
-gr.Markdown('\n Solution built by: <a href=\"https://
+gr.Markdown('\n Solution built by: <a href=\"https://twitter.com/singhshiviii/\">Shivalika Singh</a>')
 
-demo.launch()
+demo.launch()
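The Cheque Parser tab text in this diff describes two checks: whether the amount in words agrees with the amount in figures, and whether the cheque is stale (presented more than 3 months after its date). The diff does not show how `parse_cheque_with_donut` computes either one, so the following is only a minimal sketch of what such checks could look like. Every function name here is hypothetical, the word parser handles only whole-rupee amounts with `hundred`, `thousand`, `lakh` and `crore`, and the dates are assumed to be already parsed into `datetime.date` objects.

```python
import calendar
import re
from datetime import date

# All names below are hypothetical and are not taken from app.py.
_SMALL = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19, "twenty": 20, "thirty": 30, "forty": 40,
    "fifty": 50, "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}
_SCALES = {"thousand": 1_000, "lakh": 100_000, "crore": 10_000_000}
_FILLER = {"and", "only", "rupee"}


def words_to_number(text: str) -> int:
    """Convert an amount written in words to an integer (whole rupees only)."""
    total, current = 0, 0
    for raw in re.findall(r"[a-z]+", text.lower()):
        token = raw.rstrip("s")  # treat "lakhs", "rupees" like the singular
        if token in _FILLER:
            continue
        if token in _SMALL:
            current += _SMALL[token]
        elif token == "hundred":
            current = max(current, 1) * 100
        elif token in _SCALES:
            total += max(current, 1) * _SCALES[token]
            current = 0
        else:
            raise ValueError(f"unrecognised amount word: {raw!r}")
    return total + current


def figures_to_number(text: str) -> float:
    """Pull the first number out of a string such as 'Rs. 3,750/-'."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no digits found in {text!r}")
    return float(match.group().replace(",", ""))


def amounts_match(amt_in_words: str, amt_in_figures: str) -> bool:
    """True when the amount in words equals the amount in figures."""
    return words_to_number(amt_in_words) == figures_to_number(amt_in_figures)


def add_months(d: date, months: int) -> date:
    """Shift a date forward by whole calendar months, clamping the day."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def is_stale(cheque_date: date, presented_on: date, validity_months: int = 3) -> bool:
    """True when the cheque is presented after the validity window has passed."""
    return presented_on > add_months(cheque_date, validity_months)
```

In this sketch a cheque presented exactly three calendar months after its date is still treated as valid; whether the app draws the boundary the same way is not visible in the diff.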