title: Invoice Extractor
emoji: ๐Ÿ‘
colorFrom: yellow
colorTo: indigo
sdk: streamlit
sdk_version: 1.41.1
app_file: app.py
pinned: false

🚀 Introducing IINDRA OCR: An AI-Powered Document Data Extraction Tool Exclusively Built for Indian Invoices 🇮🇳

We are excited to announce the launch of IINDRA OCR, an advanced AI solution designed to revolutionize the way businesses extract data from invoices. This project is now publicly available on Hugging Face Spaces, providing easy access for research and development purposes.

๐Ÿ” The Problem with Existing Solutions: Current document data extraction tools struggle with the diverse and inconsistent layouts of invoices from different companies. As a result, data extraction is inaccurate and inefficient, especially when trying to convert these documents into structured formats.

💡 Use Case: From small-scale industries to large retailers, businesses currently bear the burden of manually extracting data from invoices and entering it into billing software and GST filings. This manual process is time-consuming and prone to errors.

With IINDRA OCR, businesses can simply upload invoice images and download structured data in popular formats like CSV, HTML, or JSON, saving time and reducing errors.
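As a rough illustration of the export step described above, here is a minimal sketch that serializes already-extracted line items to JSON, CSV, and HTML using only the Python standard library. The `rows` data, the field names, and the helper names `to_json`, `to_csv`, and `to_html` are hypothetical and are not taken from the IINDRA OCR code base.

```python
import csv
import html
import io
import json

# Hypothetical extracted line items; the field names are illustrative only.
rows = [
    {"description": "Steel rods", "hsn": "7214", "qty": 10, "amount": 4500.0},
    {"description": "Cement <50kg>", "hsn": "2523", "qty": 4, "amount": 1520.0},
]


def to_json(rows):
    """Serialize the rows as a pretty-printed JSON array."""
    return json.dumps(rows, indent=2)


def to_csv(rows):
    """Serialize the rows as CSV, using the keys of the first row as the header."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_html(rows):
    """Serialize the rows as a bare HTML table, escaping every header and cell value."""
    headers = list(rows[0].keys())
    head = "".join(f"<th>{html.escape(h)}</th>" for h in headers)
    body = "".join(
        "<tr>" + "".join(f"<td>{html.escape(str(r[h]))}</td>" for h in headers) + "</tr>"
        for r in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


print(to_csv(rows))
```

Escaping cell values in the HTML export matters for invoices, since item descriptions can contain characters such as `<` or `&`.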

🔑 Key Advantages:

- Handles Any Layout: Whether structured or unstructured, IINDRA OCR efficiently processes all types of invoice layouts.
- Very Fast Inference: Optimized for rapid processing and low latency.
- CPU-Optimized: Designed to run seamlessly on CPU, no need for expensive hardware.

⚙️ Tech Stack:

- Detection Model: Trained on over 200,000 invoice images to ensure accuracy and reliability.
- Vision Transformer: Used for advanced table structure recognition, making data extraction even more precise.
- Open-source TrOCR: Extracts text from images with high accuracy.
- Training Code: Developed using PyTorch for efficient training and performance.
- Programming Language: Python
- Deployment & Hosting: Hosted on Hugging Face for easy access and deployment.

🔗 Try IINDRA OCR Today! Take advantage of this open-source tool for your research or business use. Visit [Hugging Face Space Link] to explore and integrate the solution into your workflow.
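The components listed under Tech Stack suggest a three-stage flow: detect tables, recognize their cell structure, then read the text in each cell. The sketch below shows one way such stages could be chained. It is an assumption about the overall shape, not the actual IINDRA OCR implementation: `Cell`, `extract_table`, and the stub stages are hypothetical names, and the stubs stand in for the real models so the example runs without any weights.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass
class Cell:
    row: int
    col: int
    box: Box


def extract_table(
    image: object,
    detect_tables: Callable[[object], List[Box]],
    recognize_structure: Callable[[object, Box], List[Cell]],
    read_text: Callable[[object, Box], str],
) -> List[List[List[str]]]:
    """Return one grid of cell strings per detected table."""
    tables = []
    for table_box in detect_tables(image):
        cells = recognize_structure(image, table_box)
        if not cells:
            continue  # skip tables where no cells were recognized
        n_rows = max(c.row for c in cells) + 1
        n_cols = max(c.col for c in cells) + 1
        grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
        for cell in cells:
            grid[cell.row][cell.col] = read_text(image, cell.box)
        tables.append(grid)
    return tables


# Stub stages (hypothetical) standing in for the detector, the structure
# recognizer, and the OCR model.
def fake_detect(image):
    return [(0, 0, 200, 100)]


def fake_structure(image, table_box):
    return [
        Cell(0, 0, (0, 0, 100, 50)),
        Cell(0, 1, (100, 0, 200, 50)),
        Cell(1, 0, (0, 50, 100, 100)),
        Cell(1, 1, (100, 50, 200, 100)),
    ]


FAKE_TEXT = {
    (0, 0, 100, 50): "Item",
    (100, 0, 200, 50): "Amount",
    (0, 50, 100, 100): "Pen",
    (100, 50, 200, 100): "20",
}


def fake_read(image, box):
    return FAKE_TEXT[box]


tables = extract_table(None, fake_detect, fake_structure, fake_read)
print(tables)
```

Passing the stages in as callables keeps the flow testable on its own; a real detector, structure model, or TrOCR wrapper could be substituted without changing `extract_table`.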

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference