Spaces:
Sleeping
Sleeping
| title: Invoice Processor | |
| emoji: 🧾 | |
| colorFrom: blue | |
| colorTo: indigo | |
| sdk: docker | |
| app_port: 8501 | |
| pinned: false | |
| short_description: AI-powered PDF invoice data extraction and dashboard | |
| # Invoice Processor — AI-Powered PDF Data Extraction | |
| An intelligent invoice processing tool that extracts structured data from PDF invoices using **Claude AI (Anthropic)** and presents the results in an interactive dashboard. Upload one or multiple invoices and instantly get a clean breakdown of vendors, amounts, dates, taxes, and line items — no manual data entry required. | |
| ## What it does | |
| The application reads raw PDF invoices and uses a large language model to understand their content regardless of format or layout. It identifies and extracts the key fields from each invoice, normalizes them into a consistent structure, and displays everything in an interactive Streamlit dashboard with charts and export options. | |
| **Extracted fields include:** | |
| - Vendor name and contact details | |
| - Invoice number and date | |
| - Line items with descriptions, quantities, and unit prices | |
| - Subtotal, taxes (VAT/IVA), and total amount | |
| - Payment terms and due date | |
| ## Tech stack | |
| | Layer | Technology | | |
| |---|---| | |
| | UI | Streamlit | | |
| | AI extraction | Claude (Anthropic API) | | |
| | PDF parsing | pypdf | | |
| | Data visualization | Plotly | | |
| | Containerization | Docker | | |
| ## How to use | |
| 1. Upload one or more PDF invoices using the file uploader in the sidebar. | |
| 2. Click **Process** to run the AI extraction pipeline. | |
| 3. Explore the dashboard — view per-invoice details, compare vendors, and analyze spending trends. | |
| 4. Export the extracted data as CSV for further analysis. | |
| ## Configuration | |
| Requires an Anthropic API key set as a Secret in the Space settings: | |
| ``` | |
| ANTHROPIC_API_KEY=sk-ant-... # get yours at https://console.anthropic.com | |
| ``` | |
| ## Architecture | |
| ``` | |
| PDF Upload | |
| │ | |
| ▼ | |
| pypdf (text extraction) | |
| │ | |
| ▼ | |
| Claude AI (structured data extraction) | |
| │ | |
| ▼ | |
| Normalized JSON schema | |
| │ | |
| ▼ | |
| Streamlit Dashboard (charts, tables, export) | |
| ``` | |